RoboCat Pounces Into the World of Robotics

The dream of a general-purpose robot is a bit closer, thanks to the RoboCat model that easily picks up new tasks and never stops learning.

Nick Bild
2 years ago • Robotics
RoboCat enables robotic arms to rapidly learn new tasks (📷: Google DeepMind)

General-purpose robots offer a number of advantages over robots built for one specific purpose, with versatility being at the forefront. These robots are designed to handle a wide array of tasks and can adapt to different environments and situations. This flexibility makes them more useful in a variety of settings, as they can perform multiple tasks without requiring significant reprogramming or other modifications.

Another undeniable advantage of general-purpose robots is their potential for cost savings. Instead of investing in multiple specialized robots for different tasks, organizations can employ a single robot to handle various functions. This can result in lower equipment costs, reduced maintenance expenses, and increased operational efficiency.

So why do we not have robots that are a jack of all trades? After all, is there anyone who would not rather have C-3PO at their beck and call than a robot that just vacuums the floor? The advantages of a general-purpose robot are clear, but such robots are very challenging to build. Designing and training a learning algorithm that can deal with the complexities of real-world environments, and perform any arbitrary task within them, has so far proven to be out of reach.

Inspired by advances in foundation models for vision and language, a team of researchers at Google DeepMind has developed a foundation model for operating robot arms. The model draws on knowledge obtained from a large, diverse initial training dataset and can learn to perform new tasks given as few as 100 demonstrations. This framework, called RoboCat, also has the ability to improve over time by teaching itself.

RoboCat is a visual goal-conditioned decision transformer that was trained on video clips of hundreds of tasks being performed. This data was collected from a variety of real-world robot arm types, as well as from simulated environments. The system also continues to get better as it learns new tasks. The initial model had a success rate of about 36% on previously unseen tasks after being shown 500 demonstrations per task. After the model had learned more new tasks, that success rate more than doubled.

A notable feature of RoboCat is that it does not stop learning once it has processed its initial training data. After the model has been trained on a new task, an agent is started up that practices the new skill about 10,000 times. The data generated during this practice is collected and then used in another round of training for the RoboCat model, which allows it to self-improve without additional input.
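To make the shape of that training cycle concrete, here is a minimal Python sketch. It is a toy illustration only, not DeepMind's code: the names `ToyGeneralist`, `fine_tune`, `practice`, and `self_improvement_cycle` are hypothetical, the "skill" number is an invented stand-in for a learned policy, and keeping every practice attempt (successes and failures alike) is an assumption made here for simplicity.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ToyGeneralist:
    """Hypothetical stand-in for a generalist model (not RoboCat's real interface)."""
    dataset: list = field(default_factory=list)

    def train(self, episodes):
        # "Training" in this toy just means growing the dataset the model has seen.
        self.dataset.extend(episodes)


def fine_tune(demonstrations):
    """Return a toy success probability for a task-specific agent.

    Assumption: skill rises with the number of demonstrations and is capped at 0.9.
    """
    return min(0.9, 0.2 + 0.001 * len(demonstrations))


def practice(skill, task, n_episodes, rng):
    """Let the spin-off agent attempt the task repeatedly, logging every attempt."""
    return [
        {"task": task, "success": rng.random() < skill, "source": "self-generated"}
        for _ in range(n_episodes)
    ]


def self_improvement_cycle(model, task, demonstrations, n_practice=10_000, seed=0):
    """One round: learn from demonstrations, practice, then retrain on everything."""
    rng = random.Random(seed)
    skill = fine_tune(demonstrations)                    # 1. pick up the new task
    generated = practice(skill, task, n_practice, rng)   # 2. practice it many times
    model.train(list(demonstrations) + generated)        # 3. fold the data back in
    return model
```

Calling `self_improvement_cycle(ToyGeneralist(), "stack blocks", demos)` with a list of 100 demonstration records leaves the model's dataset holding those 100 demonstrations plus 10,000 self-generated episodes, which is the feedback loop the article describes: each new task enlarges the data available for the next round of training.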

The team put their methods to the test in a series of experiments. In one case, a model that had been trained to operate a robot arm with a two-pronged gripper learned to control a more complex arm with a three-fingered gripper in just a few hours. In a similar time frame, the system also learned other skills, such as picking the correct piece of fruit from a bowl and solving a shape-matching puzzle.

The work presented by Google DeepMind does not get us all the way to a general-purpose robot, but it is an important step towards that goal. The versatility and adaptability of these techniques push the ball forward significantly, and perhaps other groups will build on this work over time. As for this research group, they intend to explore ways to enable multi-modal task specification in the future. They are also considering how the incorporation of reinforcement learning principles might help to further improve RoboCat.

If you are disappointed that there are no actual cats involved in RoboCat, you might be interested in checking out the brilliant felines of CatGPT.
