PhysicsGen Uses Generative AI to Turn a Handful of Demonstrations into Hours of Robot Training Data

Foundation model can tailor data for a particular design of robot, boosting its success rate at a range of manipulation tasks.

Researchers from the Massachusetts Institute of Technology (MIT) and the Robotics and AI Institute (RAI) have come up with a way to improve how robots move and interact with objects — by using generative artificial intelligence to build hours of tailor-made training data from only a few manual demonstrations.

"We're creating robot-specific data without needing humans to re-record specialized demonstrations for each machine," explains lead author Lujie Yang of the team's work, dubbed PhysicsGen. "We're scaling up the data in an autonomous and efficient way, making task instructions useful to a wider range of machines."

Too many robots, not enough time? PhysicsGen promises to turn a handful of real-world demos into masses of customized training data. (📹: Yang et al)

The idea behind PhysicsGen will be familiar to anyone keeping up with the current state of generative artificial intelligence — but rather than using the technology to generate text, audio, video, or images, the team is using it to synthesize robotic training data from a handful of virtual examples. The data aren't just increased in quantity, either, but improved in quality: the model takes into account how a target robot is configured, ensuring the examples it generates match the way that robot can actually move.

First, a human user wearing a virtual reality getup manipulates objects that are twinned in a 3D physics simulation. The human movements are tracked and mapped to the target robot's joints, before trajectory optimization is applied to find the most efficient way to complete a given task. These trajectories are then used to train real-world robots — boosting, in one experiment, the task success rate from 60 percent to 81 percent, despite being built atop just 24 human-driven demonstrations.

Human VR demos are tweaked to apply to a particular robot design, then expanded into optimized trajectories. (📹: Yang et al)
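
As a rough illustration of the retargeting-and-optimization step described above, the sketch below retargets a demonstrated end-effector path onto a hypothetical two-link planar arm via inverse kinematics, then smooths the resulting joint trajectory with a simple acceleration penalty. It is not the team's code: the toy arm, the IK solver, and the smoothing cost are simplified stand-ins, under assumed parameters, for PhysicsGen's full physics simulation and trajectory optimization.

```python
# Illustrative sketch only (not the authors' pipeline): retarget a demonstrated
# end-effector path to an assumed 2-link planar arm, then smooth the joint
# trajectory. All names and the arm model are hypothetical.
import numpy as np
from scipy.optimize import minimize

L1, L2 = 0.5, 0.4  # assumed link lengths of a toy 2-link planar arm (metres)

def forward_kinematics(q):
    """End-effector (x, y) of the 2-link arm for joint angles q = (q1, q2)."""
    x = L1 * np.cos(q[0]) + L2 * np.cos(q[0] + q[1])
    y = L1 * np.sin(q[0]) + L2 * np.sin(q[0] + q[1])
    return np.array([x, y])

def inverse_kinematics(target, q0):
    """Retarget one demonstrated end-effector point to robot joint angles."""
    err = lambda q: np.sum((forward_kinematics(q) - target) ** 2)
    return minimize(err, q0).x

def retarget_demo(ee_path):
    """Map a demonstrated end-effector path to a joint-space trajectory."""
    q, traj = np.zeros(2), []
    for point in ee_path:
        q = inverse_kinematics(point, q)  # warm-start from the previous solution
        traj.append(q)
    return np.array(traj)

def smooth_trajectory(traj, weight=10.0):
    """Crude stand-in for trajectory optimization: penalize joint accelerations
    while staying close to the retargeted path."""
    def cost(flat):
        q = flat.reshape(traj.shape)
        tracking = np.sum((q - traj) ** 2)
        accel = np.sum(np.diff(q, n=2, axis=0) ** 2)
        return tracking + weight * accel
    return minimize(cost, traj.ravel()).x.reshape(traj.shape)

# A single demonstrated path (a short arc), retargeted and smoothed.
demo = np.stack([0.6 + 0.1 * np.cos(np.linspace(0, np.pi, 20)),
                 0.2 + 0.1 * np.sin(np.linspace(0, np.pi, 20))], axis=1)
joint_traj = smooth_trajectory(retarget_demo(demo))
print(joint_traj.shape)  # (20, 2): one joint-space trajectory for this robot
```

In the real pipeline the optimization would also have to respect the target robot's joint limits, dynamics, and contact physics; the sketch only captures the overall flow from a demonstrated motion to a robot-specific trajectory.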

"We'd like to use PhysicsGen to teach a robot to pour water when it's only been trained to put away dishes, for example," Yang says of the technology's potential extensions. "Our pipeline doesn't just generate dynamically feasible motions for familiar tasks; it also has the potential of creating a diverse library of physical interactions that we believe can serve as building blocks for accomplishing entirely new tasks a human hasn't demonstrated."

The team's paper is available as an open-access PDF in the Proceedings of the Robotics: Science and Systems Conference; additional information is available on the project website, with code "coming soon" at the time of writing.
