Google's Roboticists Turn to Table Tennis for Work on Human-Robot Interaction Learning

A table tennis platform proves an ideal testbed for work on human-robot interaction learning, and delivers some strong rallies along the way.

The robotics team at Google's research arm has been busy playing table tennis on company time, but without fear of censure: The opponent at the other end of the table is a robot, being used to try out machine learning approaches for human-robot collaboration.

"There are two complementary properties of the table tennis task that make it interesting for robotic learning research," the researchers claim. "First, the task requires both speed and precision, which puts significant demands on a learning algorithm. At the same time, the problem is highly-structured (with a fixed, predictable environment) and naturally multi-agent (the robot can play with humans or another robot), making it a desirable testbed to investigate questions about human-robot interaction and reinforcement learning."

Google researchers have turned to table tennis to work on experiments with robotics learning for human interaction. (📹: Google Research)

Google's own robotic table tennis system has been used to drive two complementary projects: Iterative-Sim2Real and GoalsEye. The former uses a simulation-based approach to take the tedium, and some of the risk, out of training robots to interact with humans; the latter uses behavior cloning to give the robot a precise goal-targeting policy, allowing it to return a ball to a particular area of the table.

"The central problem in learning accurate human behavior models for robotics is the following: if we do not have a good-enough robot policy to begin with, then we cannot collect high-quality data on how a person might interact with the robot. But without a human behavior model, we cannot obtain robot policies in the first place," the team explains.

"An alternative would be to train a robot policy directly in the real world, but this is often slow, cost-prohibitive, and poses safety-related challenges, which are further exacerbated when people are involved. [Iterative-Sim2Real] is a solution to this chicken and egg problem. It uses a simple model of human behavior as an approximate starting point and alternates between training in simulation and deploying in the real world. In each iteration, both the human behavior model and the policy are refined."

The GoalsEye system needs only a few hours of training to equal an amateur's skill level. (📹: Google Research)

GoalsEye, meanwhile, uses behavior cloning to teach the robot to target particular areas of the table without any simulation at all. "We found that the synthesis of two existing imitation learning techniques, Learning from Play (LFP) and Goal-Conditioned Supervised Learning (GCSL), scales to this setting," the team explains. "It is safe and sample efficient enough to train a policy on a physical robot which is as accurate as amateur humans at the task of returning balls to specific goals on the table."
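The combination the team describes can also be illustrated with a small toy example. The sketch below is a hypothetical illustration, not the paper's code: it shows the core GCSL trick of hindsight relabeling, in which whatever spot the ball actually landed on is treated, after the fact, as the goal the recorded action achieved, so that unlabeled play data (the LFP ingredient) becomes supervised examples for a goal-conditioned policy. The dynamics, features, and nearest-neighbor "policy" are stand-ins for the real robot and learned network.

```python
"""Toy sketch of GoalsEye-style goal-conditioned behavior cloning.
Hindsight relabeling turns free-form play data into supervised
(state, goal) -> action examples. Everything here is illustrative."""
import random

def collect_play_episode():
    # One recorded exchange: incoming ball state, paddle action taken,
    # and where the return actually landed on the table (toy dynamics).
    state = random.uniform(0.0, 1.0)        # toy incoming-ball feature
    action = random.uniform(-1.0, 1.0)      # toy paddle command
    landed_at = state * 0.5 + action * 0.5  # toy table dynamics
    return state, action, landed_at

dataset = []
for _ in range(1000):
    state, action, landed_at = collect_play_episode()
    # Hindsight relabeling: the achieved landing point becomes the
    # goal that this (state, action) pair is treated as having solved.
    dataset.append(((state, landed_at), action))

def policy(state: float, goal: float) -> float:
    # Behavior cloning reduced to its simplest form: a nearest-neighbor
    # lookup over the relabeled dataset stands in for a trained network.
    (s, g), a = min(dataset,
                    key=lambda ex: (ex[0][0] - state) ** 2
                                 + (ex[0][1] - goal) ** 2)
    return a

# Ask the policy to return a ball to a specific target spot.
print(policy(state=0.3, goal=0.6))
```

Because every recorded exchange is guaranteed to "succeed" at its relabeled goal, the approach never needs a simulator or a reward signal, which is why it can be trained safely and sample-efficiently on the physical robot.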

The results are undeniably impressive: using Iterative-Sim2Real, the robotic ping-pong champ can manage rallies of up to 340 hits with an amateur human opponent; under GoalsEye, the robot could reach a skill level equivalent to or better than that of a randomly-selected researcher within mere hours of physical training.

Iterative-Sim2Real is to be presented at the 2022 Conference on Robot Learning (CoRL 2022) later this year, with a preprint available on Cornell's arXiv server under open-access terms; GoalsEye will be presented at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) next week, with its own preprint also available on arXiv.

Gareth Halfacree
Freelance journalist, technical author, hacker, tinkerer, erstwhile sysadmin. For hire: freelance@halfacree.co.uk.