Taking Matters Into Your Own Hands
Developed by NVIDIA and UCSD engineers, AnyTeleop is a general robot teleoperation system supporting different arms, hands, and use cases.
Teleoperation refers to controlling a robot remotely: a human operator interacts with and manipulates the robot in real time, bridging the gap between human decision-making capabilities and the physical capabilities of robots. Teleoperation is particularly useful in scenarios where direct human intervention is unsafe or impossible, as is the case in hazardous environments, disaster response, deep-sea exploration, and space missions, to name a few.
While the specifics vary from application to application, the fundamental concept of teleoperation involves a human operator who uses a control interface, which could be a joystick, haptic device, or even virtual reality tools, to send commands to the robot. These commands are transmitted through a communication channel, such as Wi-Fi, 5G, or satellite links, to the robot's onboard system, guiding its movements, tasks, and actions. The robot may relay sensory feedback back to the operator, allowing for a sense of presence and situational awareness that is crucial for making informed decisions during complex operations.
The applications of teleoperation are diverse and continue to expand as technology advances. In hazardous environments, such as nuclear power plants or chemical facilities, teleoperated robots can perform maintenance, inspections, and repairs without exposing human workers to dangerous conditions. In search and rescue missions, teleoperated robots can navigate disaster-stricken areas, locate survivors, and provide immediate assistance. Deep-sea exploration relies on teleoperated underwater vehicles to explore and map the ocean depths, revealing previously unexplored ecosystems and geological features.
Unfortunately, teleoperation systems are impractical for many applications due to several limitations. For example, each system is designed for a particular robotics platform and operating environment. Most are also tailored to a single operator controlling a single robot, in either a real-world or a computer-simulated environment. These constraints make adapting a system to new platforms or use cases complex and expensive, and they make it challenging to collect sufficient training data for hand-tracking or human-to-robot retargeting models.
A vision-based teleoperation system recently described by engineers at NVIDIA and the University of California San Diego overcomes many of these obstacles and may lead to the development of more practical and versatile remote-controlled robots in the near future. Called AnyTeleop, their general teleoperation system was designed to work with different arms, hands, realities, and camera configurations. And that flexibility does not come at the expense of performance — in some cases AnyTeleop even beat existing teleoperation systems on the hardware platforms that they were designed for.
A number of advancements were needed to bring AnyTeleop to life. The first step involved creating a general and high-performance motion retargeting library that converts human activity into corresponding movements of a robot arm in real time. This can be adapted to different platforms by simply supplying a new kinematic model.
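To give a sense of what motion retargeting involves, the sketch below poses it as a small optimization: find the joint angles that make a robot's fingertip match a tracked human keypoint. This is a toy illustration only, not AnyTeleop's actual library (which has not been released) — the two-link planar arm, link lengths, and solver settings are all assumptions. The key point it illustrates is that only the kinematic model (here, `forward_kinematics` and its Jacobian) is robot-specific, so swapping in a new model adapts the solver to a new platform.

```python
import numpy as np

L1, L2 = 0.3, 0.25  # link lengths of a made-up planar 2-link arm (meters)

def forward_kinematics(q):
    """Fingertip position of the planar 2-link arm for joint angles q."""
    x = L1 * np.cos(q[0]) + L2 * np.cos(q[0] + q[1])
    y = L1 * np.sin(q[0]) + L2 * np.sin(q[0] + q[1])
    return np.array([x, y])

def jacobian(q):
    """Analytic Jacobian of the fingertip position w.r.t. the joint angles."""
    s1, c1 = np.sin(q[0]), np.cos(q[0])
    s12, c12 = np.sin(q[0] + q[1]), np.cos(q[0] + q[1])
    return np.array([[-L1 * s1 - L2 * s12, -L2 * s12],
                     [ L1 * c1 + L2 * c12,  L2 * c12]])

def retarget(target, q_init, iters=200, damping=1e-4, tol=1e-6):
    """Find joint angles whose fingertip matches a tracked human keypoint
    by minimizing ||FK(q) - target||^2 with damped least squares."""
    q = np.array(q_init, dtype=float)
    for _ in range(iters):
        err = forward_kinematics(q) - target
        if np.linalg.norm(err) < tol:
            break
        J = jacobian(q)
        dq = np.linalg.solve(J.T @ J + damping * np.eye(2), J.T @ err)
        # Backtracking line search: halve the step until the error shrinks.
        step = 1.0
        while step > 1e-4 and np.linalg.norm(
                forward_kinematics(q - step * dq) - target) >= np.linalg.norm(err):
            step *= 0.5
        q -= step * dq
    return q
```

For example, `retarget(np.array([0.4, 0.2]), [0.5, 0.5])` drives the toy fingertip to the target point. A real retargeting library solves this kind of optimization every frame, over many more joints and keypoints, which is why supplying a new kinematic model is enough to support a new robot.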
To avoid mishaps, a collision avoidance module was also developed by leveraging CUDA-based geometry queries. A web-based viewer that works with standard browsers was designed to enable seamless remote teleoperation across the internet. And finally, a general software interface was defined to decouple the modules inside the teleoperation system.
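The idea behind decoupling modules through a general interface can be sketched as follows. All class and method names here are hypothetical, chosen for illustration rather than taken from AnyTeleop's (unreleased) codebase: each stage of the pipeline depends only on an abstract contract, so a detector or retargeter can be swapped without touching the rest of the system.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class HandPose:
    # 21 (x, y, z) keypoints, a common output format for hand trackers
    keypoints: List[Tuple[float, float, float]]

class HandDetector(ABC):
    """Abstract contract: produce a hand pose from a camera frame."""
    @abstractmethod
    def detect(self, frame) -> HandPose: ...

class MotionRetargeter(ABC):
    """Abstract contract: map a hand pose to joint targets for one robot."""
    @abstractmethod
    def retarget(self, pose: HandPose) -> List[float]: ...

class TeleopPipeline:
    """Wires any detector to any retargeter; neither depends on the
    other's internals, only on the shared HandPose data type."""
    def __init__(self, detector: HandDetector, retargeter: MotionRetargeter):
        self.detector = detector
        self.retargeter = retargeter

    def step(self, frame) -> List[float]:
        return self.retargeter.retarget(self.detector.detect(frame))
```

With this structure, supporting a new camera setup means writing a new `HandDetector` subclass, and supporting a new robot means writing a new `MotionRetargeter` subclass; the pipeline itself never changes.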
To test their methods, the researchers paired an existing teleoperation system called Telekinesis with the robotic arm and hand it was designed for. Ten common tasks were performed with this setup, then the same tasks were repeated on the same hardware using the generalized AnyTeleop approach. AnyTeleop achieved a higher success rate on eight of the ten tasks; on the remaining two, the systems tied.
A few limitations were discovered during the testing of AnyTeleop. If the human operator moved their hands too quickly, the system would pause and initiate a re-detection process. Hand pose estimation was also shown to be unreliable when the hand occludes portions of itself. The researchers suggest that the first issue can be addressed simply by training operators not to move their hands too quickly; as for the hand pose estimation problem, they believe that adding more cameras could be the solution.
In the near future, NVIDIA plans to release an open-source version of the AnyTeleop system. Their hope is that this will help to facilitate further research in the field.