If at First You Don’t Succeed…
LAND is a new machine learning approach that teaches robots to learn from their mistakes.
Autonomous robots are becoming an ever more common sight in daily life. One thing all of these robots — from self-driving cars to sidewalk delivery robots — have in common is the need for a lot of training data. A good set of training data makes the difference between getting to the destination and getting lost.
Unfortunately, collecting and labeling data is a very time-consuming and expensive chore. However, a new technique developed by a team at the University of California, Berkeley, may take some of the pain out of the process. And it just might make models more accurate to boot.
The technique, called Learning to Navigate From Disengagements (LAND), improves navigation by learning what not to do. When an autonomous robot is being trained, there is typically a human nearby who monitors it, disengages autonomous control when it gets off track, and returns it to a good path. The Berkeley team's insight was to use this disengagement as a direct learning signal rather than simply as information for debugging. Using only the robot’s onboard sensors and the disengagement signal, they were able to train a model to predict future disengagements. The robot then uses these predictions to plan and execute actions that avoid future disengagements.
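To make the predict-then-plan idea concrete, here is a minimal Python sketch. It is not the team's implementation: the function `predict_disengagement` is a hypothetical hand-written stand-in for the learned model, the "observation" is reduced to a single made-up number (the robot's lateral offset from a good path), and the random-sampling planner is an assumption chosen for simplicity.

```python
import math
import random


def predict_disengagement(offset, actions):
    """Hypothetical stand-in for a learned model.

    Given the current lateral offset from a good path and a sequence of
    steering actions, return one disengagement probability per future step.
    A real system would learn this mapping from sensor data and recorded
    disengagements; here it is a made-up formula for illustration.
    """
    probs = []
    for action in actions:
        offset += action  # toy dynamics: each action shifts the offset
        probs.append(1.0 - math.exp(-offset * offset))  # far off path -> near 1
    return probs


def plan(offset, horizon=5, num_candidates=200, seed=0):
    """Sample random action sequences and keep the one whose predicted
    disengagement probabilities sum to the lowest value."""
    rng = random.Random(seed)
    best_actions, best_cost = None, float("inf")
    for _ in range(num_candidates):
        actions = [rng.uniform(-0.5, 0.5) for _ in range(horizon)]
        cost = sum(predict_disengagement(offset, actions))
        if cost < best_cost:
            best_actions, best_cost = actions, cost
    return best_actions, best_cost


if __name__ == "__main__":
    actions, cost = plan(offset=1.0)
    print("first action to execute:", round(actions[0], 3))
    print("predicted disengagement cost:", round(cost, 3))
```

In a loop like this, the robot would typically execute only the first action of the chosen sequence, take a new sensor reading, and plan again. That replanning detail is also an assumption of this sketch rather than something stated above.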
A test of the new method revealed that a robot using LAND traveled, on average, 6.5 times farther before a disengagement than one using imitation learning approaches. Further, it traveled 43 times farther before a disengagement than one using reinforcement learning approaches.
While LAND eliminates the laborious task of labeling collected data, it still requires a person to constantly monitor the robot during data collection, which is itself laborious and costly. The team believes that methods enabling the robot to know when to ask for human help may alleviate this problem. They also note that while LAND was presented as a standalone system, combining it with existing learning methods may yield further improvements.