This Is So Random

The MaxDiff RL algorithm helps robots rapidly learn complex tasks by introducing randomness into exploration, leading to better reliability.

Nick Bild
20 days ago • Robotics
NoodleBot will be used to test the new algorithm in the real world (📷: Northwestern University)

Without question, robots have been instrumental in improving the efficiency of a number of industries in recent decades. But while these robots can work with precision and repeatability under the controlled conditions of, for example, a manufacturing environment, they struggle in the sort of unstructured environments that are found in our homes. When the environment is fixed, a robot can be explicitly programmed to perform a set of steps to complete its task. However, where the layout of the environment is unknown or frequently changing, the robot must figure out how to complete the task on its own via some type of learning algorithm.

These algorithms have come a long way and are very effective in a wide range of use cases. But when it comes to guiding embodied agents such as robots, they often fall flat. A major issue is that present algorithms largely assume that data points are independent of one another, but because a robot interacts with its environment continuously through space and time, each observation depends on the ones before it, and this assumption does not hold. Furthermore, the physical laws that govern the world are difficult for an algorithm to learn from data alone. As such, attaining acceptable performance can require an unreasonable amount of training data. For reasons such as these, today’s robots struggle in unstructured environments, and are often quite unreliable.
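To make the independence problem concrete, here is a minimal sketch in Python. It is an illustration only, not anything from the Northwestern work: the one-dimensional "robot", the step size, and the function names are all assumptions chosen for the example. It compares the lag-1 autocorrelation of independently drawn samples with that of positions recorded along a single trajectory, where each position is the previous one plus a small random step.

```python
import random


def lag1_autocorrelation(xs):
    """Sample correlation between each value and the one that follows it."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    cov = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1))
    return cov / var


def iid_samples(n, rng):
    """Independent draws, the situation most learning algorithms assume."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def trajectory(n, rng, step=0.1):
    """Positions of a hypothetical 1-D robot (assumed model for illustration).

    Each position is the previous position plus a small random step,
    so consecutive observations are strongly related.
    """
    pos, out = 0.0, []
    for _ in range(n):
        pos += rng.gauss(0.0, step)
        out.append(pos)
    return out


rng = random.Random(0)
iid_corr = lag1_autocorrelation(iid_samples(5000, rng))
traj_corr = lag1_autocorrelation(trajectory(5000, rng))
print(f"independent samples: {iid_corr:+.3f}")
print(f"robot trajectory:    {traj_corr:+.3f}")
```

The independent samples should show a correlation near zero, while the trajectory's correlation should sit close to one. That gap is the sense in which data gathered by a moving robot violates the independence assumption.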

Researchers at Northwestern University recently proposed a novel artificial intelligence algorithm to address these problems. The algorithm, called Maximum Diffusion Reinforcement Learning (MaxDiff RL), was designed to make sure that robots gain a diverse set of experiences through exploration. The team demonstrated that MaxDiff RL can help robots rapidly learn very complex skills. Moreover, once the learning process is complete, the robots tend to be capable of performing new tasks correctly on their very first attempt.

MaxDiff RL was evaluated in a simulated environment (📷: Northwestern Engineering)

The key to the MaxDiff RL algorithm is randomness. As robots explore their environment to collect training data, randomness is injected into the process. By using this more diverse set of experiences to learn from, robots acquire the skills needed to perform tasks more reliably, and are more able to deal with unexpected circumstances.
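The effect of injected randomness on the diversity of experience can be sketched with a toy example. This is not the MaxDiff RL algorithm: uniform random action noise is used here as a simple stand-in, and the grid world, the "always move right" base policy, and the `explore` function are hypothetical choices made for illustration. The sketch counts how many distinct grid cells an agent visits with and without noise.

```python
import random

SIZE = 21  # the toy world is a SIZE x SIZE grid (assumed for illustration)
ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # right, left, up, down


def explore(steps, noise, rng):
    """Return the number of distinct cells visited from the grid centre.

    The base policy always moves right. With probability `noise`, a
    uniformly random action is taken instead. Moves are clamped at walls.
    """
    x = y = SIZE // 2
    visited = {(x, y)}
    for _ in range(steps):
        dx, dy = rng.choice(ACTIONS) if rng.random() < noise else (1, 0)
        x = min(max(x + dx, 0), SIZE - 1)
        y = min(max(y + dy, 0), SIZE - 1)
        visited.add((x, y))
    return len(visited)


rng = random.Random(0)
fixed_coverage = explore(2000, 0.0, rng)  # no randomness
noisy_coverage = explore(2000, 0.5, rng)  # random action half of the time
print(f"cells visited without noise: {fixed_coverage}")
print(f"cells visited with noise:    {noisy_coverage}")
```

Without noise the agent only ever sees the 11 cells between its starting point and the right-hand wall, whereas the noisy agent wanders off that line and collects experience from a wider part of the grid.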

To date, MaxDiff RL has only been tested in simulated environments, and with very high quality data. When compared against state-of-the-art models in a variety of standard tests, the new algorithm was found to be more accurate and reliable. The researchers also showed that MaxDiff RL learned more quickly, thanks to the randomness in its training data.

These factors could make MaxDiff RL applicable to a number of important use cases where accuracy and speed are crucial, such as self-driving cars, delivery drones, and household assistants. But first the team will have to prove that their algorithm works as well in the real world as it does in simulated environments. In the real world, MaxDiff RL will have to deal with more complex physics and imperfect sensor measurements. The team has created a physical robot named NoodleBot that they intend to use for real-world testing, so we should have a better understanding of the system’s full potential in the near future.
