Object Recognition Is Child's Play

By picking up and playing with objects in a child-like way, this algorithm learns to recognize them more accurately than existing systems.

Nick Bild
2 years ago · Robotics
The researchers with their robot that learns through interaction (📷: University of Texas at Dallas)

We have long dreamed of having robots that can make our bed, cook us breakfast, and do all manner of mundane tasks to free us up to do more fulfilling activities. That future still looks to be far away, but significant progress is being made in developing the capabilities that such a general-purpose robot would need. Chief among those capabilities is object recognition — it is impossible for a robot to carry out even the most basic of tasks if it does not know what it is looking at.

While recognizing objects is simple for humans, the variability that exists in the real world makes it exceedingly challenging for computer vision systems. Our eyes and brains effortlessly identify objects in a wide range of lighting conditions, angles, and contexts, but teaching a robot to do the same is far harder. Computer vision algorithms have come a long way, employing machine learning techniques and vast datasets to improve their accuracy in object recognition. These systems have become quite good at identifying objects, yet they are still prone to fail when they encounter unexpected variations, in color or shape for example, that a human would not bat an eyelash at.

With the hope of building a more capable object recognition system, a team led by researchers at The University of Texas at Dallas has developed a technique that allows robots to learn more like humans. Where a typical object recognition algorithm learns by being shown a very large number of example images, the team noted that children learn about the world very differently. Children learn what an object is by picking it up and playing with it. After interacting with just one, or a few, objects in this way, the child learns to recognize any future instance of it, even when those future instances are highly variable.

The key to their method involved building a very hands-on robot that interacts with objects for an extended period of time before attempting to recognize what they are. An object will be pushed around and poked at by a robotic arm about 15 or 20 times, with an RGB-D camera inspecting the results of each manipulation. This allows the system to collect a wealth of information about an item that will help to capture all of the relevant data points that are needed to recognize it in the future, and also distinguish it from similar, yet different, objects.

Similar methods have been attempted in the past, but they typically manipulated each object only a single time. Accordingly, those methods collected much less information and were not as successful.
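To make the contrast between one manipulation and many concrete, here is a toy simulation in Python. It is only a sketch under loud assumptions, not the researchers' pipeline: the function `collect_views`, the 12 "faces" of the object, and the idea that each push leaves a random face toward the camera are all invented for illustration. It simply counts how many distinct views of an object accumulate as the number of pushes grows.

```python
import random


def collect_views(num_pushes, num_faces=12, seed=0):
    """Return the set of distinct object faces seen after `num_pushes` pushes.

    Toy model (an assumption, not the researchers' setup): every push leaves
    the object in a random pose, and the camera then observes one of
    `num_faces` equally likely faces.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    seen = set()
    for _ in range(num_pushes):
        seen.add(rng.randrange(num_faces))
    return seen


# Same seed for both runs, so the single-push view is also the first
# view of the 20-push run.
single = collect_views(num_pushes=1)
many = collect_views(num_pushes=20)

print(f"1 push:    {len(single)} of 12 faces seen")
print(f"20 pushes: {len(many)} of 12 faces seen")
```

In this toy model, a single push can only ever reveal one face, while repeated pushes keep adding new ones, which is the intuition behind manipulating each object many times.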

To demonstrate the effectiveness of their approach, the team took an existing, pre-trained object segmentation model called MSMFormer and fine-tuned it with real-world images captured by their robot during these long-term object manipulations. The fine-tuning significantly improved the segmentation accuracy of the model. Notably, the improvement was seen both with objects in the same domain as the fine-tuning data and on benchmark datasets of unseen objects. This indicates that the model generalized well, and could have applicability in a wide range of scenarios.
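For readers unfamiliar with how segmentation accuracy can be scored, the sketch below computes intersection over union (IoU), one common measure of how well a predicted object mask overlaps the true one. The article does not say which metric the team reported, so treating IoU as the yardstick is an assumption, and the helper names `iou` and `box` and the "before" and "after" masks are made up purely for illustration.

```python
def iou(pred, truth):
    """Intersection over union of two masks given as sets of (row, col) pixels."""
    union = pred | truth
    if not union:
        return 1.0  # two empty masks agree perfectly
    return len(pred & truth) / len(union)


def box(r0, c0, r1, c1):
    """Hypothetical helper: a rectangular mask covering rows r0..r1-1, cols c0..c1-1."""
    return {(r, c) for r in range(r0, r1) for c in range(c0, c1)}


# Invented example masks, not data from the study.
truth = box(0, 0, 4, 4)    # true object: a 4x4 square (16 pixels)
before = box(0, 0, 4, 2)   # a prediction that finds only half of the object
after = box(0, 0, 4, 3)    # a prediction that finds three quarters of it

print("IoU before:", iou(before, truth))
print("IoU after: ", iou(after, truth))
```

A score of 1.0 means the predicted mask matches the object exactly, so a higher value after fine-tuning would indicate a better segmentation.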

While it was not specifically addressed in the research, the additional manipulations, and the added processing time that comes with the extra data they produce, would be expected to make data collection and training more time-consuming. However, given the improvements in performance, that may well be an acceptable trade-off.

Looking ahead, the researchers are working to improve their system, especially the planning and control capabilities. With enhancements such as these, they believe it will find numerous uses in real-world applications, like sorting recyclable materials.
