Teaching a robot to perform a specific task can be challenging. Take pouring a glass of water, for example: the task requires more than manual dexterity. Not all clear liquids are water, so the liquid must be identified before pouring begins. The robot must also judge how much water the glass can hold and, if the glass already contains some water, how much more is needed to fill it. A team of engineers from Carnegie Mellon has addressed these challenges using AI and image translation.
As the term implies, image translation algorithms use a collection of images to train AI to convert images from one style to another, such as transforming a photo into a Van Gogh painting or making a horse look like a zebra. This approach is known as contrastive learning for unpaired image-to-image translation. The method also works for clear liquids, which are difficult for robots to see because they reflect, refract, and absorb light depending on the background.
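The core idea behind contrastive learning for unpaired image translation is to pull matching image patches together in feature space while pushing non-matching patches apart, using an InfoNCE-style loss. The sketch below is an illustrative numpy toy, not the researchers' implementation; the function name, array shapes, and temperature value are assumptions for the example.

```python
import numpy as np

def patch_nce_loss(feat_src, feat_tgt, temperature=0.07):
    """Illustrative InfoNCE-style loss over corresponding image patches.

    feat_src, feat_tgt: (num_patches, dim) arrays of patch embeddings from
    the input image and its translated counterpart. Row i of each array is
    assumed to come from the same spatial location, so matching rows are
    positives and all other rows serve as negatives.
    """
    # L2-normalize so dot products become cosine similarities
    src = feat_src / np.linalg.norm(feat_src, axis=1, keepdims=True)
    tgt = feat_tgt / np.linalg.norm(feat_tgt, axis=1, keepdims=True)
    logits = src @ tgt.T / temperature  # (N, N) similarity matrix
    # Cross-entropy where the diagonal (matching patch) is the correct class
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))
```

In training, minimizing this loss encourages the translated image to keep the content of each source patch (the glass, the background) while the adversarial part of the pipeline changes its style, which is how mismatched "before/after" image pairs can still supervise each other.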
To teach the AI to see different backgrounds through a glass of water, the team streamed YouTube videos behind a transparent glass full of water. This let the robot pour water against varied real-world backgrounds, regardless of where it was located. The contrastive learning method enabled the robot to pour water until it reached a certain height, even when glasses of different shapes and volumes were introduced. The researchers are now looking to refine the method to expand its capabilities, including pouring water in different lighting conditions and pouring from different containers with varied volumes.