Researchers from Cornell University have studied how computer vision systems detect and respond to images that have been flipped, raising questions about whether flipping the images in a training dataset could introduce biases into the resulting artificial intelligence systems.
"The universe is not symmetrical. If you flip an image, there are differences," explains associate professor Noah Snavely, senior author of the study Visual Chirality presented at the 2020 Conference on Computer Vision and Pattern Recognition. "I’m intrigued by the discoveries you can make with new ways of gleaning information."
"How can we tell whether an image has been mirrored? While we understand the geometry of mirror reflections very well, less has been said about how it affects distributions of imagery at scale, despite widespread use for data augmentation in computer vision," Snavely and his team explain. "In this paper, we investigate how the statistics of visual data are changed by reflection. We refer to these changes as 'visual chirality,' after the concept of geometric chirality — the notion of objects that are distinct from their mirror image."
"Our analysis of visual chirality reveals surprising results, including low-level chiral signals pervading imagery stemming from image processing in cameras, to the ability to discover visual chirality in images of people and faces. Our work has implications for data augmentation, self-supervised learning, and image forensics."
The team's initial finding was that computer vision systems are adept at detecting flipped images: a simple deep learning algorithm was able to classify images as flipped or original with 60 to 90 percent accuracy, depending on the training data used. The AI's decision-making was put under the microscope using a heat-map system, revealing that it was picking up on text, wrist watches, shirt collars, faces, and phones, the last of which are typically held in a subject's right hand.
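The flipped-or-original task described above needs no hand labeling, because the labels can be generated by mirroring the images. The following is a minimal sketch of that labeling step only, not the researchers' code: images are assumed to be plain lists of pixel rows, and the function names flip_horizontal and make_chirality_examples are hypothetical. A real experiment would feed the resulting pairs to a deep learning classifier.

```python
import random


def flip_horizontal(image):
    """Return a left-to-right mirrored copy of an image (a list of pixel rows)."""
    return [list(reversed(row)) for row in image]


def make_chirality_examples(images, seed=0):
    """Build (image, label) pairs for a flipped-vs-original classifier.

    Label 0 marks an original image and label 1 marks its mirrored copy.
    The names and the list-of-rows image format are assumptions made for
    this illustration.
    """
    rng = random.Random(seed)
    examples = []
    for image in images:
        examples.append(([list(row) for row in image], 0))  # original copy
        examples.append((flip_horizontal(image), 1))        # mirrored copy
    rng.shuffle(examples)  # mix the two classes before training
    return examples


if __name__ == "__main__":
    tiny_image = [[1, 2, 3], [4, 5, 6]]
    for pixels, label in make_chirality_examples([tiny_image]):
        print(label, pixels)
```

Each source image contributes one example per class, so the resulting dataset is balanced and a classifier that guesses at random would score about 50 percent.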
Switching to portraiture, the researchers found that the system was keying in on hair partings, beards, and eye gaze, with the majority of subjects, for reasons unknown, looking to the left when having their picture taken. "It’s a form of visual discovery," says Snavely. "If you can run machine learning at scale on millions and millions of images, maybe you can start to discover new facts about the world."
The team concluded that similar research could have an impact on the way machine learning models are trained, and even on how training datasets are put together. Of particular concern is the common trick of using mirroring to increase the number of images in a dataset, which could lead to biased systems incapable of achieving high accuracy. "This leads to an open question for the computer vision community, which is, when is it OK to do this flipping to augment your dataset, and when is it not OK," Snavely explains. "I’m hoping this will get people to think more about these questions and start to develop tools to understand how it’s biasing the algorithm."
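The mirroring trick in question can be sketched in a few lines. The example below is an illustration under stated assumptions rather than anything taken from the study: images are lists of pixel rows, and the skip_labels parameter is a hypothetical way to exempt classes, such as images containing text, where a mirrored copy would show something that does not occur in the real world.

```python
def mirror(image):
    """Return a left-to-right mirrored copy of an image (a list of pixel rows)."""
    return [row[::-1] for row in image]


def augment_with_mirrors(dataset, skip_labels=frozenset()):
    """Add a mirrored copy of every (image, label) pair to the dataset.

    Pairs whose label appears in skip_labels (a hypothetical opt-out for
    chirality-sensitive classes) are kept but never mirrored.
    """
    augmented = []
    for image, label in dataset:
        augmented.append((image, label))
        if label not in skip_labels:
            augmented.append((mirror(image), label))
    return augmented


if __name__ == "__main__":
    data = [([[1, 2]], "cat"), ([[3, 4]], "street_sign")]
    for image, label in augment_with_mirrors(data, skip_labels={"street_sign"}):
        print(label, image)
```

With an empty skip_labels set the function doubles the dataset, which is the conventional form of the augmentation. Deciding which classes belong in the set is exactly the open question Snavely raises.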
The team's work has been published under open-access terms, along with source code, on GitHub.