DeepArUco++ Delivers Better Fiducial Marker Tracking in Low Light, with Synthetic Training
A trio of neural networks, trained on a synthetically generated dataset, copes better with low lighting and shadows.
Researchers from the University of Córdoba and the Maimonides Institute for Biomedical Research of Córdoba (IMIBIC) have put neural networks to work on improving machine vision in low-light conditions, delivering a method that lets robots better track fiducial markers: DeepArUco++.
"The use of neural networks in the model allows us to detect this type of marker in a more flexible way," first and co-corresponding author Rafael Berral-Soler explains of the team's work, which focuses on the identification and tracking of printed fiducial markets by computer vision-capable robots, "solving the problem of lighting for all phases of the detection and decoding process."
"There have been many attempts to, under situations of optimal lighting, increase speeds, for example," co-corresponding author Manuel J. Marín-Jiménez adds of the importance of the team's work, "but the problem of low lighting, or [the presence of] many shadows, had not been completely addressed to improve the process."
Fiducial markers are used in robotics to allow a robot to easily track both the position and orientation of an object by following a typically black-and-white printed target. It's more reliable and less computationally intensive than generic object recognition and tracking, but still suffers in less-than-ideal lighting conditions — with sensor noise and shadows masking the markers and making tracking less reliable.
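To make the contrast concrete, the snippet below is a minimal sketch of the conventional, non-learned ArUco detection pipeline that approaches like DeepArUco++ aim to improve on. It assumes OpenCV 4.7 or later (the class-based aruco API), and the image filename and dictionary choice are placeholders rather than anything taken from the paper.

```python
# Minimal sketch of classical ArUco detection with OpenCV (assumes OpenCV >= 4.7).
import cv2

# Placeholder input image; in practice this would be a camera frame.
image = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)

# Classical detection relies on adaptive thresholding and contour analysis,
# which is exactly what degrades under sensor noise, low light, and shadows.
dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_6X6_250)
detector = cv2.aruco.ArucoDetector(dictionary, cv2.aruco.DetectorParameters())
corners, ids, rejected = detector.detectMarkers(image)

print(f"Detected {0 if ids is None else len(ids)} markers")
```

The detected corner coordinates are what a robot would then feed into a pose solver to recover the marker's position and orientation relative to the camera.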
The team's solution is DeepArUco++, a trio of neural network models chained together into a system for detecting markers, refining their corners, and decoding them even in non-ideal lighting conditions. Interestingly, the team trained the models on a partially-synthetic dataset, yet, the researchers claim, the system delivers better-than-state-of-the-art performance both in real-world testing and on the datasets used to train its rivals.
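Conceptually, the three chained models form a detect-refine-decode pipeline. The sketch below is purely illustrative of that structure: every name in it (track_markers, detector, refiner, decoder, Marker) is a hypothetical stand-in, not DeepArUco++'s actual API or code.

```python
# Illustrative sketch of a three-stage detect -> refine -> decode pipeline,
# mirroring the structure described in the article. All names are hypothetical.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Marker:
    corners: list               # four (x, y) corner coordinates in the image
    marker_id: Optional[int]    # decoded ID, or None if decoding failed

def track_markers(image, detector, refiner, decoder):
    """Run three chained models on one frame: detect, refine corners, decode."""
    # Stage 1: coarse marker detection, trained to tolerate low light.
    rough_boxes = detector(image)
    markers = []
    for box in rough_boxes:
        # Stage 2: refinement of the four corner locations within the box.
        corners = refiner(image, box)
        # Stage 3: decode the marker bits from the rectified patch.
        marker_id = decoder(image, corners)
        markers.append(Marker(corners=corners, marker_id=marker_id))
    return markers
```

Chaining dedicated models per stage, rather than relying on a single detector, is what lets each step be trained specifically for degraded lighting.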
The researchers have published their work in the journal Image and Vision Computing under closed-access terms, with an open-access preprint available on Cornell's arXiv server; demonstration source code and pretrained models are available on GitHub under the permissive MIT license, along with the code used to generate the synthetic training data.