Researchers from Ritsumeikan University, Toyama Prefectural University, Osaka University, and the South China University of Technology have developed a new machine-learning network, DPPFA-Net, which aims to help autonomous vehicles and robots detect smaller objects with better accuracy.
"Our study could facilitate a better understanding and adaptation of robots to their working environments, allowing a more precise perception of small targets," claims senior author Horoyuki Tomiyama, professor at Ritsumeikan University, of the team's work. "Such advancements will help improve the capabilities of robots in various applications."
DPPFA-Net, the researchers explain, combines three core modules: Memory-based Point-Pixel Fusion (MPPF), Deformable Point-Pixel Fusion (DPPF), and Semantic Alignment Evaluation (SAE). When fed 3D point-cloud data from LiDAR sensors and 2D image data from camera systems, it ensures accurate alignment between the two data types — hence its name, the Dynamic Point-Pixel Feature Alignment Network.
In combining the two data types, the team used MPPF to reduce the difficulty of network learning and improve robustness against noise in the point-cloud data, DPPF to perform feature fusion at low computational complexity, and SAE to avoid feature ambiguity. The result, as tested on the KITTI Vision Benchmark, is a network which, in Tomiyama's words, "reaches a new state-of-the-art," particularly for the detection of small objects.
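To give a sense of what point-pixel fusion means in practice, the sketch below shows the basic idea in its simplest form: projecting LiDAR points into the camera's image plane and attaching each point's pixel-level feature to its point-level feature. This is an illustrative simplification, not the paper's actual modules; the function name, the intrinsic matrix `K`, and all array shapes are hypothetical placeholders.

```python
import numpy as np

def fuse_point_pixel(points, point_feats, image_feats, K):
    """Minimal point-pixel fusion sketch (assumed, not DPPFA-Net itself).

    points:      (N, 3) 3D points in camera coordinates (z > 0)
    point_feats: (N, Cp) per-point features from the point-cloud branch
    image_feats: (H, W, Ci) 2D feature map from the image branch
    K:           (3, 3) camera intrinsic matrix
    Returns:     (N, Cp + Ci) fused per-point features
    """
    # Perspective projection: (u, v) = (fx*x/z + cx, fy*y/z + cy)
    proj = (K @ points.T).T            # (N, 3)
    uv = proj[:, :2] / proj[:, 2:3]    # divide by depth
    h, w = image_feats.shape[:2]
    # Clamp to valid pixel indices (nearest-neighbor sampling)
    u = np.clip(np.round(uv[:, 0]).astype(int), 0, w - 1)
    v = np.clip(np.round(uv[:, 1]).astype(int), 0, h - 1)
    pixel_feats = image_feats[v, u]    # (N, Ci) gathered pixel features
    return np.concatenate([point_feats, pixel_feats], axis=1)

# Toy usage with random data
rng = np.random.default_rng(0)
pts = rng.uniform(1.0, 5.0, size=(8, 3))
K = np.array([[100.0, 0.0, 32.0],
              [0.0, 100.0, 24.0],
              [0.0, 0.0, 1.0]])
fused = fuse_point_pixel(pts, rng.normal(size=(8, 16)),
                         rng.normal(size=(48, 64, 32)), K)
print(fused.shape)  # (8, 48)
```

The paper's contribution lies in doing this alignment adaptively (deformable sampling, memory, and semantic checks) rather than with the fixed nearest-neighbor projection shown here, which is brittle under calibration error and sensor noise.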
The network has considerable potential for improving the safety of autonomous vehicles, but Tomiyama says it could go further — including improving general robotics systems and pre-labeling raw data for use in other deep-learning perception systems, avoiding the time and cost of manual annotation.
The team's work has been published in the IEEE Internet of Things Journal, under closed-access terms.