"Attentive Normalization" Boosts Deep Neural Network Performance While Keeping Computation Down

Source code and pre-trained models for the technique, which combines two previously separate stages, are now available on GitHub.

A team of researchers at North Carolina State University has discovered a way to boost the performance of deep neural networks without a corresponding increase in computational load or power draw: combining feature normalization and feature attention into a single process dubbed "attentive normalization."

"Feature normalization is a crucial element of training deep neural networks, and feature attention is equally important for helping networks highlight which features learned from raw data are most important for accomplishing a given task," explains Assistant Professor Tianfu Wu, corresponding author of the paper on the subject. "But they have mostly been treated separately. We found that combining them made them more efficient and effective."
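The core idea — normalizing features and then applying an instance-specific, attention-weighted affine transform instead of a single fixed one — can be illustrated with a minimal numpy sketch. This is not the authors' code; all names (`attentive_norm`, `gammas`, `betas`, `w_att`) and the particular attention mechanism (global average pooling plus a softmax over K affine components) are illustrative assumptions.

```python
import numpy as np

def attentive_norm(x, gammas, betas, w_att, eps=1e-5):
    """Sketch of attentive normalization (illustrative, not the paper's API).

    x:      (N, C, H, W) batch of feature maps
    gammas: (K, C) scale parameters for K learnable affine components
    betas:  (K, C) shift parameters for the same K components
    w_att:  (C, K) projection producing per-instance attention logits
    """
    # 1. Standard feature normalization (batch-norm style, no affine).
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)

    # 2. Instance-wise attention over the K affine components:
    #    global average pooling -> linear projection -> softmax.
    pooled = x.mean(axis=(2, 3))                     # (N, C)
    logits = pooled @ w_att                          # (N, K)
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    att = np.exp(logits)
    att /= att.sum(axis=1, keepdims=True)            # (N, K), rows sum to 1

    # 3. Per-instance affine parameters as attention-weighted mixtures,
    #    replacing batch norm's single shared (gamma, beta) pair.
    gamma = att @ gammas                             # (N, C)
    beta = att @ betas                               # (N, C)
    return gamma[:, :, None, None] * x_hat + beta[:, :, None, None]
```

With K=1 this collapses to an ordinary learnable affine transform after normalization; the attention weights only matter when several affine components compete per instance.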

To demonstrate that combining the two boosts performance, the team applied the technique to four common neural network architectures — ResNets, DenseNets, MobileNetsV2, and AOGNets — and evaluated the results on the ImageNet-1000 classification benchmark and the MS-COCO 2017 object detection and instance segmentation benchmark.

"We found that AN [attentive normalization] improved performance for all four architectures on both benchmarks," notes Wu. "For example, top-1 accuracy in the ImageNet-1000 improved by between 0.5% and 2.7%. And Average Precision (AP) accuracy increased by up to 1.8% for bounding box and 2.2% for semantic mask in MS-COCO."

"Another advantage of AN is that it facilitates better transfer learning between different domains. For example, from image classification in ImageNet to object detection and semantic segmentation in MS-COCO. This is illustrated by the performance improvement in the MS-COCO benchmark, which was obtained by fine-tuning ImageNet-pretrained deep neural networks in MS-COCO, a common workflow in state-of-the-art computer vision."

Alongside the paper on the project, presented at the European Conference on Computer Vision (ECCV), the team has released the source code and pre-trained models on GitHub under a mixture of the Apache License 2.0 and a "research only" license.

Gareth Halfacree
Freelance journalist, technical author, hacker, tinkerer, erstwhile sysadmin. For hire: freelance@halfacree.co.uk.