Pay Attention!

AttendSeg brings semantic image segmentation to the edge with clever tricks that reduce model complexity without sacrificing performance.

Nick Bild
3 years ago · Machine Learning & AI
Semantic image segmentation (📷: X. Wen et al.)

The advances made in machine learning have been nothing short of amazing over the past ten years. Semantic segmentation, in which objects are classified on a per-pixel basis in an image, is one of the areas that has advanced very significantly. However, the best methods currently in use rely on deep neural networks that have heavy compute and energy requirements, which limit their applicability for many use cases.

Sometimes it is possible to prune and otherwise rearchitect a highly complex neural network such that its complexity is reduced without significantly changing its functionality. With this in mind, a team of machine learning researchers recently developed a new model, called AttendSeg, that tackles semantic segmentation on compute- and memory-limited devices. They achieve this goal through the use of attention condensers, which enable highly efficient selective attention, and through machine-driven design exploration to construct an appropriate neural network architecture.

The lightweight attention condensers that compose AttendSeg’s self-attention network serve to improve spatial-channel selective attention while keeping network complexity very low. The architecture of the network is created with a machine-driven design exploration strategy tailored to resource-constrained devices. An algorithm employing generative synthesis solves a constrained optimization problem while maximizing a universal performance function. Through an iterative process, this machine-driven design exploration arrives at the final AttendSeg architectural layout.
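To make the idea concrete, here is a minimal NumPy sketch of the general attention-condenser pattern (condense, embed, expand, then selectively attend). The specific pooling, projection, and gating choices below are illustrative assumptions, not the exact layers used in AttendSeg; the point is that attention is computed on a condensed feature map, which keeps the cost of the embedding step low.

```python
import numpy as np

def attention_condenser(x, pool=2):
    """Illustrative attention-condenser block (assumed structure:
    condense -> embed -> expand -> selective attention).

    x: feature map of shape (H, W, C), with H and W divisible by `pool`.
    """
    h, w, c = x.shape
    # Condensation: spatial max pooling shrinks the map on which
    # attention is computed, reducing the embedding's compute cost.
    q = x.reshape(h // pool, pool, w // pool, pool, c).max(axis=(1, 3))
    # Embedding: a stand-in 1x1 channel projection (random weights here;
    # in a trained network these would be learned).
    rng = np.random.default_rng(0)
    w_embed = rng.standard_normal((c, c)).astype(np.float32) * 0.1
    k = q @ w_embed
    # Expansion: nearest-neighbor upsampling back to input resolution.
    a = k.repeat(pool, axis=0).repeat(pool, axis=1)
    # Selective attention: sigmoid gate modulates the input features.
    s = 1.0 / (1.0 + np.exp(-a))
    return x * s

x = np.ones((8, 8, 4), dtype=np.float32)
y = attention_condenser(x)
print(y.shape)  # (8, 8, 4)
```

Because the projection operates on the pooled map rather than the full-resolution features, the attention computation scales with the condensed size, which is the efficiency lever the paper exploits.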

The AttendSeg model balances representational power and efficiency with its mix of lightweight attention condensers, depthwise convolutions, and pointwise convolutions. Convolutions with large strides further reduce network complexity.
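The savings from replacing a standard convolution with a depthwise-plus-pointwise pair are easy to quantify. A quick parameter-count comparison (channel and kernel sizes here are arbitrary examples, not AttendSeg's actual layer dimensions):

```python
def param_counts(c_in, c_out, k):
    """Parameter counts for a standard k x k convolution versus a
    depthwise-separable equivalent (ignoring biases)."""
    standard = k * k * c_in * c_out   # every filter spans all input channels
    depthwise = k * k * c_in          # one k x k filter per input channel
    pointwise = c_in * c_out          # 1x1 conv mixes channels back together
    return standard, depthwise + pointwise

std, sep = param_counts(256, 256, 3)
print(std, sep, round(std / sep, 1))  # 589824 67840 8.7
```

For this example, the separable version needs roughly 8.7 times fewer parameters while covering the same receptive field, which is why such layers are a staple of edge-oriented architectures.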

The model was evaluated on the Cambridge-driving Labeled Video Database (CamVid). AttendSeg achieved semantic segmentation performance comparable to ResNet-101 RefineNet and EdgeSegNet, yet it has 72 times and 5.9 times fewer parameters than these models, respectively. Moreover, its memory requirements are 288 times and 23.6 times lower, respectively.

The team’s work on AttendSeg will have an immediate impact in semantic image segmentation for tinyML on low-cost, low-power edge devices. In the future, they plan to attempt using similar methodologies on the problems of object detection, instance segmentation, and depth estimation.

Nick Bild
R&D, creativity, and building the next big thing you never knew you wanted are my specialties.