MIT Reduces AI's Environmental Impact, Carbon Emissions with High-Efficiency Once-for-All Networks

Instead of training an individual network for each target device, an OFA network can cover a multitude of target systems.

The researchers’ system trains one large neural network comprising many pretrained subnetworks of different sizes that can be tailored to diverse hardware platforms without retraining. (📷: MIT)

A team of scientists at the Massachusetts Institute of Technology (MIT) has developed a technique for reducing the carbon footprint of neural networks used in artificial intelligence systems. The work follows the publication of a report last year that estimated the carbon emissions of a particular neural network architecture at 626,000 pounds of carbon dioxide, or around five times the lifetime emissions of an internal combustion engine car.

"The aim is smaller, greener neural networks," explains Assistant Professor Song Han, of the Department of Electrical Engineering and Computer Science at MIT. "Searching efficient neural network architectures has until now had a huge carbon footprint. But we reduced that footprint by orders of magnitude with these new methods."

The technique is a "once-for-all network," which lowers the computational demand by training a single large neural network composed of many pre-trained subnetworks, each of which can be tailored to a particular target platform without retraining. The result, the researchers claim, is a system which cuts carbon emissions to around 1/1,300th the level of the current approach while speeding up inference by a factor of between 1.5 and 2.6.
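The idea of carving ready-to-use subnetworks out of one shared set of weights can be illustrated with a toy sketch. This is not the researchers' implementation: the names `make_supernet`, `subnet_forward`, `param_count`, and `pick_subnet` are hypothetical, the layers are plain matrices, the weights are random rather than trained, and a parameter budget stands in for whatever per-device constraint a real deployment would use.

```python
import random

# Hypothetical toy "supernet": MAX_DEPTH layers, each a MAX_WIDTH x MAX_WIDTH
# weight matrix. All sizes here are illustrative assumptions.
MAX_DEPTH, MAX_WIDTH = 4, 8


def make_supernet(seed=0):
    """Create the single, shared set of weights (random here, not trained)."""
    rng = random.Random(seed)
    return [[[rng.uniform(-1, 1) for _ in range(MAX_WIDTH)]
             for _ in range(MAX_WIDTH)]
            for _ in range(MAX_DEPTH)]


def subnet_forward(supernet, x, depth, width):
    """Run a subnetwork that reuses the first `depth` layers and the first
    `width` channels of each layer. No weights are copied or retrained."""
    h = list(x[:width])
    for layer in supernet[:depth]:
        # Linear layer restricted to a width x width slice, followed by ReLU.
        h = [max(0.0, sum(layer[i][j] * h[j] for j in range(width)))
             for i in range(width)]
    return h


def param_count(depth, width):
    """Number of weights the (depth, width) subnetwork actually uses."""
    return depth * width * width


def pick_subnet(budget):
    """Pick the largest (depth, width) that fits a device's parameter budget.
    The budget is a stand-in for a real hardware constraint such as latency."""
    candidates = [(d, w)
                  for d in range(1, MAX_DEPTH + 1)
                  for w in range(1, MAX_WIDTH + 1)
                  if param_count(d, w) <= budget]
    if not candidates:
        raise ValueError("budget too small for any subnetwork")
    return max(candidates, key=lambda c: param_count(*c))


if __name__ == "__main__":
    net = make_supernet()
    x = [1.0] * MAX_WIDTH
    for name, budget in [("tiny IoT device", 10), ("smartphone", 256)]:
        depth, width = pick_subnet(budget)
        out = subnet_forward(net, x, depth, width)
        print(name, "->", (depth, width), "uses", param_count(depth, width),
              "weights, output size", len(out))
```

In this sketch the expensive step, producing the shared weights, happens once, and each device only pays for a cheap selection step, which is the property the article attributes to the once-for-all approach.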

The once-for-all network is based on AutoML (automated machine learning), in which the network design is carried out automatically rather than by hand. While this allows a neural network to be tailored to a particular hardware platform, it comes at the cost of efficiency, as each individual network needs to be trained from scratch. "How do we train all those networks efficiently for such a broad spectrum of devices, from a $10 IoT device to a $600 smartphone?" asks Han. "Given the diversity of IoT devices, the computation cost of neural architecture search will explode."

According to the researchers' work, carried out on MIT's Satori computing cluster, a single OFA network can include 10 quintillion architectural settings, which they claim is likely to cover every platform that will ever need to be targeted, while still offering improved efficiency, performance, and accuracy. "That’s a breakthrough technology," Han boasts. "If we want to run powerful AI on consumer devices, we have to figure out how to shrink AI down to size."
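A figure of that magnitude is plausible from simple combinatorics. The sketch below counts configurations under an assumed search space that is not stated in this article: five units, each with a depth of 2, 3, or 4 layers, and each layer choosing one of three kernel sizes and one of three expansion ratios. The function name `count_subnetworks` and its default values are illustrative assumptions.

```python
def count_subnetworks(units=5, depths=(2, 3, 4),
                      kernel_sizes=(3, 5, 7), expand_ratios=(3, 4, 6)):
    """Count distinct subnetworks in an assumed, illustrative search space."""
    # Each layer independently picks a kernel size and an expansion ratio.
    per_layer = len(kernel_sizes) * len(expand_ratios)
    # A unit may have any of the allowed depths; sum the choices over depths.
    per_unit = sum(per_layer ** d for d in depths)
    # Units are chosen independently of one another.
    return per_unit ** units


if __name__ == "__main__":
    total = count_subnetworks()
    print(f"{total:.2e} possible subnetworks")
```

Under these assumptions the count comes out above 10^19, that is, more than 10 quintillion, which is consistent with the scale quoted above.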

The team's work has been published under open-access terms ahead of its presentation at the International Conference on Learning Representations.

Gareth Halfacree
Freelance journalist, technical author, hacker, tinkerer, erstwhile sysadmin.