MixChannel Boosts Multispectral Computer Vision Training — Even on a Dataset of Just Six Pictures

Designed specifically for satellite imagery, MixChannel delivers impressive results from very limited data, trained on just six images.

A team of researchers from the Skolkovo Institute of Science and Technology (Skoltech) has showcased a new way to train computer vision algorithms, boosting accuracy on very small datasets of often-patchy satellite imagery, and the approach works with existing neural network architectures including DeepLab, U-Net, and U-Net++.

"While they are very powerful, neural networks demand a lot of training data to achieve top results. Unfortunately, in practical tasks, we usually don’t have enough data," Sergei Nesteruk, PhD student and co-author of the paper, explains of his team's work. "To overcome this issue, data scientists apply various techniques that artificially increase datasets. One of the most popular methods is called image augmentation. It transforms images to add variability."

While image augmentation works great for many tasks, including person recognition, it's hard to apply to other domains — like satellite imagery. "It is easy to use image augmentation for generic RGB images," explains Svetlana Illarionova, PhD student and fellow co-author. "But multispectral data is very complicated, and there was no efficient way to augment it. MixChannel is the novel augmentation technique designed to work specifically with multispectral data."

The new approach is named "MixChannel" for a very good reason: It takes channels from the original image and swaps them out for channels taken from other images covering the same area. "MixChannel takes the set of images of the exact location," the team explains, "chooses one as an anchor image, and with the predefined probability substitutes some channels of the anchor image with the matching channels from non-anchor images from the same set."
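As a rough illustration of that description, the following Python sketch applies a channel swap to a set of co-registered multispectral arrays covering the same location. The function name, parameters, and default probability are placeholders chosen for this example, not the authors' reference implementation.

```python
import numpy as np

def mixchannel(images, p=0.5, rng=None):
    """Channel-substitution augmentation (illustrative sketch, not the authors' code).

    images: list of co-registered multispectral arrays of shape (H, W, C),
            all covering the same location at different acquisition times.
    p:      probability of replacing each channel of the anchor image.
    """
    rng = rng or np.random.default_rng()
    anchor_idx = rng.integers(len(images))      # choose one image as the anchor
    augmented = images[anchor_idx].copy()
    donors = [i for i in range(len(images)) if i != anchor_idx]
    for c in range(augmented.shape[-1]):        # walk over the spectral channels
        if donors and rng.random() < p:         # with the predefined probability...
            donor = rng.choice(donors)          # ...swap in the matching channel
            augmented[..., c] = images[donor][..., c]
    return augmented

# Example: six hypothetical 256x256 images of the same area with 12 spectral bands
scene = [np.random.rand(256, 256, 12).astype(np.float32) for _ in range(6)]
sample = mixchannel(scene, p=0.5)
```

Because the swapped channels come from real acquisitions of the same area, the augmented image stays physically plausible while still adding variability to a tiny training set.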

Tested on satellite imagery of the Arkhangelsk region, which is typically cloudy, the MixChannel augmentation system considerably boosted the performance of three neural networks (DeepLab, U-Net, and U-Net++) despite a dataset comprising just six acceptable images.

"The average gain over the baseline solution is 7.5 per cent from 0.696 F1-score to 0.77, while the average variance drops more than twice from 0.17 to 0.077," the team finds in the conclusion to the paper. "Further improvement was achieved by adding auxiliary heights data, giving the overall accuracy of 0.81. It proves that the proposed approach can be combined with other techniques to get the synergy effect."

The team's work has been published under open-access terms in the journal Remote Sensing.

Gareth Halfacree
Freelance journalist, technical author, hacker, tinkerer, erstwhile sysadmin. For hire: freelance@halfacree.co.uk.