PARP Pruning Approach Boosts Performance, Reduces Error Rate of Automatic Speech Recognition Models

Designed to cut down on computationally costly fine-tuning, this prune-adjust-re-prune method can boost both performance and accuracy.

A team of researchers at the Massachusetts Institute of Technology (MIT), UC Santa Barbara, and National Taiwan University has come up with a new way to reduce the size of speech recognition networks and improve their performance, without a loss in accuracy: Prune, Adjust, and Re-Prune, or PARP.

"[PARP] discovers and fine-tunes subnetworks for much better performance, while only requiring a single downstream ASR [Automatic Speech Recognition] fine-tuning run," the researchers explain of their work, which was brought to our attention by IEEE Spectrum. "PARP is inspired by our surprising observation that sub-networks pruned for pre-training tasks need merely a slight adjustment to achieve a sizeable performance boost in downstream ASR tasks."

The idea behind PARP is to take a pre-trained speech recognition model, run a pruning pass that simply sets weak links' strengths to zero rather than removing them from the model altogether, then run a fine-tuning pass on labeled data during which those zeroed weights are free to adjust before the network is pruned again. This cuts out the double fine-tuning approach used by other pruning methods like One-shot Magnitude Pruning (OMP).
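For readers who want to see the shape of that loop in code, the snippet below is a minimal PyTorch sketch of a prune-adjust-re-prune cycle using magnitude-based masking. It is illustrative only: the tiny linear model, random data, sparsity level, and re-prune interval are placeholder assumptions, not the wav2vec 2.0 setup the researchers actually used.

```python
# Minimal prune -> adjust -> re-prune sketch (illustrative, not the paper's code).
import torch
import torch.nn as nn

def magnitude_mask(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Return a 0/1 mask that keeps the largest-magnitude weights."""
    k = int(weight.numel() * sparsity)  # number of weights to zero out
    if k == 0:
        return torch.ones_like(weight)
    threshold = weight.abs().flatten().kthvalue(k).values
    return (weight.abs() > threshold).float()

model = nn.Linear(64, 32)  # stand-in for a pre-trained network
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
sparsity = 0.5

# 1) Prune: zero out low-magnitude weights, but keep them in the model.
mask = magnitude_mask(model.weight.data, sparsity)
model.weight.data *= mask

for step in range(100):
    # 2) Adjust: ordinary fine-tuning updates *all* weights, so the
    #    zeroed ones can grow back if the downstream task needs them.
    x = torch.randn(16, 64)
    target = torch.randn(16, 32)
    loss = nn.functional.mse_loss(model(x), target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # 3) Re-prune: periodically recompute the mask at the same sparsity
    #    and zero whichever weights are now the weakest.
    if step % 10 == 0:
        mask = magnitude_mask(model.weight.data, sparsity)
        model.weight.data *= mask
```

Because only one fine-tuning run is needed, the pruning and adjustment happen within the same training loop rather than across separate prune-then-retrain stages.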

The results are impressive: "On the 10min Librispeech split without LM decoding, PARP discovers sub-networks from wav2vec 2.0 with an absolute 10.9%/12.6% WER [Word Error Rate] decrease compared to the full model," the researchers write. "We further demonstrate the effectiveness of PARP via: cross-lingual pruning without any phone recognition degradation, the discovery of a multi-lingual sub-network for 10 spoken languages in one fine-tuning run, and its applicability to pre-trained BERT/XLNet for natural language tasks."

In other words: PARP requires less computational effort than OMP, and in some cases produces a network that is not only smaller but demonstrably less error-prone than its unpruned equivalent.

The team's work is to be presented at the Neural Information Processing Systems (NeurIPS) conference this month, and is available under open access terms on OpenReview.net. A source code repository linked to the paper has not yet been populated, however.

Gareth Halfacree
Freelance journalist, technical author, hacker, tinkerer, erstwhile sysadmin. For hire: freelance@halfacree.co.uk.