NVIDIA Proposes DoRA the Fine-Tuner as a LoRA Replacement for Better AI Models
New approach to low-rank adaptation fine-tuning delivers better results across large language, vision, and other generative models.
NVIDIA researchers have developed an alternative to Low-Rank Adaptation (LoRA) for fine-tuning pre-trained machine learning and artificial intelligence (ML and AI) models, one they say offers higher performance: DoRA.
"DoRA consistently outperforms LoRA across a wide variety of large language model (LLM) and vision language model (VLM) tasks, such as common-sense reasoning (+3.7/+1.0 on Llama 7B/13B, +2.9 on Llama 2 7B, and +4.4 on Llama 3 8B), Multi-Turn (MT) Benchmark (+0.4/+0.3 for Llama/Llama 2 7B), image/video-text understanding (+0.9/+1.9 on VL-BART), and visual instruction tuning (+0.6 on LLaVA 7B)," NVIDIA's Min-Hung Chen claims of the company's research. "DoRA has also been demonstrated in other tasks, including compression-aware LLM and text-to-image generation."
As anyone who has jumped on the hype train and begun playing with LLMs and VLMs will attest, one of the biggest problems is training, a process that for large models requires a huge corpus of data and scads of power-hungry compute hardware. Retraining to tune the model is impractical, which is where post-training tuning comes in, with LoRA popular for its ability to deliver good results without the computational cost of less-efficient alternatives.
Weight-Decomposed Low-Rank Adaptation (DoRA), as the name implies, builds on the LoRA concept with improvements to both capacity and training stability. By decomposing the pretrained weights into magnitude and directional components, then fine-tuning both, DoRA delivers a rapid fine-tuning approach that outperforms LoRA across a range of model sizes and types, from text generators and vision language models to image generators.
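For readers who want a concrete picture of what that decomposition looks like, below is a minimal, illustrative PyTorch sketch of a DoRA-style linear layer. It is not NVIDIA's released code: the class name DoRALinear, the rank value, and the initialization scale are assumptions made for this example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DoRALinear(nn.Module):
    """Illustrative DoRA-style linear layer (a sketch, not NVIDIA's implementation).

    The frozen pretrained weight W0 is decomposed into a magnitude vector m
    (one scalar per output feature) and a direction V = W0 + B @ A, where
    B and A are the familiar low-rank LoRA factors. Only m, A, and B train.
    """

    def __init__(self, pretrained: nn.Linear, rank: int = 8):
        super().__init__()
        out_features, in_features = pretrained.weight.shape
        # Frozen pretrained weight W0.
        self.weight = nn.Parameter(pretrained.weight.detach().clone(), requires_grad=False)
        self.bias = (
            nn.Parameter(pretrained.bias.detach().clone(), requires_grad=False)
            if pretrained.bias is not None else None
        )
        # Low-rank direction update; B starts at zero so the layer initially
        # reproduces the pretrained output exactly.
        self.lora_A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, rank))
        # Magnitude component, initialized to the per-output-feature norm of W0.
        self.magnitude = nn.Parameter(self.weight.norm(p=2, dim=1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Direction: pretrained weight plus the low-rank update.
        direction = self.weight + self.lora_B @ self.lora_A
        # Normalize each output row to unit norm, then rescale by the learned magnitude.
        norm = direction.norm(p=2, dim=1, keepdim=True)
        weight = self.magnitude.unsqueeze(1) * direction / norm
        return F.linear(x, weight, self.bias)
```

In this sketch only the magnitude vector and the two low-rank factors receive gradients, so the trainable parameter count stays close to plain LoRA's while the frozen base weight is combined with the learned pieces at forward time.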
"DoRA consistently outperforms LoRA across various fine-tuning tasks and model architectures," Chen claims. "Moreover, DoRA can be considered a costless replacement for LoRA, as its decomposed magnitude and direction components can be merged back into the pretrained weight after the training, ensuring that there is no extra inference overhead."
DoRA has been published to GitHub under the NVIDIA Source Code License, with more information on the project page; a preprint of the team's paper is available on Cornell's arXiv server under open-access terms.