Meta's Llama 2 Large Language Model Is Heading Out of the Cloud and On-Device, Qualcomm Promises

Using the Qualcomm AI Stack, the new Llama 2 LLM will run directly on Snapdragon devices starting in 2024.

ghalfacree
over 2 years ago • AI & Machine Learning

Facebook owner Meta has announced the release, under a free Community License Agreement, of its second-generation Llama 2 large-language model — and a partnership with Qualcomm to get the technology running on-device, starting with smartphones and PCs in 2024.

"We applaud Meta's approach to open and responsible AI and are committed to driving innovation and reducing barriers-to-entry for developers of any size by bringing generative AI on-device," claims Qualcomm's Durga Malladi of the collaboration between the two companies. "To effectively scale generative AI into the mainstream, AI will need to run on both the cloud and devices at the edge, such as smartphones, laptops, vehicles, and IoT devices."

Meta's first Llama large-language model was leaked ahead of its official release, but Llama 2 is freely licensed from the outset — available at no cost under a Community License Agreement, the company says, for both research and commercial use. "We're including model weights and starting code for the pretrained model," a Meta spokesperson confirms, "and conversational fine-tuned versions too. We believe that openly sharing today's large language models will support the development of helpful and safer generative AI too. We look forward to seeing what the world builds with Llama 2."

Qualcomm is planning to put Meta's new Llama 2 LLM directly on-device by next year, the company has announced. (📷: Qualcomm)

At present, the bulk of Meta's partnerships on Llama 2 implementations — including "preferred partner" Microsoft, Amazon Web Services, Hugging Face, and others — offer cloud-based access, in which inference takes place on a powerful remote server fed prompts sent from client devices. With Qualcomm, though, the plan is different: to optimize and scale the technology — which effectively acts as a supercharged autocomplete, stringing words together into a plausible but not-always-truthful response to natural-language prompts — for use on-device.

Qualcomm's implementation of Llama 2 will target smartphones, PCs, and other devices launching in 2024 and using the company's Snapdragon system-on-chip family. Unlike rivals, then, it'll work with zero connectivity — even with the handset in airplane mode — and has the potential to offer improved privacy guarantees over sending queries to third parties.

Those looking to get an early start can experiment with on-device AI on Snapdragon platforms with Qualcomm AI Stack, the company has confirmed, while developers can request access to Llama 2 on Meta's website — by agreeing to the Llama 2 Community License Agreement, which offers a royalty-free limited license for reproduction, distribution, modification, and derivation.
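Once access is granted, the conversational fine-tuned Llama 2 models expect prompts in the chat template Meta published alongside the weights. A minimal sketch of building a single-turn prompt in that format (the helper name here is our own invention; the `[INST]` and `<<SYS>>` markers are from Meta's documented template):

```python
def build_llama2_prompt(system: str, user: str) -> str:
    """Format a single-turn prompt in Llama 2's chat template.

    The [INST] / <<SYS>> markers follow the format Meta published
    for the chat-tuned Llama 2 models; the leading <s> token is
    normally added by the tokenizer, so it is omitted here.
    """
    return f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"


prompt = build_llama2_prompt(
    "You are a concise assistant.",
    "What is a large language model?",
)
print(prompt)
```

The resulting string can be passed to any Llama 2 chat checkpoint, whether served from the cloud or, once Qualcomm's port arrives, generated on-device.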
