Arm Unveils Lumex, a High-Performance Platform for a Future in Which "AI Is Our Oxygen"

Launching with a claimed fivefold performance boost for on-device artificial intelligence workloads, Lumex promises freedom from the cloud.

Arm has announced its next-generation mobile platform, Lumex, designed for a future in which on-device artificial intelligence (AI) "is our oxygen" — and promising a fivefold boost in on-device AI performance over the company's current-generation flagship intellectual property (IP).

"Lumex is a next-generation hardware platform built for flagship experiences, and it's AI ready with SME2 [delivering] up to five times the performance and three times the efficiency uplift for on-device AI, and it's got double digit IPC [Instructions Per Cycle] gains," Arm's James McNiven told us in a pre-launch briefing. "And with our next-gen Mali GPU, again, we're delivering double-digit performance improvements for both graphics and AI and a doubling in ray tracing performance. It's three nanometer ready, and we provide physical implementations and foundry collaboration to speed that partner time to market, and it's built to scale."

Lumex — which includes a flagship C1-Ultra, a smaller but similarly performing sub-flagship C1-Premium, a high-efficiency C1-Pro, and the C1-Nano for wearables and other ultra-small-form-factor devices — targets a future that Arm's Chris Bergey sees clearly: "AI is our oxygen," he told us, "every breath counts. Relying on cloud to scale isn't sustainable. We need to evolve our computing to keep pace with the rapid growth of AI. We need robust AI compute platforms, and our job is to deliver high-performing CPUs and GPUs with that."

At the heart of the claimed fivefold performance uplift for artificial intelligence workloads is the company's Scalable Matrix Extension 2 (SME2), an extension to its Armv9.3 CPU IP. In real-world use cases, Bergey claims, this has been shown to deliver a 2.4x performance gain for text-to-speech models, a 40 percent reduction in on-device large language model (LLM) response time, and camera denoising at 4K30 or 1080p120 — on a single core and without the need for GPU offload or accelerator coprocessors, the company says.

That's not to say there isn't a GPU in the mix, though: the Lumex platform includes the Mali G1-Ultra graphics processor, which Arm claims offers a doubling in ray tracing performance through the company's new Ray Tracing Unit 2 (RTU2) IP. For traditional graphics workloads, the gain is a smaller but still noticeable 20 percent — and the same applies to artificial intelligence workloads, if the performance of the SME2-enabled CPU hardware alone proves insufficient.

As always, Arm will be providing the IP blocks that make up the Lumex platform to customers under license — but McNiven also confirmed "production-ready physical implementations" for 3nm processes at multiple unnamed foundries, which licensees can use to "shorten their time to converge on achieving compelling frequency or power and area [results] and ensuring first-time silicon success" — and that early-access customers have already successfully taped out Lumex parts.

More information is available on the Arm website; as usual, the company has not publicly announced pricing or a timescale for when the first Lumex-powered devices, which are expected to run the gamut from PCs and tablets to smartphones and wearables, will hit shop shelves.

Gareth Halfacree
Freelance journalist, technical author, hacker, tinkerer, erstwhile sysadmin. For hire: freelance@halfacree.co.uk.