Intel Goes Gunning for NVIDIA's Gen AI Crown with Its Gaudi 3 Accelerator

Company claims its accelerator can halve training time, boost inference performance, and deliver better power efficiency than NVIDIA's H100.

Intel has revealed more details on its next-generation Gaudi 3 accelerator for artificial intelligence (AI) workloads — claiming that it is "projected" to beat rival NVIDIA's H100 in both performance and power efficiency, while promising availability to its original equipment manufacturer (OEM) partners in the second quarter of this year.

"In the ever-evolving landscape of the AI market, a significant gap persists in the current offerings. Feedback from our customers and the broader market underscores a desire for increased choice. Enterprises weigh considerations such as availability, scalability, performance, cost, and energy efficiency," claims Intel's Justin Hotard.

"Intel Gaudi 3 stands out as the Gen AI [Generative Artificial Intelligence] alternative presenting a compelling combination of price performance, system scalability, and time-to-value advantage."

Intel chief executive officer Pat Gelsinger unveiled the Gaudi 3 back in December last year, as part of what he described as a mission to "usher in the age of the AI PC." At the time, though, few details were available — other than a desire to go toe-to-toe with GPU-based accelerators from AMD and NVIDIA and to launch some time in 2024 as part of a "suite of AI accelerators."

Now, during the Intel Vision 2024 event, the company has offered some actual figures for the performance of the new accelerator. Compared to Gaudi 2, Intel claims, Gaudi 3 delivers a fourfold boost in compute performance in BF16 precision, a 1.5x increase in memory bandwidth, and a doubling of the network bandwidth.

Built on a 5nm process node, the accelerator includes 64 custom, programmable Tensor Processor Cores (TPCs), eight Matrix Multiplication Engines (MMEs), and 128GB of on-board HBM2e memory plus 96MB of additional static RAM (SRAM). For network connectivity, each accelerator includes 24 200-gigabit Ethernet ports.
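To put the networking figure in context, the short Python sketch below multiplies out the port count and per-port speed quoted above. It assumes raw line rate in one direction with no protocol overhead, which is an illustrative simplification rather than a figure Intel has published; the function and variable names are made up for the example.

```python
# Figures quoted in the article: 24 Ethernet ports at 200 gigabits per second each.
PORTS = 24
GBIT_PER_PORT = 200


def aggregate_bandwidth_gbit(ports: int, gbit_per_port: int) -> int:
    """Return the combined raw line rate of all ports in gigabits per second.

    Assumption: every port runs at full line rate with no protocol overhead.
    """
    return ports * gbit_per_port


total_gbit = aggregate_bandwidth_gbit(PORTS, GBIT_PER_PORT)
total_tbit = total_gbit / 1000   # gigabits to terabits
total_gbyte = total_gbit / 8     # gigabits to gigabytes

print(f"Aggregate: {total_gbit} Gb/s = {total_tbit} Tb/s = {total_gbyte} GB/s")
```

Under those assumptions the 24 ports add up to 4,800 gigabits per second per accelerator, or 600 gigabytes per second.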

The Gaudi 3 accelerator will also be made available as a PCI Express add-in board in addition to the usual mezzanine card design, Intel has revealed, using a full-height form factor and drawing 600W of power — making it the go-to choice, the company says, for fine-tuning, inference, and retrieval-augmented generation (RAG) workloads.

Naturally, that puts it in direct competition with GPUs and accelerators from NVIDIA, and Intel claims the Gaudi 3 "is projected to deliver" a 50 per cent reduction in training time for the Llama2-7B and -13B and GPT-3 175B large language models (LLMs), a 50 per cent boost in inference performance, and 40 per cent better power efficiency compared to NVIDIA's H100 accelerator.
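The percentages in that claim are not all the same kind of number: a cut in time and a boost in performance convert to ratios differently. The Python sketch below shows the arithmetic, assuming that "better power efficiency" means performance per watt and that each percentage is relative to an H100 baseline of 1.0. Both are readings of the claim, not definitions Intel has supplied, and the helper names are hypothetical.

```python
def speedup_from_time_reduction(reduction_pct: float) -> float:
    """Convert 'X per cent less time' into a throughput multiplier.

    Finishing the same work in (1 - X/100) of the time means working
    1 / (1 - X/100) times as fast.
    """
    return 1.0 / (1.0 - reduction_pct / 100.0)


def ratio_from_boost(boost_pct: float) -> float:
    """Convert 'X per cent more' into a multiplier over the baseline."""
    return 1.0 + boost_pct / 100.0


# Intel's projected figures versus NVIDIA's H100, as quoted in the article.
training_speedup = speedup_from_time_reduction(50)   # 50 per cent less training time
inference_ratio = ratio_from_boost(50)               # 50 per cent more inference performance
efficiency_ratio = ratio_from_boost(40)              # assumed to mean performance per watt

print(f"Training: {training_speedup:.1f}x the H100's throughput")
print(f"Inference: {inference_ratio:.1f}x the H100's performance")
print(f"Power efficiency: {efficiency_ratio:.1f}x the H100's")
```

Read this way, halving training time is a claim of twice the training throughput, a larger relative gain than the 1.5x projected for inference.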

The new accelerator is being made available to Intel's OEM partners, including Dell, Hewlett Packard Enterprise, Lenovo, and Supermicro first, in the second quarter of this year; general availability will follow in the third quarter for the main accelerators and in the fourth quarter for the Gaudi 3 PCIe add-in-board variant. Pricing, however, has yet to be disclosed.

Gareth Halfacree
Freelance journalist, technical author, hacker, tinkerer, erstwhile sysadmin. For hire: freelance@halfacree.co.uk.