NVIDIA Shrinks Blackwell Down to Deliver Big AI Performance Gains for Small Form Factor Workstations

New RTX PRO 4000 SFF and RTX PRO 2000, both based on the Blackwell architecture, are due to launch before the year's out.

NVIDIA has unveiled two new Blackwell-architecture graphics cards, the NVIDIA RTX PRO 4000 Blackwell SFF Edition and RTX PRO 2000 Blackwell — both targeting the acceleration of machine learning and artificial intelligence (ML and AI) workloads on smaller form factor systems — and in a much more reasonable 70W power envelope.

"Applications are becoming increasingly AI accelerated, and more users need AI performance, no matter the size or shape of their workstation," claims NVIDIA's Stacy Ozorio in support of the company's latest launch. "The RTX PRO 4000 SFF and RTX PRO 2000 feature fourth-generation RT [Ray Tracing] Cores and fifth-generation Tensor Cores with lower power in half the size of a traditional GPU. The new GPUs are designed to bring next-generation performance to a range of professional workflows, providing incredible speedups for engineering, design, content creation, AI and 3D visualization."

The low-profile cards, designed to slot into small form factor systems where full-size add-in boards are too tall, are based on NVIDIA's Blackwell architecture, the launch of which gave the company no small amount of trouble owing to a since-rectified design flaw that lowered the number of working chips manufacturing partner Taiwan Semiconductor (TSMC) could get from each silicon wafer.

Both cards are not only smaller than previous Blackwell boards but considerably less power-hungry, with NVIDIA claiming a 70W maximum power consumption even under load. Even so, the company says, the RTX PRO 4000 Blackwell SFF delivers "up to" 2.5 times the AI performance, 1.7 times the ray-tracing performance, and 1.5 times the bandwidth of its predecessor. The RTX PRO 2000 Blackwell, meanwhile, is said to offer big gains for on-device generative AI workloads, with 1.4 times the performance for image generation and 2.3 times for text generation compared to its last-generation equivalent.

In actual specification terms, that boils down to: 8,960 CUDA cores, fifth-generation Tensor and fourth-generation RT cores, two ninth-gen NVENC encoders and two sixth-gen NVDEC decoders, 24GB of GDDR7 RAM with error correcting code (ECC) on a 192-bit interface with 432GB/s bandwidth, and a PCI Express Gen. 5 eight-lane connection to the host for the RTX PRO 4000 Blackwell SFF; and 4,352 CUDA cores with fifth-generation Tensor and fourth-generation RT cores, one ninth-gen NVENC encoder and one sixth-gen NVDEC decoder, 16GB of GDDR7 RAM with ECC on a 128-bit interface with 288GB/s bandwidth, and the same PCIe Gen. 5 eight-lane connection for the RTX PRO 2000 Blackwell. NVIDIA claims an "effective FP4 AI TOPS" of 545 tera-operations per second (TOPS) for sparse work on the RTX PRO 2000 Blackwell, but has not released an equivalent figure for the RTX PRO 4000 Blackwell SFF.
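As a quick sanity check on the memory figures quoted above, the bus width and bandwidth of each card can be turned into an implied per-pin data rate using the usual relationship: bandwidth in GB/s equals bus width in bits, multiplied by the per-pin rate in Gb/s, divided by eight. The short Python sketch below does just that. The per-pin rate it prints is derived purely from the article's numbers and is not a figure quoted by NVIDIA here; the function and dictionary names are illustrative only.

```python
# Bus width and bandwidth as quoted in the article.
CARDS = {
    "RTX PRO 4000 Blackwell SFF": {"bus_bits": 192, "bandwidth_gb_s": 432},
    "RTX PRO 2000 Blackwell": {"bus_bits": 128, "bandwidth_gb_s": 288},
}


def implied_pin_rate_gbps(bus_bits, bandwidth_gb_s):
    """Per-pin data rate in Gb/s, assuming bandwidth = bus_bits * rate / 8."""
    return bandwidth_gb_s * 8 / bus_bits


for name, spec in CARDS.items():
    rate = implied_pin_rate_gbps(spec["bus_bits"], spec["bandwidth_gb_s"])
    print(f"{name}: {rate:g} Gb/s per pin (derived, not an official figure)")
```

If the two cards come out at the same per-pin rate, the difference in quoted bandwidth is explained entirely by the narrower bus on the RTX PRO 2000 Blackwell.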

More information on the new add-in boards is available on the NVIDIA website on the RTX PRO 4000 Blackwell SFF and RTX PRO 2000 Blackwell product pages; the company has confirmed that both will launch "later this year" from partners including PNY, TD SYNNEX, BOXX, Dell, HP, and Lenovo, though it had not yet publicly disclosed pricing at the time of writing.

Gareth Halfacree
Freelance journalist, technical author, hacker, tinkerer, erstwhile sysadmin. For hire: freelance@halfacree.co.uk.