Jetson Gets a Rocket Boost: Hands-On with the NVIDIA Jetson Orin Nano Developer Kit

Now available, the Jetson Orin Nano Developer Kit comes with a headline-grabbing claim of 80 times the performance of the Jetson Nano.

It's been six months since NVIDIA's Jensen Huang took to the stage at the GPU Technology Conference (GTC) '22 and unveiled the high-performance low-power Jetson Orin Nano system-on-module (SOM), targeting artificial intelligence (AI) at the edge. Now, it's GTC '23 — and Huang is back on stage, this time launching the Jetson Orin Nano Developer Kit bundle with immediate availability.

Offering a claimed "80x" the performance of the Jetson Nano, the new Jetson Orin Nano is designed to bring the company's latest-generation Orin system-on-chip, with its Ampere graphics processing unit (GPU) architecture, to the entry level — but can it deliver on the company's lofty promises?

The Hardware

The Jetson Orin Nano SOM is actually available in two flavors, one being effectively half the performance of the other. The entry-level Jetson Orin Nano 4GB has, as the name implies, 4GB of LPDDR5 memory with 34GB/s of bandwidth, a six-core Arm Cortex-A78AE CPU running at up to 1.5GHz, and an Orin GPU with 512 CUDA cores and 16 Tensor cores offering a claimed 20 tera-operations per second (TOPS) of INT8 compute. The Jetson Orin Nano 8GB, by contrast, doubles both the capacity and the bandwidth of the on-board memory as well as the number of CUDA cores and Tensor cores in the GPU, but leaves the CPU alone, delivering a claimed 40 TOPS.

For those looking to buy the Developer Kit, the difference is moot: it's only available with the Jetson Orin Nano 8GB pre-bundled. The carrier board, onto which the SOM comes handily pre-installed, will technically accept either SOM or even one of the more powerful Jetson Orin NX SOMs — but that flexibility is academic, as you can't buy the carrier board outside of the Developer Kit bundle.

The compact carrier board itself breaks out the most essential features of the Jetson Orin Nano SOM. There's a single DisplayPort 1.2 video output, four USB 3.2 Gen. 2 Type-A ports, a gigabit Ethernet port, and a USB Type-C port — not for power, which is handled via a bundled 45W PSU and a DC jack, but for debugging and USB Device operation.

To the left of the main ports are two 22-pin MIPI Camera Serial Interface (CSI) ports, which can be used in place of or in addition to USB cameras. To the right is a populated 40-pin general-purpose input/output (GPIO) header which includes UART, SPI, I2C, I2S, and pulse-width modulation (PWM) support. There's a PWM speed-controlled header for the bundled heatsink and fan, too, along with a 12-pin header for external button control — positioned somewhat awkwardly beneath the upper edge of the SOM.
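If you're wondering how that 40-pin header is driven in practice, a minimal sketch using NVIDIA's Jetson.GPIO Python library is below. The choice of physical pin 7 is purely illustrative, and you'll need the library installed and suitable permissions before anything will actually toggle.

    # Minimal blink sketch using NVIDIA's Jetson.GPIO library on the 40-pin header.
    # Pin 7 is an arbitrary example; check the pinout before wiring anything up.
    import time
    import Jetson.GPIO as GPIO

    LED_PIN = 7                       # physical pin number on the 40-pin header

    GPIO.setmode(GPIO.BOARD)          # use physical (board) pin numbering
    GPIO.setup(LED_PIN, GPIO.OUT, initial=GPIO.LOW)

    try:
        for _ in range(10):           # blink for ten seconds, then tidy up
            GPIO.output(LED_PIN, GPIO.HIGH)
            time.sleep(0.5)
            GPIO.output(LED_PIN, GPIO.LOW)
            time.sleep(0.5)
    finally:
        GPIO.cleanup()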

Flip the board over, and you'll find its last expansion options: a microSD slot nearly invisibly hidden underneath the top edge of the SOM, an M.2 Key E slot pre-populated with an 802.11ac Wi-Fi module, and two M.2 Key M slots with four and two PCI Express (PCIe) Gen. 3 lanes respectively — but note that one is limited to short 2230-sized modules, with the other offering a mounting point for 2280 modules only.

  • CPU: 6-core Arm Cortex-A78AE v8.2 with 1.5MB L2 and 4MB L3 cache, up to 1.5GHz
  • GPU: NVIDIA Ampere with 1,024 CUDA Cores, 32 Tensor Cores (512/16 on the 4GB module), up to 625MHz
  • Accelerators: None
  • RAM: 8GB LPDDR5 68GB/s (4GB 34GB/s on the 4GB module)
  • Storage: microSD (on module), M.2 Key M 4-lane NVMe and M.2 Key M 2-lane NVMe (on baseboard)
  • USB: 4× USB 3.2 Gen. 2 Type-A, 1× USB 2.0 Type-C for Debug/Device Mode
  • Connectivity: Gigabit Ethernet, M.2 Key E 802.11ac 2.4/5GHz
  • Display Outputs: DisplayPort 1.2
  • Camera Inputs: 2× MIPI CSI-2
  • GPIO: 40-pin header (populated) with UART, SPI, I2C, I2S, PWM
  • Video Encode (H.264): Software only, up to 3× 1080p30
  • Video Decode (H.265/H.264): 1× 4K60, 2× 4K30, 5× 1080p60, or 11× 1080p30
  • Dimensions: 100×79×21mm (around 3.94×3.11×0.83") including carrier

Performance

NVIDIA's big headline-grabbing claim is that the Jetson Orin Nano delivers "80x the AI performance" of the older and considerably cheaper Jetson Nano — and that's technically true, but with a big caveat. NVIDIA's calculations are based on raw compute using FP16 precision on the Jetson Nano but INT8 precision on the Jetson Orin Nano. Using FP32 precision on both devices, for a level playing field, the gain drops from 80x to a still-impressive 5.4x — though INT8-versus-FP16 is still a reasonable comparison to make, as support for INT8 precision will be a big reason to upgrade to the new device.
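For those who like to see where the multipliers come from, a quick back-of-the-envelope calculation gets close to both figures. Note that the Jetson Nano throughput numbers and the Orin Nano FP32 estimate below are derived from core counts and peak clocks rather than taken from NVIDIA's own comparison, so treat them as approximations.

    # Rough sanity check on the 80x and 5.4x figures. The 40 TOPS value is
    # NVIDIA's claimed INT8 throughput for the Orin Nano 8GB; the rest are
    # estimates derived from core counts and peak clock speeds.
    orin_int8_tops = 40.0      # Jetson Orin Nano 8GB, claimed INT8 throughput
    orin_fp32_tflops = 1.28    # 1,024 CUDA cores x 625MHz x 2 FLOPs/cycle
    nano_fp16_tflops = 0.472   # 128 CUDA cores x ~921MHz x 4 FLOPs/cycle
    nano_fp32_tflops = 0.236   # 128 CUDA cores x ~921MHz x 2 FLOPs/cycle

    # INT8-versus-FP16, the basis of the 80x claim: roughly 85x with these estimates
    print(f"INT8 vs FP16: {orin_int8_tops / nano_fp16_tflops:.0f}x")
    # FP32-versus-FP32, the level playing field: roughly 5.4x
    print(f"FP32 vs FP32: {orin_fp32_tflops / nano_fp32_tflops:.1f}x")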

That's not to say the uplift comes solely from INT8 support, however — after all, that 5.4x FP32 gain has to come from somewhere. As well as a switch to a new and more powerful GPU architecture, the Jetson Orin Nano packs eight times as many CUDA cores as the Jetson Nano, plus 32 Tensor cores for good measure. The processor has also moved to a newer Arm Cortex architecture and boasts two additional cores running at a marginally faster clock speed, and there's double the memory, with the move to LPDDR5 offering more than two and a half times the bandwidth.

In short, the Jetson Orin Nano is a beast — but there have been a few sacrifices made along the way. Like the Jetson Nano, it lacks the NVIDIA Deep Learning Accelerators (NVDLAs) and Programmable Vision Accelerators (PVAs) found in the higher-end NX and AGX models. Oddly, the Jetson Orin Nano also lacks the hardware video encoder, NVENC, of the Jetson Nano — a sacrifice made, NVIDIA tells us, in order to bring the Orin architecture closer to the entry level. As a result, the Jetson Orin Nano is incapable of encoding a video stream above 1080p30 in real time, although its six powerful CPU cores do mean it can handle up to three such streams simultaneously if required — compared to up to eight 1080p30 streams, or one 4K60 stream, on the otherwise much less powerful Jetson Nano.
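To make the "software only" point concrete, the sketch below drives GStreamer's stock x264enc CPU encoder from Python, which is roughly what a 1080p30 encode now looks like on this board. The pipeline string and output filename are illustrative, and the GStreamer Python bindings plus the plugin package providing x264enc (typically gstreamer1.0-plugins-ugly) need to be installed first.

    # CPU-only H.264 encode via GStreamer's x264enc element, standing in for
    # the NVENC block the Orin Nano no longer has. Encodes a ten-second
    # 1080p30 test pattern to an MP4 file.
    import gi
    gi.require_version("Gst", "1.0")
    from gi.repository import Gst

    Gst.init(None)
    pipeline = Gst.parse_launch(
        "videotestsrc num-buffers=300 ! "
        "video/x-raw,width=1920,height=1080,framerate=30/1 ! "
        "x264enc speed-preset=ultrafast tune=zerolatency ! "
        "h264parse ! mp4mux ! filesink location=cpu-encode-test.mp4"
    )
    pipeline.set_state(Gst.State.PLAYING)
    # Block until the clip finishes encoding, or an error is reported
    bus = pipeline.get_bus()
    bus.timed_pop_filtered(
        Gst.CLOCK_TIME_NONE, Gst.MessageType.EOS | Gst.MessageType.ERROR
    )
    pipeline.set_state(Gst.State.NULL)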

The Jetson Orin Nano's real focus, naturally, is on-device edge AI, and it's here the GPU-heavy SOM shines. Networks which were entirely unusable on the Jetson Nano, like NVIDIA's ActionRecognitionNet 3D and PeopleNet v2.5, suddenly become usable: ActionRecognitionNet 3D goes from one frame per second (FPS) on the Jetson Nano to 26 FPS on the Jetson Orin Nano, with the 2D variant going from 32 FPS to 368 FPS; BodyPoseNet goes from 3 FPS to 136 FPS; and PeopleNet v2.5 goes from 2 FPS to 116 FPS. A benchmark license plate recognition (LPR) network, meanwhile, was excluded from the comparison above, as its jump from 47 FPS to over 1,000 FPS on the new hardware skewed the scaling.
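For reference, the speedups implied by those published frame rates work out roughly as shown below; the single-digit Jetson Nano figures are heavily rounded, so the exact multipliers are approximate.

    # Speedups implied by the benchmark figures quoted above
    # (FPS pairs are as published: Jetson Nano first, Jetson Orin Nano second).
    benchmarks = {
        "ActionRecognitionNet 3D": (1, 26),
        "ActionRecognitionNet 2D": (32, 368),
        "BodyPoseNet": (3, 136),
        "PeopleNet v2.5": (2, 116),
    }
    for name, (nano_fps, orin_fps) in benchmarks.items():
        print(f"{name}: {orin_fps / nano_fps:.0f}x faster")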

That's anywhere from a 12x to a 59x performance gain — some from the raw performance of the hardware, some from its support for INT8 precision. Efficiency hasn't been lost, either. While the new hardware may require active cooling — supplied in the form of a surprisingly quiet heatsink and fan assembly pre-installed on the module — the Jetson Orin Nano 8GB is configurable in 15W full-performance and 7W reduced-power modes, up from the 5W and 10W modes of the Jetson Nano. Measured at the wall, with the carrier board connected to a Wi-Fi network, DisplayPort monitor, and wireless keyboard and mouse, that comes out at 4W idle and around 17W peak under load — well within the capabilities of the bundled 45W power supply.
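Switching between those profiles is handled by the nvpmodel utility; the sketch below simply wraps it from Python. The mode indices are assumptions based on the shipping JetPack configuration for this board, so confirm them with sudo nvpmodel -q on your own unit before relying on them.

    # Query and switch the Orin Nano's power profile via nvpmodel (needs sudo).
    # Assumed mapping for this board: mode 0 = 15W, mode 1 = 7W.
    import subprocess

    def current_power_mode() -> str:
        # nvpmodel -q reports the active power mode
        return subprocess.run(
            ["sudo", "nvpmodel", "-q"], capture_output=True, text=True, check=True
        ).stdout.strip()

    def set_power_mode(mode: int) -> None:
        # e.g. set_power_mode(1) to drop to the reduced-power profile
        subprocess.run(["sudo", "nvpmodel", "-m", str(mode)], check=True)

    print(current_power_mode())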

Software

The Jetson Orin Nano is designed to run JetPack, NVIDIA's embedded software stack based on Ubuntu Linux and formerly known as Linux 4 Tegra (L4T). All testing of the review unit was carried out on a pre-release version of JetPack 5.1.1 with a few minor bugs — including one which limited available system memory to 6.3GB out of the physical 8GB, which the company assures us will be resolved in the very near future.

JetPack 5.1.1 is based on Ubuntu 20.04.5 Long-Term Support (LTS), and comes with CUDA 11.4, TensorRT 8.5, cuDNN 8.6, VPI 2.2, Vulkan 1.3, Nsight Systems 2022.5, and Nsight Graphics 2022.6. What it doesn't come with, even after all these years, is any way to train networks on-device. While it's possible to deploy even complex networks and run them on the Jetson Orin Nano with great performance, any training has to take place off-device — a big and ongoing drawback for a device which could easily have been an all-in-one affordable workstation for AI development and experimentation.

The typical way to solve the problem is with a desktop-class machine featuring one of NVIDIA's high-power, high-performance graphics cards. An alternative is to move to cloud compute, including NVIDIA's own GPU Cloud (NGC), renting access to someone else's hardware rather than shelling out for your own only for it to sit idle between training sessions. Either way, the Jetson Orin Nano — like all Jetson-family devices — is backed by an impressive software stack, which includes the Train-Adapt-Optimize (TAO) Toolkit to speed training and the Omniverse Replicator for synthetic dataset generation. Those opting to run in the cloud on NGC can even access a host of pre-trained models, ready for deployment or customization.
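Whichever route the training takes, deployment on the Jetson Orin Nano itself usually ends with TensorRT. The sketch below shows one plausible path under that assumption, building an INT8 engine from an ONNX export of an off-device training run: the file names are placeholders, and a real INT8 build also needs a calibrator or a quantization-aware-trained model with Q/DQ nodes.

    # Build an INT8 TensorRT engine on-device from an ONNX model trained elsewhere.
    # "model.onnx" and "model.engine" are placeholder file names.
    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    )
    parser = trt.OnnxParser(network, logger)

    with open("model.onnx", "rb") as f:       # exported from the training machine
        if not parser.parse(f.read()):
            raise RuntimeError(parser.get_error(0))

    config = builder.create_builder_config()
    config.set_flag(trt.BuilderFlag.INT8)     # target the Orin GPU's INT8 Tensor cores
    config.set_flag(trt.BuilderFlag.FP16)     # allow FP16 fallback for unsupported layers
    # A production build needs an INT8 calibrator (or a Q/DQ-quantized model) here:
    # config.int8_calibrator = MyEntropyCalibrator(...)

    engine_bytes = builder.build_serialized_network(network, config)
    with open("model.engine", "wb") as f:
        f.write(engine_bytes)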

As with other models in the Jetson range, the deeper you dive into development and testing, the sooner you'll hit the limits of the microSD card storage. There's no eMMC support on the Jetson Orin Nano, but there are two M.2 Key M PCI Express (PCIe) slots on the underside of the carrier board — either or both of which will accept a high-speed Non-Volatile Memory Express (NVMe) solid-state drive for high-capacity storage. There's a third M.2 slot, too, Key E this time, though it comes pre-populated with the Wi-Fi module connected to a pair of PCB antennas at one edge of the carrier board.

Conclusion

The Jetson Orin Nano Developer Kit is most assuredly a worthy successor to the original Jetson Nano Developer Kit. Even on a completely level playing field it's several times faster — and when you factor in INT8 support, it offers a performance gain which has to be seen to be believed. Only the loss of hardware video encode should give pause, but unless you need more than 1080p30 or more than three simultaneous streams, you'll likely be fine using software encoding.

That performance does come at a cost — quite literally. The Jetson Nano 4GB Developer Kit may have had a price hike, going from $99 at launch to $149 today, but it's still a lot cheaper than the $499 at which NVIDIA has launched the Jetson Orin Nano Developer Kit. It's easy to see where the price difference goes, given that major increase in performance, but it shifts the Nano suffix from indicating a maker-friendly product with a near-impulse price point to one demanding a more considered purchasing decision from anyone except well-funded corporate research and development types.

There is, at least, a discount available for educators — bringing the price of the kit from $499 down to $399. Even then, it's likely many will opt for the cheaper Jetson Nano Developer Kit, despite the vast performance differential, on the grounds that it provides enough computational grunt to showcase the core concepts behind edge AI without breaking the bank. Those needing the very best in performance, meanwhile, will likely splash out on the more powerful Jetson AGX Orin Developer Kit instead.

If the budget stretches, the Jetson Orin Nano Developer Kit won't disappoint. For those who outgrow the hardware, it's nice to have the option of dropping a Jetson Orin NX module into the existing carrier board, too — although it'd be nicer still if NVIDIA offered a cost-reduced bundle with the Jetson Orin Nano 4GB SOM variant.

The Jetson Orin Nano Developer Kit is now available to order via the official Jetson Store at $499.

Gareth Halfacree
Freelance journalist, technical author, hacker, tinkerer, erstwhile sysadmin. For hire: freelance@halfacree.co.uk.