DeepStream 7.0 Simplifies Vision AI
NVIDIA's DeepStream 7.0 optimizes vision AI for edge devices with new Python APIs, GStreamer interfaces, and automated parameter tuning.
Some of the most computationally expensive and complex machine learning algorithms involve computer vision. These algorithms require large amounts of data and significant computational power to analyze and interpret visual information from the world. Tasks such as object detection, image segmentation, and facial recognition rely on cutting-edge neural network architectures like convolutional neural networks and Transformer-based models, which can have many millions of parameters.
This presents a number of challenges: computer vision models are often most needed on edge computing devices, which have relatively limited computational resources available to them. Furthermore, many applications require real-time processing of visual data to be useful. Needless to say, large, complex algorithms and small hardware platforms do not play well together under these constraints. As such, powerful toolkits are needed to optimize computer vision models, and also to make them simpler to build and deploy so that they are more accessible.
NVIDIA’s DeepStream is one such toolkit. DeepStream has long helped developers build and deploy end-to-end vision AI pipelines, and its off-the-shelf plugins have made it easier to optimize those pipelines for edge computing hardware. The latest release, DeepStream 7.0, was recently announced, and it includes a number of features that could be very helpful to developers of vision AI applications. The updates broadly focus on simplifying development and optimization.
Python is a favorite programming language among AI developers. Sure, most of the underlying libraries are written in lower-level languages like C++ for speed, but having Python APIs to interact with those lower-level tools makes development much faster, and also makes the end product much easier to understand and explain. With the DeepStream 7.0 upgrade, many new Python APIs were introduced, which allow developers to handle everything from pre-processing to inference and post-processing with Python code.
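DeepStream's actual Python bindings are documented by NVIDIA and are not reproduced here, but the general shape of such a pipeline can be sketched in plain Python. All function names below are hypothetical, purely for illustration of the pre-process, infer, post-process flow:

```python
import numpy as np

def preprocess(frame):
    """Normalize a raw frame into the flat float tensor a model
    might expect (illustrative only)."""
    return (frame.astype(np.float32) / 255.0).reshape(1, -1)

def infer(tensor):
    """Stand-in for a real model call; returns a fake score per batch item."""
    return {"scores": tensor.mean(axis=1)}

def postprocess(output, threshold=0.5):
    """Keep only detections above a confidence threshold."""
    return [float(s) for s in output["scores"] if s >= threshold]

# Dummy 4x4 grayscale "frame" standing in for real video input.
frame = np.full((4, 4), 200, dtype=np.uint8)
detections = postprocess(infer(preprocess(frame)))
```

In a real DeepStream application, each of these stages would be handled by the toolkit's own plugins and bindings; the point here is only that all three stages can now be driven from Python.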
GStreamer is a very popular and powerful framework for building complex media processing workflows. It also has a very steep learning curve, which can make it frustrating for developers to work with. To help with this problem, DeepStream now includes a feature called DeepStream Service Maker. This gives users an abstraction over GStreamer that allows them to rapidly build up media processing pipelines with a greatly simplified interface. Using DeepStream Service Maker, hundreds of lines of code can be distilled into just a few that are much easier to implement and understand.
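To illustrate the general idea of that kind of abstraction (this is not Service Maker's actual API, just a toy sketch), a fluent builder can collapse element creation and linking into a short chain of calls:

```python
class Pipeline:
    """Toy fluent pipeline builder, loosely inspired by the idea of
    hiding raw GStreamer element wiring behind a simple interface."""

    def __init__(self, name):
        self.name = name
        self.elements = []

    def add(self, element, **props):
        """Append an element with its properties; returning self
        enables method chaining."""
        self.elements.append((element, props))
        return self

    def describe(self):
        """Render the linked chain of elements as a string."""
        return " -> ".join(name for name, _ in self.elements)

# A three-element pipeline built in one expression.
p = (Pipeline("demo")
     .add("source", uri="file:///video.mp4")
     .add("infer", model="detector")
     .add("sink"))
```

With raw GStreamer, each of those steps would involve creating, configuring, and explicitly linking elements and handling the bus; an abstraction like this is how a few lines can replace many.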
Those working on object tracking applications will be happy to see that Single-View 3D Tracking has also received some significant updates. Using this utility, developers can accurately track objects in three-dimensional space with just a single camera. The updates promise a clearer, more defined representation of movement, even in the presence of occlusions.
Another notable addition is the inclusion of PipeTuner 1.0. Pipelines often have a large set of parameters that must be tuned for accurate operation in each use case. This traditionally required deep, scenario-specific knowledge from expert developers, making the tuning process time-consuming and expensive. But PipeTuner 1.0 automates the process of finding the optimal parameters, saving time and money, and also enhancing system accuracy.
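PipeTuner's own search strategy isn't detailed here, but the core idea of automated parameter tuning can be sketched with a simple random search over a parameter space. The `evaluate` function below is a made-up stand-in for scoring pipeline output against labeled data:

```python
import random

def evaluate(params):
    """Stand-in accuracy metric with a made-up optimum at
    threshold=0.5, nms=0.4; a real tuner would score pipeline
    output against a labeled dataset."""
    return 1.0 - abs(params["threshold"] - 0.5) - abs(params["nms"] - 0.4)

def random_search(space, trials=200, seed=42):
    """Sample random points from the space and keep the best scorer."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(trials):
        params = {k: rng.uniform(lo, hi) for k, (lo, hi) in space.items()}
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

space = {"threshold": (0.0, 1.0), "nms": (0.0, 1.0)}
best, score = random_search(space)
```

Even this naive strategy replaces hand-tuning with a measurable, repeatable procedure, which is the essential value a tool like PipeTuner provides (its actual search is presumably far more sophisticated).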
For a fuller description of these and other enhancements in the 7.0 release of DeepStream, be sure to check out the official announcement. There is also a getting started guide available to help developers come up to speed quickly.