Recent advancements in generative AI, like Stable Diffusion and ChatGPT, are having a profound impact on a variety of industries. In the field of natural language understanding, generative AI is being used to develop chatbots and virtual assistants that understand and respond to human language more naturally. In content generation, it is producing realistic images, text, and music. In creative artistry, the technology is enabling new forms of art and design. And in problem solving, generative techniques are being used to develop algorithms that solve complex problems more efficiently.
Initially, generative AI algorithms were massive and demanded an exorbitant amount of computational resources. These early models, while groundbreaking in their capabilities, came with significant operational challenges. Training and running them required supercomputers or specialized high-performance computing clusters, which were prohibitively expensive for all but the largest organizations and research institutions. The enormous costs associated with these computational demands limited access to the technology and kept hobbyists and many researchers from experimenting with it.
However, over time, there has been a remarkable shift in this trend. Advances in AI research and engineering have led to the development of more efficient algorithms and model architectures. Additionally, optimization techniques and hardware innovations have played a crucial role in making generative AI more accessible.
Today, we are witnessing the emergence of powerful generative AI models that can run on smaller and less expensive hardware platforms. These models are designed to be more resource-efficient while maintaining or even surpassing the quality of earlier, larger models.
These recent developments have led NVIDIA to create the Jetson Generative AI Playground, which was designed to make running cutting-edge generative algorithms on edge computing hardware a simple exercise. With a few hundred dollars worth of hardware, the Generative AI Playground allows anyone to run LLM-based chatbots or image generation models locally. Maybe we can finally say goodbye to the old “we're experiencing exceptionally high demand” messages.
The tutorials target the Orin generation of NVIDIA Jetson development kits, so using one will make for smooth sailing. You might be able to get other Jetson boards to work as well, but you may run into a few bumps along the way. At present, there are three tutorials available: text generation, image generation, and knowledge distillation. Text and image generation are fairly self-explanatory: these are the guides that will help you create your own chatbot or text-to-image pipeline. Knowledge distillation is a more general technique that compresses the knowledge contained in a large “teacher” model into a smaller “student” model better suited for running on edge hardware.
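To make the idea of knowledge distillation concrete, here is a minimal, framework-free sketch of the temperature-scaled soft-target loss that distillation typically minimizes. The function names and example logits are ours, not from NVIDIA's tutorial; real pipelines compute this with a framework like PyTorch over full training batches and backpropagate through the student.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax: a higher temperature softens the
    distribution, exposing the teacher's 'dark knowledge' about
    which wrong answers are almost right."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between teacher and student soft targets.
    Minimizing this w.r.t. the student's parameters pushes the
    student's output distribution toward the teacher's."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy example: the student roughly tracks the teacher, so the loss is
# small but nonzero; training would drive it further toward zero.
teacher = [3.0, 1.0, 0.2]
student = [2.5, 1.2, 0.3]
loss = distillation_loss(teacher, student)
```

In practice this soft-target term is usually combined with the ordinary cross-entropy loss on the true labels, weighted by a mixing coefficient.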
Even with a step-by-step tutorial, running a cutting-edge generative AI model on local hardware may sound like an exercise in frustration. Fear not: NVIDIA has already done most of the work for us by packaging up the software in Docker containers. As a result, you only need to run a few commands to download the container and model (such as LLaMA), and perhaps tune the operating environment. The model will then be available for local use, and there is even a slick web-based user interface for both the chatbot and the image generator.
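As a sketch of what those "few commands" look like, the tutorials are built around the dusty-nv/jetson-containers project. Container names and helper scripts change over time, so treat the following as illustrative and check the current tutorial before running anything:

```shell
# Fetch the container tooling used by the Jetson tutorials
git clone https://github.com/dusty-nv/jetson-containers
bash jetson-containers/install.sh

# Pull and launch the text-generation container; the autotag helper
# selects an image matching the JetPack version on the board. Model
# weights (e.g. a LLaMA variant) are then downloaded through the web UI.
jetson-containers run $(autotag text-generation-webui)
```

Once the container is up, the web interface is served on a local port and can be opened from any browser on the same network.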