Welcome! In this guide, we'll walk you through setting up a powerful development environment for our Ryzen AI-powered recycling classification project.
Step 1: Prepare Your System
Enable AMD IPU/NPU
Check IPU/NPU Status:
- Open Device Manager from Windows Search.
- Expand System Devices.
- Look for AMD IPU Device.
Enable IPU/NPU in BIOS:
- Search for Advanced Startup and select Recovery Options.
- Click Restart Now.
- Navigate to Troubleshoot > Advanced options > UEFI Firmware Settings > Restart.
- Go to Advanced > CPU Configuration.
- Set IPU Control to Enabled.
- Save and exit BIOS.
Install NPU Driver:
- Download the NPU driver here.
- Extract the zip file.
Open Command Prompt in admin mode and run:
.\amd_install_kipudrv.bat
- Verify the installation in Device Manager under System Devices.
Install the Prerequisites:
- Visual Studio 2019: Download and install from the official website.
- CMake (>= 3.26): Download from the CMake website.
- Python (>= 3.9): Download from the Python website.
Anaconda/Miniconda:
- Download from the Anaconda website.
Add the following to the PATH variable:
path\to\anaconda3\
path\to\anaconda3\Scripts\
path\to\anaconda3\Lib\bin\
Step 2: Install Ryzen AI Software
Download and Extract Ryzen AI SW Package:
- Download from this link.
- Extract the package.
Install Ryzen AI Software:
- Open Command Prompt in admin mode.
- Navigate to the extracted folder.
Run:
.\install.bat -env <env name>
Activate Conda Environment:
conda activate <env name>
Run Quick Test:
Navigate to the quick test directory and run quicktest.py:
cd ryzen-ai-sw-1.1\quicktest
python quicktest.py
You should see:
[Vitis AI EP] No. of Operators : CPU 2 IPU 398 99.50%
[Vitis AI EP] No. of Subgraphs : CPU 1 IPU 1 Actually running on IPU 1
...
Test Passed
...
Following these steps, you will have the AMD Ryzen AI engine set up on your Minisforum UM790 Pro, ready for your object detection projects. For more details, visit the official AMD Ryzen AI documentation.
Technical Deep Dive: Litter Detection AI with Faster R-CNN and RyzenAI
This article provides a detailed explanation of the litter detection AI project, focusing on the model architecture, key code components, and the integration with RyzenAI.
Dataset
The TACO (Trash Annotations in Context) dataset is a collection of images annotated with various types of waste items. It is designed to aid in the development of models for waste detection, which is crucial for recycling and environmental cleanup, and it comprises over 60 classes. Since this is an object detection task, we need the annotations as well as the images.
How to Download
Visit the GitHub Repository: Go to TACO Dataset GitHub.
Clone the Repository: Open a terminal and run:
git clone https://github.com/pedropro/TACO.git
Navigate to the Directory:
cd TACO
Download the Data: Run the script to download images and annotations:
python download.py
Verify the Download: Check that the data directory contains images and annotations.json.
Mask R-CNN is an extension of Faster R-CNN that adds a branch for predicting segmentation masks on each Region of Interest (RoI), in parallel with the existing branch for classification and bounding box regression. We'll use the maskrcnn_resnet50_fpn model from torchvision and fine-tune it on the TACO dataset.
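A minimal sketch of how the model can be constructed with torchvision (the predictor-head replacement follows the standard torchvision fine-tuning recipe; the function name is illustrative, and weights="DEFAULT" assumes torchvision >= 0.13):

import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

def get_model(num_classes):
    # Mask R-CNN with a ResNet-50 FPN backbone pre-trained on COCO
    model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
    # Replace the box predictor head with one sized for the TACO classes (+1 for background)
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
    # Replace the mask predictor head as well
    in_features_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
    model.roi_heads.mask_predictor = MaskRCNNPredictor(in_features_mask, 256, num_classes)
    return model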
We start by importing the necessary libraries and setting up an argument parser to configure training and testing parameters such as epochs, dataset path, batch size, learning rate, and data split. To run the Python file, use:
!python recycling_resnet.py -train --num_epochs 10 --dataset_path /path/to/dataset
Custom Dataset Class
We create a custom dataset class for TACO using torch.utils.data.Dataset. This class handles loading images and their COCO-formatted annotations, including bounding boxes, labels, and masks.
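A sketch of what such a dataset class might look like (class and field names are illustrative; it assumes pycocotools is installed and that annotations.json follows the COCO format):

import os
import numpy as np
import torch
from PIL import Image
from pycocotools.coco import COCO
from torch.utils.data import Dataset

class TACODataset(Dataset):
    """Loads TACO images with COCO-style boxes, labels, and masks."""
    def __init__(self, root, annotation_file, transforms=None):
        self.root = root
        self.coco = COCO(annotation_file)
        self.ids = sorted(self.coco.imgs.keys())
        self.transforms = transforms

    def __len__(self):
        return len(self.ids)

    def __getitem__(self, idx):
        img_id = self.ids[idx]
        info = self.coco.loadImgs(img_id)[0]
        image = Image.open(os.path.join(self.root, info["file_name"])).convert("RGB")

        anns = self.coco.loadAnns(self.coco.getAnnIds(imgIds=img_id))
        boxes, labels, masks = [], [], []
        for ann in anns:
            x, y, w, h = ann["bbox"]                # COCO boxes are [x, y, width, height]
            boxes.append([x, y, x + w, y + h])      # convert to [x1, y1, x2, y2]
            labels.append(ann["category_id"])
            masks.append(self.coco.annToMask(ann))  # rasterize the segmentation polygon

        target = {
            "boxes": torch.as_tensor(boxes, dtype=torch.float32),
            "labels": torch.as_tensor(labels, dtype=torch.int64),
            "masks": torch.as_tensor(np.array(masks), dtype=torch.uint8),
            "image_id": torch.tensor([img_id]),
        }
        if self.transforms:
            image, target = self.transforms(image, target)
        return image, target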
Data Transformations
We define transformation classes for preprocessing images and annotations. These include converting images to tensors, resizing them, and applying random horizontal flips for data augmentation. These transformations are combined sequentially using a custom Compose class.
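A condensed sketch of these transform classes (resizing is omitted for brevity; names are illustrative, and each transform operates on an (image, target) pair so that boxes and masks stay in sync with the image):

import torch
import torchvision.transforms.functional as F

class Compose:
    """Applies a list of paired transforms to (image, target)."""
    def __init__(self, transforms):
        self.transforms = transforms
    def __call__(self, image, target):
        for t in self.transforms:
            image, target = t(image, target)
        return image, target

class ToTensor:
    def __call__(self, image, target):
        return F.to_tensor(image), target

class RandomHorizontalFlip:
    def __init__(self, prob=0.5):
        self.prob = prob
    def __call__(self, image, target):
        if torch.rand(1) < self.prob:
            image = image.flip(-1)
            width = image.shape[-1]
            boxes = target["boxes"]
            boxes[:, [0, 2]] = width - boxes[:, [2, 0]]  # mirror x coordinates
            target["boxes"] = boxes
            target["masks"] = target["masks"].flip(-1)
        return image, target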
The training function initializes the model's weights, sets the model to training mode, and updates the model's parameters using an Adam optimizer. A learning rate scheduler adjusts the learning rate based on validation loss. During each epoch, the model processes training data, computes losses, and performs backpropagation. Validation loss is computed to monitor progress, and checkpoints are saved after each epoch.
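A simplified sketch of such a training loop (function and checkpoint names are illustrative; it relies on torchvision detection models returning a loss dictionary when called in training mode):

import torch

def train(model, train_loader, val_loader, num_epochs, lr, device):
    model.to(device)
    optimizer = torch.optim.Adam([p for p in model.parameters() if p.requires_grad], lr=lr)
    # Reduce the learning rate when validation loss plateaus
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="min", patience=2)

    for epoch in range(num_epochs):
        model.train()
        for images, targets in train_loader:
            images = [img.to(device) for img in images]
            targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
            loss_dict = model(images, targets)   # classification, box, mask, RPN losses
            loss = sum(loss_dict.values())
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

        # Keep the model in training mode so it returns losses for validation batches
        val_loss = 0.0
        with torch.no_grad():
            for images, targets in val_loader:
                images = [img.to(device) for img in images]
                targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
                val_loss += sum(model(images, targets).values()).item()
        scheduler.step(val_loss)
        torch.save(model.state_dict(), f"checkpoint_epoch_{epoch}.pth")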
Testing and Visualization
For testing, the trained model is loaded and set to evaluation mode. It processes test data and visualizes predictions by overlaying detected bounding boxes and segmentation masks on the input images using matplotlib.
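A small illustrative helper for overlaying predictions with matplotlib (the function name and score threshold are assumptions):

import matplotlib.pyplot as plt
import matplotlib.patches as patches

def show_prediction(image, prediction, score_threshold=0.5):
    # image: CxHxW tensor; prediction: one dict from the model in eval mode
    fig, ax = plt.subplots(1)
    ax.imshow(image.permute(1, 2, 0).cpu())
    for box, score, mask in zip(prediction["boxes"], prediction["scores"], prediction["masks"]):
        if score < score_threshold:
            continue
        x1, y1, x2, y2 = box.detach().cpu().numpy()
        ax.add_patch(patches.Rectangle((x1, y1), x2 - x1, y2 - y1, fill=False, edgecolor="red"))
        ax.imshow(mask[0].cpu() > 0.5, alpha=0.3)  # overlay the binary mask
    plt.show()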
Conclusion
Unfortunately, I kept getting RuntimeError: The size of tensor a (4) must match the size of tensor b (0) at non-singleton dimension 1 errors. I tried to fix this many times without success, so I switched to a different approach using a YOLO model.
Part 2: YOLO Model
I tried a different approach using a YOLO model. This code sets up and trains a YOLOv8 model for object detection on the custom-labelled TACO dataset. The YOLOv8 package is installed and the necessary environment checks are performed. The Roboflow API is used to download the TACO dataset, which includes preprocessing and augmentation steps; the data is split into training, validation, and testing sets, with specific augmentation parameters applied. The script creates a directory for datasets, installs the Roboflow library, and uses an API key to access and download the dataset, and the data.yaml file is displayed to show the classes and dataset file paths. Training is run on the downloaded dataset with a 640x640 image size for 30 epochs. After training, results such as the confusion matrix, training curves, and validation images are visualized. Finally, the best-trained model is validated, predictions are made on test images, and the model is exported to ONNX format for inference in a backend environment like Node.js.
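For reference, a minimal sketch of what a script like YOLO_train.py might do with the ultralytics and roboflow packages (the API key, workspace, project, and file paths are placeholders):

from roboflow import Roboflow
from ultralytics import YOLO

# Download the labelled TACO dataset from Roboflow (workspace/project names are placeholders)
rf = Roboflow(api_key="YOUR_API_KEY")
project = rf.workspace("your-workspace").project("taco-dataset")
dataset = project.version(1).download("yolov8")

# Train YOLOv8 on the downloaded dataset: 640x640 images, 30 epochs
model = YOLO("yolov8n.pt")
model.train(data=f"{dataset.location}/data.yaml", imgsz=640, epochs=30)

# Validate the best checkpoint and export it to ONNX for backend inference
best = YOLO("runs/detect/train/weights/best.pt")
best.val()
best.export(format="onnx")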
To run the training script, use the following command:
python YOLO_train.py
Now I can move on to the next phase: quantization.
First, ensure you have all the required libraries installed. These include argparse, torch, numpy, onnx, onnxruntime, vai_q_onnx, torchvision, and cv2. Define the paths for your script directory, models directory, and ONNX model path.
pip install argparse torch numpy onnx onnxruntime vai_q_onnx torchvision opencv-python
To run the quantization script, use the command line:
python quantize_yolov8.py --data_dir path/to/your/dataset --model your_model_name
Replace path/to/your/dataset with the actual path to your dataset and your_model_name with the name you want to give to your quantized model.
The quantization process begins by defining a class RecyclingDataset that inherits from Dataset. This class handles loading images and corresponding labels from the dataset directory and preprocesses the images using transformations like resizing and normalization. Labels, read from text files, contain class IDs and bounding box coordinates.

Next, the prepare_dataset function is defined to load and preprocess the dataset, creating a DataLoader object that provides an iterable over the dataset. This function accepts the dataset directory, batch size, and a flag indicating whether the dataset is for quantization, using a different subset of data for calibration if quantization is enabled.

A class YOLOCalibrationDataReader is then created, extending CalibrationDataReader to iterate over the DataLoader and provide batches of images for model calibration during quantization.
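A sketch of how the calibration data reader can wrap the DataLoader (the ONNX input name is an assumption; check it against your exported model):

import numpy as np
from onnxruntime.quantization import CalibrationDataReader

class YOLOCalibrationDataReader(CalibrationDataReader):
    """Feeds preprocessed calibration batches to the quantizer."""
    def __init__(self, dataloader, input_name="images"):
        self.iterator = iter(dataloader)
        self.input_name = input_name  # assumed input name of the exported YOLOv8 ONNX model

    def get_next(self):
        # Return one {input_name: ndarray} dict per batch, or None when exhausted
        try:
            images, _labels = next(self.iterator)
            return {self.input_name: images.numpy().astype(np.float32)}
        except StopIteration:
            return None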
The quantize_model function performs model quantization by loading the ONNX model, checking its validity, and using the vai_q_onnx library for quantization. It takes the model name and the DataLoader for calibration data, and saves the quantized model to a specified output path. To use the quantized model, a load_quantized_model function initializes an inference session with onnxruntime for efficient CPU inference.

An argument parser function get_args is created using argparse to handle command-line arguments such as the model name and dataset directory. Finally, the main function coordinates the entire process by parsing command-line arguments, preparing the dataset, and performing model quantization, while handling any exceptions that occur and providing detailed error messages for debugging.
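A condensed sketch of these two functions, with quantization settings mirroring the configuration printed in the log below (treat it as illustrative rather than the exact script):

import onnxruntime as ort
import vai_q_onnx

def quantize_model(model_input, model_output, data_reader):
    # QDQ format, UInt8 activations / Int8 weights, MinMSE power-of-two calibration,
    # IPU-friendly CNN settings with symmetric activations
    vai_q_onnx.quantize_static(
        model_input,
        model_output,
        data_reader,
        quant_format=vai_q_onnx.QuantFormat.QDQ,
        calibrate_method=vai_q_onnx.PowerOfTwoMethod.MinMSE,
        activation_type=vai_q_onnx.QuantType.QUInt8,
        weight_type=vai_q_onnx.QuantType.QInt8,
        enable_ipu_cnn=True,
        extra_options={"ActivationSymmetric": True},
    )

def load_quantized_model(model_path):
    # CPU inference session for the quantized ONNX model
    return ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])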
It does work, but it uses a lot of RAM and eventually crashes. Still, it produces the following output successfully:
Quantizing model...
Quantizing yolo...
[VAI_Q_ONNX_INFO]: Time information:
2024-08-01 08:49:52.030188
[VAI_Q_ONNX_INFO]: OS and CPU information:
system --- Windows
node --- Mert
release --- 10
version --- 10.0.22635
machine --- AMD64
processor --- AMD64 Family 25 Model 116 Stepping 1, AuthenticAMD
[VAI_Q_ONNX_INFO]: Tools version information:
python --- 3.9.18
onnx --- 1.16.1
onnxruntime --- 1.15.1
vai_q_onnx --- 1.16.0+69bc4f2
[VAI_Q_ONNX_INFO]: Quantized Configuration information:
model_input --- C:\Users\merte\OneDrive\Desktop\Resnet-v2\models\best.onnx
model_output --- C:\Users\merte\OneDrive\Desktop\Resnet-v2\models\yolo_recycling_detection.qdq.U8S8.onnx
calibration_data_reader --- <__main__.YOLOCalibrationDataReader object at 0x0000015E42792700>
quant_format --- QDQ
input_nodes --- []
output_nodes --- []
op_types_to_quantize --- []
random_data_reader_input_shape --- []
per_channel --- False
reduce_range --- False
activation_type --- QUInt8
weight_type --- QInt8
nodes_to_quantize --- []
nodes_to_exclude --- []
optimize_model --- True
use_external_data_format --- False
calibrate_method --- PowerOfTwoMethod.MinMSE
execution_providers --- ['CPUExecutionProvider']
enable_ipu_cnn --- True
enable_ipu_transformer --- False
debug_mode --- False
convert_fp16_to_fp32 --- False
convert_nchw_to_nhwc --- False
include_cle --- False
include_fast_ft --- False
extra_options --- {'ActivationSymmetric': True}
INFO:vai_q_onnx.quant_utils:The input ONNX model C:\Users\merte\OneDrive\Desktop\Resnet-v2\models\best.onnx can create InferenceSession successfully
INFO:vai_q_onnx.optimize:Found Split node /model.8/Split. Replacing with Slice.
INFO:vai_q_onnx.optimize:Found Split node /model.12/Split. Replacing with Slice.
INFO:vai_q_onnx.optimize:Found Split node /model.15/Split. Replacing with Slice.
INFO:vai_q_onnx.optimize:Found Split node /model.18/Split. Replacing with Slice.
INFO:vai_q_onnx.optimize:Found Split node /model.21/Split. Replacing with Slice.
INFO:vai_q_onnx.optimize:Found Split node /model.22/Split. Replacing with Slice.
INFO:vai_q_onnx.optimize:Found Split node /model.22/Split_1. Replacing with Slice.
INFO:vai_q_onnx.quantize:Start calibration...
INFO:vai_q_onnx.quantize:Start collecting data, runtime depends on your model size and the number of calibration dataset.
Finally, for the live camera demo, use the following command (if you don't have a webcam, you can use DroidCam):
python live_camera_recycling.py
Future Improvements
1. Implement more advanced data augmentation techniques (rotations, color jittering, etc.).
2. Experiment with different backbone networks (e.g., EfficientNet, ResNeXt).
3. Implement feature pyramid network (FPN) for better handling of objects at different scales.
4. Explore other object detection architectures like YOLO or SSD for comparison.





