In this tutorial we implement the ImageNet Classification with Deep Convolutional Neural Networks paper (http://www.cs.toronto.edu/~hinton/absps/imagenet.pdf) on Erle-Brain 2 using DeepBeliefSDK, a large, pre-trained deep convolutional neural network that can classify up to 1000 different classes. The network has 60 million parameters and 650,000 neurons, and consists of five convolutional layers and three fully connected layers with a final 1000-way softmax.
This implementation achieved an error rate of 15.3%. Refer to the paper for more details.
The tutorial assumes that you have access to Erle-Brain 2 (camera included) flashed with one of our latest Debian OS releases. If you experience any difficulties, feel free to jump into our forum and ask around.
```bash
# The code assumes you've got an Erle-Brain 2 with an included camera
su; cd /root # get root privileges, password can be found in Erle-
git clone http://github.com/erlerobot/deep-brain-2
cd deep-brain-2
sudo apt-get install -y mercurial
hg clone https://bitbucket.org/eigen/eigen
ln -s /root/deep-brain-2/eigen /root/deep-brain-2/DeepBeliefSDK/eigen
cd /root/deep-brain-2/DeepBeliefSDK/source
make clean
make GEMM=eigen TARGET=pi2
```
There are three ways of running the DeepBeliefSDK:
- Using image files
- Using an image URL
- Using the embedded camera
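For the first two modes, `jpcnn` can be invoked directly on a local file, and a URL can be handled by downloading the image first. The wrapper functions below are a minimal sketch and not part of the SDK; the `jpcnn` flags and paths are the ones used by the camera script later in this tutorial, and the commented usage line assumes the sample image shipped with DeepBeliefSDK.

```shell
# Hypothetical helper functions (not part of the SDK); assumes jpcnn was
# built as described above and that we run from /root/deep-brain-2.
JPCNN=${JPCNN:-DeepBeliefSDK/source/jpcnn}
NETWORK=${NETWORK:-DeepBeliefSDK/networks/jetpac.ntwk}

classify_file() {
    # Run the network on a local image and keep the three most probable labels
    "$JPCNN" -i "$1" -n "$NETWORK" -t -m s -d | grep '^0\.' | sort -u | tail -3
}

classify_url() {
    # Download the image to a temporary file, then classify it
    tmp=$(mktemp /tmp/jpcnn-XXXXXX.jpg)
    curl -s -o "$tmp" "$1"
    classify_file "$tmp"
    rm -f "$tmp"
}

# Example usage (sample image assumed to ship with the SDK sources):
# classify_file DeepBeliefSDK/source/data/dog.jpg
```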
Let's analyze the code of the last one:
```bash
#!/bin/bash
# This script uses the ImageNet Classification with Deep Convolutional
# Neural Networks (http://www.cs.toronto.edu/~hinton/absps/imagenet.pdf) implemented and trained at
# https://github.com/jetpacapp/DeepBeliefSDK to classify objects using a multilayer deep neural network.
# Input images are fetched from the camera included in Erle-Brain 2.
# Launch the script from: /root/deep-brain-2

export LD_LIBRARY_PATH="/root/deep-brain-2/DeepBeliefSDK/source":$LD_LIBRARY_PATH

#FLAGS="-w 640 -h 480 -n -q 5 "
#FLAGS+=" --nopreview "
sudo raspistill $FLAGS -o cam.jpg
DeepBeliefSDK/source/jpcnn -i cam.jpg -n DeepBeliefSDK/networks/jetpac.ntwk -t -m s -d | grep ^0\. | sort -u | tail -3
sudo rm cam.jpg
```
The script fetches an image from the Erle-Brain 2 camera, feeds its pixel values into the network's input layer and triggers forward propagation until the values reach the output layer, where they are read out as probabilities. The three most probable classes are then displayed, like the following:
```
Classification took 3719 milliseconds
0.201121 Egyptian cat
0.213839 lynx
0.305880 tabby
```
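The three probability lines above are what survives the `grep | sort | tail` pipeline at the end of the script. The snippet below replays that pipeline on canned jpcnn-style text (the sample is assumed, matching the run above, with one extra made-up low-probability line added so the filtering is visible):

```shell
# Canned output in the format jpcnn prints: a timing line, then
# "probability label" lines.
sample='Classification took 3719 milliseconds
0.004210 Persian cat
0.201121 Egyptian cat
0.213839 lynx
0.305880 tabby'

# grep keeps only the probability lines, sort -u orders them ascending
# (lexicographic order matches numeric order here, since every line starts
# with "0."), and tail -3 keeps the three most probable classes.
printf '%s\n' "$sample" | grep '^0\.' | sort -u | tail -3
# prints:
# 0.201121 Egyptian cat
# 0.213839 lynx
# 0.305880 tabby
```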
The classification takes about 4 seconds using the computation available on Erle-Brain 2 (Raspberry Pi 2-based), with its cores running at 900 MHz.
Let's look at some examples:
Let's start with the bad classifications. This is clearly not a bell pepper, and the Granny Smith classification, which says the object is an apple, doesn't carry as much weight as the pepper one. The classifier might have needed more training data on red apples.
Here we can see the algorithm doing a much better job but still not quite right.
This one is actually accurate!
Give it a try and let us know how it does. The code is available at https://github.com/erlerobot/deep-brain-2.