Published December 21, 2022 © GPL3+

AR Pong Game with Object Detection

The ML model uses Edge Impulse's FOMO (Faster Objects, More Objects) to detect which players are playing and their coordinates.

Intermediate | Full instructions provided | 8 hours | 533

Things used in this project

Hardware components

Raspberry Pi 4 Model B

Webcam, Logitech® HD Pro

Logitech C270 or other USB Webcam

Animal plush toys

Or whatever objects to detect

Camera tripod

Software apps and online services

Edge Impulse Studio

Raspberry Pi Raspbian

I use Debian version 10 (buster)

Your favorite IDE

e.g. Sublime Text

Terminal App

Story

This project uses Edge Impulse’s FOMO (Faster Objects, More Objects) object detection algorithm. The object detection model is built by selecting the grayscale Image block and the FOMO Object Detection block with 2 or more output classes (e.g. armadillo and bee). The project takes advantage of FOMO’s speed (about 2 ms per inference in this project) to detect the coordinates of multiple objects on a Linux-based single-board computer, here a Raspberry Pi with a USB webcam.

The ML model is also embedded into our Python Pong game so that it can detect which players are present and their real-time positions. We use these changing coordinates as the y coordinate of the Pong bat on the 8x8 RGB LED matrix. In our testing there are still occasional minor errors in the detected object type, but I am confident that with more varied training data this idea can be developed further for more complex AR game and animation applications.
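The mapping from a detected object's position to the bat can be sketched as a small pure function. This is a minimal sketch built on the expression used in the game code in the Code section (the box's y value divided by 8, minus 1, clamped to rows 1 to 6). The function name `bat_row` is hypothetical, and the sketch assumes a 96 x 96 model input so that y values run from 0 to 95.

```python
def bat_row(y_pixel: int) -> int:
    """Map a bounding-box y value (model-input pixels) to the centre row of a 3-pixel bat."""
    # Integer-divide by 8 and shift by one, as the game code does.
    row = y_pixel // 8 - 1
    # The bat covers row-1..row+1, so the centre must stay within 1..6
    # to keep all three pixels on the 8-row LED matrix.
    return min(max(row, 1), 6)


for y in (0, 24, 40, 88):
    print(y, "->", bat_row(y))
```

The clamp is what keeps the bat fully visible when the toy is near the top or bottom edge of the camera frame.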

This project consists of 5 steps:

1. Preparation

2. Data acquisition and labelling

3. Training and building model using FOMO Object Detection

4. Deploy and test object detection on the Raspberry Pi

5. Build the interactive Pong game in Python

Step 1: Preparation

Prepare your Raspberry Pi with an up-to-date Raspberry Pi OS (Buster or Bullseye). Then open your Terminal app and ssh into your Pi. Install all dependencies and the Edge Impulse for Linux CLI by following the Edge Impulse for Linux installation guide.

Take pictures of the objects from above (e.g. armadillo, bee, turtle, duck or other plush toys) in different positions, against different backgrounds, and at varying angles and lighting conditions, so that the model works under different conditions (this helps prevent overfitting). In this project we use a smartphone camera to capture the images because it is easy to use.

Note: Try to keep the objects a similar size across the pictures. A significant difference in object size will confuse the FOMO algorithm.

This project uses Edge Impulse as the machine learning platform, so you need to log in (create an account first if you do not have one). Go to Edge Impulse and create a new project.

Step 2: Data acquisition and labelling

Choose the Images project option, then Classify Multiple Objects.

In Dashboard > Project Info, choose Bounding Boxes as the labelling method and Raspberry Pi 4 for latency calculations.

Then in Data acquisition, click the Upload Data tab, choose your files, select the automatic split, then click Begin upload.

Now it’s time for labelling. Click the Labelling queue tab, drag a box around an object, label it (arma or bee) and click Save. Repeat until all images are labelled. Make sure the ratio between training and test data is around 80/20.
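The automatic split in the uploader takes care of the 80/20 ratio for you. If you would rather sort the photos into folders yourself before uploading, the split can be sketched as below. The function name `split_dataset`, the file names and the fixed seed are assumptions for illustration, not part of Edge Impulse.

```python
import random


def split_dataset(files, test_ratio=0.2, seed=42):
    """Shuffle file names reproducibly and return (training, testing) lists."""
    shuffled = sorted(files)               # sort first so the result does not depend on input order
    random.Random(seed).shuffle(shuffled)  # private generator, leaves the global random state alone
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


photos = [f"toy_{i:02d}.jpg" for i in range(10)]  # hypothetical file names
train, test = split_dataset(photos)
print(len(train), "training /", len(test), "testing")
```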

Step 3: Training and building model using FOMO Object Detection

Once your dataset is ready, go to Create Impulse and set the image width and height to 96 x 96 (this helps keep the model's memory footprint small). Then choose Fit shortest axis as the resize mode, and add Image as the processing block and Object Detection as the learning block.
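To get a feel for what the resize setting does to a webcam frame, here is a rough arithmetic sketch. It assumes that Fit shortest axis scales the frame until its shorter side equals the target size and then crops the longer side around the centre; the exact behaviour inside Edge Impulse Studio may differ, and the function name `fit_shortest_axis` is hypothetical.

```python
def fit_shortest_axis(src_w, src_h, target=96):
    """Return (resized_w, resized_h, crop_x, crop_y) for a square target size."""
    # Scale so that the shorter side becomes exactly `target` pixels.
    scale = target / min(src_w, src_h)
    resized_w = round(src_w * scale)
    resized_h = round(src_h * scale)
    # Centre-crop whatever is left over on the longer side.
    crop_x = (resized_w - target) // 2
    crop_y = (resized_h - target) // 2
    return resized_w, resized_h, crop_x, crop_y


# A 640 x 480 webcam frame as an example.
print(fit_shortest_axis(640, 480))
```

Under this assumption the left and right edges of a landscape frame are discarded, so keep the toys inside the central square of the camera view.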

Go to the Image parameters section, select Grayscale as the color depth, then press Save parameters.

Finally, click the Generate features button. You should get a result like the one below.

Then navigate to the Object Detection section and leave the Neural Network training settings as they are; in our case the defaults are well balanced. For the pre-trained model, choose FOMO (MobileNet V2 0.35). Train the model by pressing Start training and watch the progress.

If everything is OK, you should see something like this:

After that we can test the model. Go to the Model testing section and click Classify all. If the accuracy is above 80%, we can move on to the next step: deployment.

(If the accuracy is not as good as expected, start again with better-quality photos and correct labels, or try changing the learning rate setting.)

Step 4: Deploy the trained model and test it on the Raspberry Pi

Now we can switch to the Raspberry Pi. Make sure your Pi has all dependencies and the Edge Impulse for Linux CLI installed (as in Step 1), and connect your USB webcam.

Connect to the Pi via ssh in the terminal and type:

$ edge-impulse-linux-runner

# add --clean (if you have more than one project)

During this process you will be asked to log in to your Edge Impulse account.

This will automatically download and compile your model for your Pi and start classifying. The results are shown in the Terminal window.

You can also view the video stream in your browser. Open: http://<your Raspberry Pi IP address>:4912

Then you can see how this live classification works:

Now the objects (arma and bee) are identified with their x, y coordinates in real time, with a very short time per inference (around 2 ms).
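Each classification result can be read as a list of labelled boxes. The sketch below works on a hand-written sample shaped like the result the game code reads (`res["result"]["bounding_boxes"]` with `label` and `y`). The `value`, `x`, `width` and `height` keys follow the Edge Impulse Python examples as I understand them, so treat them as an assumption; the numbers are made up and the helper name `centroids` is hypothetical.

```python
# Hand-written sample result; the values are illustrative only.
sample = {
    "result": {
        "bounding_boxes": [
            {"label": "arma", "value": 0.91, "x": 16, "y": 40, "width": 8, "height": 8},
            {"label": "bee", "value": 0.42, "x": 72, "y": 24, "width": 8, "height": 8},
        ]
    }
}


def centroids(res, min_confidence=0.5):
    """Return (label, centre_x, centre_y) for every box at or above the confidence threshold."""
    found = []
    for bb in res["result"].get("bounding_boxes", []):
        if bb["value"] < min_confidence:
            continue  # skip weak detections
        cx = bb["x"] + bb["width"] / 2
        cy = bb["y"] + bb["height"] / 2
        found.append((bb["label"], cx, cy))
    return found


print(centroids(sample))
```

Filtering on the confidence value is one possible way to reduce the occasional wrong object type; the game code in this project does not do this.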

Up to this step, we have collected data, trained an object detection model on the Edge Impulse platform, and run that model locally on our Raspberry Pi board, so the deployment was successful.

Step 5: Build Python pong game

Final step...

We started from a simple classic Pong game and modified it. Combining the Pong Python code, the Sense HAT library and the classify sample code from Edge Impulse gives a simple but interesting AR game.
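The Pong rules themselves do not depend on the Sense HAT, so they can be tried out on any computer. The following is a minimal sketch of one ball step for the single-player game, mirroring the rules in `ball_play()` from the Code section but written as a pure function. The name `step_ball` and the tuple-based state are my own choices for illustration.

```python
def step_ball(position, velocity, bat_row):
    """Advance the ball one cell on the 8x8 grid; return (position, velocity, alive)."""
    x = position[0] + velocity[0]
    y = position[1] + velocity[1]
    vx, vy = velocity
    if y == 0 or y == 7:
        vy = -vy  # bounce off the top or bottom edge
    if x == 7:
        vx = -vx  # bounce off the far wall
    if x == 1 and bat_row - 1 <= y <= bat_row + 1:
        vx = -vx  # the 3-pixel bat sits in column 0, so it hits in column 1
    alive = x != 0  # reaching column 0 means the bat missed
    return (x, y), (vx, vy), alive


# Start from the same ball state as the game code, with the bat fixed at row 4.
pos, vel, alive = (6, 3), (-1, -1), True
steps = 0
while alive and steps < 20:
    pos, vel, alive = step_ball(pos, vel, bat_row=4)
    steps += 1
print("ball at", pos, "after", steps, "steps, alive:", alive)
```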

For more detail, see our Python files <pong_1_objects.py> and <pong_2_objects.py> in the Code section below.

Because we use Python, we need to install the Edge Impulse Linux SDK for Python 3 and clone the repository with the Edge Impulse examples. Follow the steps in the Edge Impulse Linux Python SDK documentation.

You also need to download the trained model file so that it is accessible to the program we are running.

Type this to download it:

$ edge-impulse-linux-runner --download modelfile.eim

Make sure that the programs <pong_1_objects.py> and <pong_2_objects.py> are placed in the correct directory, or just put them in the /home/pi directory.

Now, play the game! Power on the Pi with the camera connected, start the Python code with the .eim model file, then move your toy to slide the bat and bounce the ball on the LED matrix display.

Run the program with the model file using this command:

$ python3 pong_1_objects.py ~/modelfile.eim

And for even more fun, try the 2-player Pong game:

$ python3 pong_2_objects.py ~/modelfile.eim

This Pong game can detect which players are playing.
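In the 2-player version, each label drives its own bat. The idea can be sketched as follows, using the same row formula as the game code. The helper name `update_bats` and the dictionary-based state are assumptions for illustration (the actual program uses two global variables), and the sample boxes are made up.

```python
def update_bats(bounding_boxes, bats):
    """Return a copy of `bats` with the row of every detected player updated."""
    updated = dict(bats)
    for bb in bounding_boxes:
        if bb["label"] in updated:  # ignore labels that are not players
            updated[bb["label"]] = min(max(bb["y"] // 8 - 1, 1), 6)
    return updated


bats = {"bee": 4, "arma": 4}
boxes = [
    {"label": "bee", "y": 16},
    {"label": "arma", "y": 80},
    {"label": "duck", "y": 40},  # not a player in this game
]
print(update_bats(boxes, bats))
```

A player whose toy is not detected in a frame keeps its previous row, which matches the behaviour of the game code.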

Finally, we have successfully implemented an Edge Impulse FOMO object detection model and run an interactive Pong game on the Raspberry Pi. With the speed and accuracy we obtained, we are confident that this project can be developed into more complex AR games or animation applications.

Feel free to leave a comment and thank you!

Code

pong_1_objects.py

#!/usr/bin/env python

#import device_patches       # Device specific patches for Jetson Nano (needs to be before importing cv2)

import cv2
import os
import sys, getopt
import signal
import time
from edge_impulse_linux.image import ImageImpulseRunner
from sense_hat import SenseHat
from time import sleep

sense = SenseHat()

# class and objects to show the players

class Player:
	def __init__(self, colours, picture):
		self.picture = [[colours[ord(id) - 48] for id in row] for row in picture]
		self.height = len(self.picture)
		self.width = len(self.picture[0])
		self.y_move = self.height - 8
		self.x_move = self.width - 8
		self.frames = (self.y_move + self.x_move) * 2
	
	def draw(self, frame):
		frame %= self.frames
		xShift = frame if frame < self.x_move else\
			self.x_move if frame < self.x_move + self.y_move else\
			self.x_move * 2 + self.y_move - frame if frame < self.x_move * 2 + self.y_move else\
			0
		yShift = 0 if frame < self.x_move else\
			frame - self.x_move if frame < self.x_move + self.y_move else\
			self.y_move if frame < self.x_move * 2 + self.y_move else\
			self.x_move * 2 + self.y_move * 2 - frame
		to_set = []
		for i in range(yShift, yShift + 8):
			to_set += self.picture[i][xShift:xShift + 8]
		sense.set_pixels(to_set)
	
	def animate(self, duration):
		frame = 0
		duration += time.time()
		while time.time() < duration:
			self.draw(frame)
			frame += 1
			sleep(min(duration - time.time(), .15))

	def mid(self, x_, y_):
		to_set = []
		for i in range(y_, y_ + 8):
			to_set += self.picture[i][x_:x_ + 8]
		sense.set_pixels(to_set)

bee = Player([
	(200, 200, 0), # 0 = yellow
	(0, 0, 200), # 1 = blue
	(200, 200, 200), # 2 = white
	(0, 0, 0)], # 3 = black
	["11133333111",
	"11322322311",
	"11132232311",
	"11113333311",
	"11133030031",
	"11303030003",
	"33303030303",
	"11303030003",
	"11133030031",
	"11113333311",
	"11111313111"])

arma = Player([
	(0, 0, 200), # 0 = blue
	(200, 0, 0), # 1 = red
	(200, 200, 200), # 2 = white
	(200, 140, 0), # 3 = orange
	(0, 0, 0), # 4 = black
	(60, 40, 20)], # 5 = chocolate
	["55555555535",
	"55555555335",
	"55555553133",
	"55555503113",
	"55555003130",
	"55500033333",
	"50003332233",
	"11333324223",
	"13333332233",
	"55333333333",
	"55554433335"])

runner = None
# if you don't want to see a camera preview, set this to False
show_camera = True
if (sys.platform == 'linux' and not os.environ.get('DISPLAY')):
	show_camera = False

def now():
	return round(time.time() * 1000)

def get_webcams():
	port_ids = []
	for port in range(5):
		print("Looking for a camera in port %s:" %port)
		camera = cv2.VideoCapture(port)
		if camera.isOpened():
			ret = camera.read()[0]
			if ret:
				backendName = camera.getBackendName()
				w = camera.get(3)
				h = camera.get(4)
				print("Camera %s (%s x %s) found in port %s " %(backendName,h,w, port))
				port_ids.append(port)
			camera.release()
	return port_ids

def sigint_handler(sig, frame):
	print('Interrupted')
	if (runner):
		runner.stop()
	sys.exit(0)

signal.signal(signal.SIGINT, sigint_handler)

def help():
	print('python pong_1_objects.py <path_to_model.eim> <Camera port ID, only required when more than 1 camera is present>')

ball_position=[6, 3]
ball_velocity=[-1, -1]
player_position = 4

def draw_bat(col):
	r, g, b = col
	sense.set_pixel(0, player_position-1, r, g, b)
	sense.set_pixel(0, player_position, r, g, b)
	sense.set_pixel(0, player_position+1, r, g, b)

ball_move_delay = 250 # number of milliseconds before the ball moves

def ball_play():
	global next_time_to_move
	if next_time_to_move == 0:
		next_time_to_move = now()
	if next_time_to_move <= now():
		next_time_to_move = next_time_to_move + ball_move_delay
		ball_position[0] += ball_velocity[0]
		ball_position[1] += ball_velocity[1]

		if ball_position[1] == 0 or ball_position[1] == 7:
			ball_velocity[1] = -ball_velocity[1]
		if ball_position[0] == 7:
			ball_velocity[0] = -ball_velocity[0]
		if ball_position[0] == 1 and\
				player_position-1 <= ball_position[1] <= player_position+1:
			ball_velocity[0] = -ball_velocity[0]
		if ball_position[0] == 0:
			return False

	sense.set_pixel(ball_position[0], ball_position[1], 255, 0, 0)
	return True

def main(argv):
	try:
		opts, args = getopt.getopt(argv, "h", ["help"])  # long option names are given without the leading dashes
	except getopt.GetoptError:
		help()
		sys.exit(2)

	for opt, arg in opts:
		if opt in ('-h', '--help'):
			help()
			sys.exit()

	if len(args) == 0:
		help()
		sys.exit(2)

	model = args[0]

	dir_path = os.path.dirname(os.path.realpath(__file__))
	modelfile = os.path.join(dir_path, model)

	print('MODEL: ' + modelfile)

	with ImageImpulseRunner(modelfile) as runner:
		try:
			model_info = runner.init()
			print('Loaded runner for "' + model_info['project']['owner'] + ' / ' + model_info['project']['name'] + '"')
			labels = model_info['model_parameters']['labels']
			if len(args)>= 2:
				videoCaptureDeviceId = int(args[1])
			else:
				port_ids = get_webcams()
				if len(port_ids) == 0:
					raise Exception('Cannot find any webcams')
				if len(args)<= 1 and len(port_ids)> 1:
					raise Exception("Multiple cameras found. Add the camera port ID as a second argument to use to this script")
				videoCaptureDeviceId = int(port_ids[0])

			camera = cv2.VideoCapture(videoCaptureDeviceId)
			ret = camera.read()[0]
			if ret:
				backendName = camera.getBackendName()
				w = camera.get(3)
				h = camera.get(4)
				print("Camera %s (%s x %s) in port %s selected." %(backendName,h,w, videoCaptureDeviceId))
				camera.release()
			else:
				raise Exception("Couldn't initialize selected camera.")

			global next_time_to_move
			next_time_to_move = 0
			player = None

			for res, img in runner.classifier(videoCaptureDeviceId):

				sense.clear(0, 0, 0)

				batCol = (255, 255, 0)
				# optionally colour the bat white if the object is not detected
				# batCol = (255, 255, 255)

				if "bounding_boxes" in res["result"].keys():
					for bb in res["result"]["bounding_boxes"]:
						if player == None:
							player = bb['label']
							if player == 'bee':
								bee.animate(3.6)
							elif player == 'arma':
								arma.animate(3.6)
						if bb['label'] == player:
							global player_position
							player_position = min(max(bb['y'] // 8 - 1, 1), 6)
							batCol = (255, 255, 0)

				if player == None:
					continue

				cont = ball_play()
				if not cont: break
				draw_bat(batCol)

			sense.show_message("Game Over", text_colour=(255, 0, 0))

		finally:
			if (runner):
				runner.stop()

if __name__ == "__main__":
   main(sys.argv[1:])

pong_2_objects.py

#!/usr/bin/env python

#import device_patches       # Device specific patches for Jetson Nano (needs to be before importing cv2)

import cv2
import os
import sys, getopt
import signal
import time
from edge_impulse_linux.image import ImageImpulseRunner
from sense_hat import SenseHat
from time import sleep

sense = SenseHat()

# class and objects to show the players

class Player:
	def __init__(self, colours, picture):
		self.picture = [[colours[ord(id) - 48] for id in row] for row in picture]
		self.height = len(self.picture)
		self.width = len(self.picture[0])
		self.y_move = self.height - 8
		self.x_move = self.width - 8
		self.frames = (self.y_move + self.x_move) * 2
	
	def draw(self, frame):
		frame %= self.frames
		xShift = frame if frame < self.x_move else\
			self.x_move if frame < self.x_move + self.y_move else\
			self.x_move * 2 + self.y_move - frame if frame < self.x_move * 2 + self.y_move else\
			0
		yShift = 0 if frame < self.x_move else\
			frame - self.x_move if frame < self.x_move + self.y_move else\
			self.y_move if frame < self.x_move * 2 + self.y_move else\
			self.x_move * 2 + self.y_move * 2 - frame
		to_set = []
		for i in range(yShift, yShift + 8):
			to_set += self.picture[i][xShift:xShift + 8]
		sense.set_pixels(to_set)
	
	def animate(self, duration):
		frame = 0
		duration += time.time()
		while time.time() < duration:
			self.draw(frame)
			frame += 1
			sleep(min(duration - time.time(), .15))

	def mid(self, x_, y_):
		to_set = []
		for i in range(y_, y_ + 8):
			to_set += self.picture[i][x_:x_ + 8]
		sense.set_pixels(to_set)

bee = Player([
	(200, 200, 0), # 0 = yellow
	(0, 0, 200), # 1 = blue
	(200, 200, 200), # 2 = white
	(0, 0, 0)], # 3 = black
	["11133333111",
	"11322322311",
	"11132232311",
	"11113333311",
	"11133030031",
	"11303030003",
	"33303030303",
	"11303030003",
	"11133030031",
	"11113333311",
	"11111313111"])

arma = Player([
	(0, 0, 200), # 0 = blue
	(200, 0, 0), # 1 = red
	(200, 200, 200), # 2 = white
	(200, 140, 0), # 3 = orange
	(0, 0, 0), # 4 = black
	(60, 40, 20)], # 5 = chocolate
	["55555555535",
	"55555555335",
	"55555553133",
	"55555503113",
	"55555003130",
	"55500033333",
	"50003332233",
	"11333324223",
	"13333332233",
	"55333333333",
	"55554433335"])

runner = None
# if you don't want to see a camera preview, set this to False
show_camera = True
if (sys.platform == 'linux' and not os.environ.get('DISPLAY')):
	show_camera = False

def now():
	return round(time.time() * 1000)

def get_webcams():
	port_ids = []
	for port in range(5):
		print("Looking for a camera in port %s:" %port)
		camera = cv2.VideoCapture(port)
		if camera.isOpened():
			ret = camera.read()[0]
			if ret:
				backendName = camera.getBackendName()
				w = camera.get(3)
				h = camera.get(4)
				print("Camera %s (%s x %s) found in port %s " %(backendName,h,w, port))
				port_ids.append(port)
			camera.release()
	return port_ids

def sigint_handler(sig, frame):
	print('Interrupted')
	if (runner):
		runner.stop()
	sys.exit(0)

signal.signal(signal.SIGINT, sigint_handler)

def help():
	print('python pong_2_objects.py <path_to_model.eim> <Camera port ID, only required when more than 1 camera is present>')

ball_position=[6, 3]
ball_velocity=[-1, -1]
bee_position = 4
arma_position = 4

def draw_bats():
	sense.set_pixel(0, bee_position-1, 255, 255, 0)
	sense.set_pixel(0, bee_position, 255, 255, 0)
	sense.set_pixel(0, bee_position+1, 255, 255, 0)

	sense.set_pixel(7, arma_position-1, 0, 0, 255)
	sense.set_pixel(7, arma_position, 0, 0, 255)
	sense.set_pixel(7, arma_position+1, 0, 0, 255)

ball_move_delay = 300

def ball_play():
	global next_time_to_move
	if next_time_to_move == 0:
		next_time_to_move = now()
	if next_time_to_move <= now():
		next_time_to_move = next_time_to_move + ball_move_delay
		ball_position[0] += ball_velocity[0]
		ball_position[1] += ball_velocity[1]

		if ball_position[1] == 0 or ball_position[1] == 7:
			ball_velocity[1] = -ball_velocity[1]
		if ball_position[0] == 6 and\
				arma_position-1 <= ball_position[1] <= arma_position+1:
			ball_velocity[0] = -ball_velocity[0]
		if ball_position[0] == 1 and\
				bee_position-1 <= ball_position[1] <= bee_position+1:
			ball_velocity[0] = -ball_velocity[0]
		if ball_position[0] == 0:
			return "arma"
		if ball_position[0] == 7:
			return "bee"

	sense.set_pixel(ball_position[0], ball_position[1], 255, 0, 0)
	return "continue"

def main(argv):
	try:
		opts, args = getopt.getopt(argv, "h", ["help"])  # long option names are given without the leading dashes
	except getopt.GetoptError:
		help()
		sys.exit(2)

	for opt, arg in opts:
		if opt in ('-h', '--help'):
			help()
			sys.exit()

	if len(args) == 0:
		help()
		sys.exit(2)

	model = args[0]

	dir_path = os.path.dirname(os.path.realpath(__file__))
	modelfile = os.path.join(dir_path, model)

	print('MODEL: ' + modelfile)

	with ImageImpulseRunner(modelfile) as runner:
		try:
			model_info = runner.init()
			print('Loaded runner for "' + model_info['project']['owner'] + ' / ' + model_info['project']['name'] + '"')
			labels = model_info['model_parameters']['labels']
			if len(args)>= 2:
				videoCaptureDeviceId = int(args[1])
			else:
				port_ids = get_webcams()
				if len(port_ids) == 0:
					raise Exception('Cannot find any webcams')
				if len(args)<= 1 and len(port_ids)> 1:
					raise Exception("Multiple cameras found. Add the camera port ID as a second argument to use to this script")
				videoCaptureDeviceId = int(port_ids[0])

			camera = cv2.VideoCapture(videoCaptureDeviceId)
			ret = camera.read()[0]
			if ret:
				backendName = camera.getBackendName()
				w = camera.get(3)
				h = camera.get(4)
				print("Camera %s (%s x %s) in port %s selected." %(backendName,h,w, videoCaptureDeviceId))
				camera.release()
			else:
				raise Exception("Couldn't initialize selected camera.")

			bee.animate(1.8)
			arma.animate(1.8)
			global next_time_to_move
			next_time_to_move = 0

			status = 0

			for res, img in runner.classifier(videoCaptureDeviceId):

				sense.clear(0, 0, 0)

				if "bounding_boxes" in res["result"].keys():
					for bb in res["result"]["bounding_boxes"]:
						if bb['label'] == 'arma':
							global arma_position
							arma_position = min(max(bb['y'] // 8 - 1, 1), 6)
						elif bb['label'] == 'bee':
							global bee_position
							bee_position = min(max(bb['y'] // 8 - 1, 1), 6)

				status = ball_play()
				if status != "continue": break
				draw_bats()

			if status == "arma":
				arma.animate(3.6)
			elif status == "bee":
				bee.animate(3.6)

			sense.show_message("Game Over", text_colour=(255, 0, 0))

		finally:
			if (runner):
				runner.stop()

if __name__ == "__main__":
   main(sys.argv[1:])

Credits

Jallson Suryo

5 projects • 26 followers

Tech integrator for schools. Also works as a maker and his activities include disassembling, fixing, and making things.

Thanks to Nicholas Patrick.
