In this project I will show how to download and run a pose detection model on a MaaXBoard 8ULP streaming live video. I'll first cover setting up a Python web UI to access the video stream, then deploying the model to the board, and finally rendering the model results to the web server.
This project is a follow-on to the first article, Rapid UI Prototyping with MaaXBoard 8ULP & PySimpleGuiWeb, which shows how to create web UIs on headless systems. We'll use that article as a foundation to start up the web server and incorporate OpenCV for video capture.
This article won't cover machine learning topics in depth. Rather, its intention is to show how you can execute open-source TensorFlow Lite models on the NXP i.MX8ULP platform and leverage the same process for other models.
Prerequisites and References
It's assumed the user has installed an image onto the board's eMMC and has the PySimpleGUIWeb framework installed. If not, there are existing articles that cover this in detail, such as Getting Started with MaaXBoard. Below are also helpful links for getting started with the MaaXBoard 8ULP.
MaaXBoard 8ULP Reference Material:
- MaaXBoard 8ULP Product Page
- MaaXBoard 8ULP HUB Page
- MaaXBoard 8ULP Github
- MaaXBoard 8ULP Linux/Yocto User Manual
PySimpleGui Library References:
Tensorflow Pose Detection Model (Movenet):
Tensorflow Lite with Python:
Other References:
Hardware Setup
To get started, connect USB-C cables to the MaaXBoard 8ULP, provide an Ethernet connection (required), and attach either a MIPI-CSI or a USB web camera. You can use either camera setup for this demo; I will be using the USB web camera.
Once power is supplied to the board, connect to the board with SSH via VSCode. The IP address of the board can be obtained using PuTTY and the debug port. See this article if you run into issues: Rapid UI Prototyping with MaaXBoard 8ULP & PySimpleGuiWeb.
Ethernet is required in this setup as we'll be installing the TensorFlow Lite Python library to run inferences on the TensorFlow Lite pose detection (Movenet) model.
Setup & Install TensorFlow Lite
To quickly execute TensorFlow Lite machine learning models on the MaaXBoard 8ULP, we will need to install the TensorFlow Lite runtime package referenced in this link: Quickstart for Linux-based devices with Python.
TensorFlow Lite is a heavily reduced package compared to full TensorFlow and will allow us to run the pose detection model in this project on the target board without consuming a large amount of memory.
To install TFLite, connect the MaaXBoard 8ULP to the internet via Ethernet, and run the following command via PuTTY or VSCode.
python3 -m pip install tflite-runtime
The version I have installed for this project is tflite-runtime-2.14.0.
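If you want to confirm exactly which version landed on your board, you can query pip for the package details:
python3 -m pip show tflite-runtime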
Download Movenet Pose Detection Model
With TFLite now installed, let's download the pose detection model from TensorFlow Hub (now merged with Kaggle). I will be downloading this model on the host PC and transferring it to the target board.
The model can be found here: Movenet Model. I also provided links in the resources section. In the link you'll see options to download a TensorFlow 2, TFLite, or TF.js model. Select TFLite, and under variation, select singlepose-lightning-tflite-int8 and click the download button.
A description of the model is provided after the variation has been selected. I chose Singlepose/Lightning for this project since I'm only looking to handle pose detection for one person (myself), and Lightning because it's a lower-capacity model.
After downloading and extracting the model, you may notice the model name is 4.tflite or similar. For the sake of the project I'm going to rename the model as shown below to make it clearer which model we've chosen.
lite-model_movenet_singlepose_lightning_tflite_int8_4.tflite
Important: We will have to reference the model path in Python. If you choose not to rename the file, the project code will still work; you will just need to update that section of the code, which I will highlight as we go along.
Build the UI
Now that TFLite is installed and the model has been downloaded, create a directory for this project on the MaaXBoard 8ULP. I've called this directory PostureDetectionDemo.
Create a Python file in this directory called postureDemo.py and copy over the model file into the directory. Note again, if you did not rename the model file, it will just be 4.tflite.
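If you're working from the host PC, the directory creation and model transfer can be done over SSH/SCP. For example (the root user and IP address below are just what I used on my setup; substitute your own):
ssh root@192.168.1.161 "mkdir -p /home/PostureDetectionDemo"
scp lite-model_movenet_singlepose_lightning_tflite_int8_4.tflite root@192.168.1.161:/home/PostureDetectionDemo/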
Open the postureDemo.py file and copy the following code into it.
# [1]
import time
import cv2
import PySimpleGUIWeb as sg
import numpy as np
# Image/Camera Parameters
HEIGHT = 480
WIDTH = 640
FPS = 30
# [2]
def displayBlankImage():
    # Blank white image; uint8 is the depth cv2.imencode expects
    img = np.full((HEIGHT, WIDTH), 255, dtype=np.uint8)
    imgbytes = cv2.imencode(".png", img)[1].tobytes()
    window["-IMAGE1-"].update(data=imgbytes)
# [3]
sg.theme('Black')
frame_layout = [
    [sg.Text("Video Output")],
    [sg.Image(filename='', key='-IMAGE1-')]
]
# Layout of UI
layout = [
    [
        sg.Text('MaaXBoard 8ULP Posture Detect Demo',
                size=(40, 1),
                justification='left',
                font='Helvetica 20')],
    [
        sg.Button('Record', size=(10, 1), font='Helvetica 14', key='-RECORD-'),
        sg.Button('Stop', size=(10, 1), font='Any 14', key='-STOP-'),
        sg.Button('Exit', size=(10, 1), font='Helvetica 14', key='-EXIT-'),
    ],
    [sg.Frame("TF Lite Model", frame_layout, font="Any 14", title_color="blue")],
]
# [4]
window = sg.Window('Demo Application', layout, location=(800, 400), web_port=5555, web_start_browser=False)
recording = False
# [5]
while True:
    event, values = window.read(timeout=500)
    # Browser closed or exit tapped
    if event == "-EXIT-" or event == sg.WIN_CLOSED:
        break
    # Record button tapped
    elif event == "-RECORD-":
        # Setup Camera Parameters
        cap = cv2.VideoCapture(0)
        cap.set(cv2.CAP_PROP_FRAME_WIDTH, WIDTH)
        cap.set(cv2.CAP_PROP_FRAME_HEIGHT, HEIGHT)
        cap.set(cv2.CAP_PROP_FPS, FPS)
        recording = True
    # Stop button tapped
    elif event == "-STOP-":
        recording = False
        # Display a dummy blank image
        displayBlankImage()
        cap.release()
    # Live recording/streaming
    if recording:
        try:
            ret, frame = cap.read()
            if ret:  # Camera available
                imgbytes = cv2.imencode(".png", frame)[1].tobytes()
                window["-IMAGE1-"].update(data=imgbytes)
        except Exception as e:
            print(f"caught {type(e)}: {e}")
            break
    else:
        displayBlankImage()
cap.release()
window.close()
Code Walk-through
- [1] - Import the libraries used to run the PySimpleGUIWeb server, OpenCV, and NumPy for matrix/image manipulation. Define the height, width, and frames per second for the camera.
- [2] - Define a simple blank, white image display function.
- [3] - Define the UI layout. First create frame_layout, which will contain a separate frame with a title and the video output. Next define the window layout with a title and three buttons, and finally add in the frame_layout created earlier, which will sit under the button section.
- [4] - Create the window object with port 5555, so when connecting to the PySimpleGUIWeb server, the address will be <IP_Addr>:5555. On my board, this is 192.168.1.161:5555.
- [5] - Set up an event loop for the UI and monitor for changes every 500 ms. Button events trigger either exiting the UI, beginning image streaming, or stopping streaming.
Run the code on the target and connect to the board's IP address with port 5555. You should see the following display.
Tap Record to begin streaming via the attached MIPI-CSI or USB web camera. If all goes well, you should see video output being displayed on the PySimpleGUIWeb server.
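If the video frame stays blank after tapping Record, it's worth checking the camera directly with OpenCV outside of the UI. A minimal sketch, assuming the camera enumerates as index 0 (the same index used in the demo code):
import cv2

cap = cv2.VideoCapture(0)          # index 0 typically maps to /dev/video0
print("Camera opened:", cap.isOpened())
ret, frame = cap.read()            # grab a single frame
print("Frame captured:", ret, frame.shape if ret else None)
cap.release()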
Adding the Posture Detection Model
To this point we've gone through setting up the board, installing the TensorFlow Lite library, downloading the model, and quickly building out a UI to access the camera and display the video feed. In this section I'll cover adding in the model, setting up an interpreter to run it, and displaying the results on the same video feed as before.
Before re-writing the short program to add in inferencing, it's helpful to understand the model inputs and outputs. For this, please reference the Movenet model card, which can be found in the Movenet model download link.
The model card provides information about the type of model, input shape, and output details. In this case we are using the lightweight model (Lightning vs. Thunder) and need to supply it a 192x192 image with RGB channels in the range [0-255]. On the output side, it returns a float32 tensor of shape [1, 1, 17, 3]: the 17 keypoint locations (elbow, wrist, nose, etc.), each with a y, x coordinate and a confidence value. With this in mind, let's update the previous code.
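You can also confirm these shapes straight from the downloaded file using the TFLite interpreter. A quick check along these lines (run from the directory containing the model) should report an input shape of [1, 192, 192, 3] and an output shape of [1, 1, 17, 3]:
import tflite_runtime.interpreter as tflite

interpreter = tflite.Interpreter(model_path='lite-model_movenet_singlepose_lightning_tflite_int8_4.tflite')
interpreter.allocate_tensors()
# Print the tensor shapes and dtypes the model expects and produces
print(interpreter.get_input_details()[0]['shape'], interpreter.get_input_details()[0]['dtype'])
print(interpreter.get_output_details()[0]['shape'], interpreter.get_output_details()[0]['dtype'])
With the shapes confirmed, the code below adds the model to the demo.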
# [1]
import tflite_runtime.interpreter as tflite
model_path = '/home/PostureDetectionDemo/lite-model_movenet_singlepose_lightning_tflite_int8_4.tflite'
# [2]
# Maps bones to a matplotlib color name.
EDGES = {
    (0, 1): 'm',
    (0, 2): 'c',
    (1, 3): 'm',
    (2, 4): 'c',
    (0, 5): 'm',
    (0, 6): 'c',
    (5, 7): 'm',
    (7, 9): 'm',
    (6, 8): 'c',
    (8, 10): 'c',
    (5, 6): 'y',
    (5, 11): 'm',
    (6, 12): 'c',
    (11, 12): 'y',
    (11, 13): 'm',
    (13, 15): 'm',
    (12, 14): 'c',
    (14, 16): 'c'
}
# [3]
def draw_connections(frame, keypoints, edges, confidence_threshold):
    y, x, c = frame.shape
    shaped = np.squeeze(np.multiply(keypoints, [y, x, 1]))
    for edge, color in edges.items():
        p1, p2 = edge
        y1, x1, c1 = shaped[p1]
        y2, x2, c2 = shaped[p2]
        if (c1 > confidence_threshold) & (c2 > confidence_threshold):
            cv2.line(frame, (int(x1), int(y1)), (int(x2), int(y2)), (0, 0, 255), 2)

def draw_keypoints(frame, keypoints, confidence_threshold):
    y, x, c = frame.shape
    shaped = np.squeeze(np.multiply(keypoints, [y, x, 1]))
    for kp in shaped:
        ky, kx, kp_conf = kp
        if kp_conf > confidence_threshold:
            cv2.circle(frame, (int(kx), int(ky)), 4, (0, 255, 0), -1)

# [4]
def run_tflite_model(input_frame):
    interpreter = tflite.Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()
    # resize image to (192, 192) for model
    img = cv2.resize(input_frame, (192, 192))
    input_data = np.expand_dims(img, axis=0)
    interpreter.set_tensor(input_details[0]['index'], input_data)
    interpreter.invoke()
    # [1,1,17,3] data format; last dimension contains the y,x,confidence values
    keypoints = interpreter.get_tensor(output_details[0]['index'])
    # Render Keypoints/Connections on the original full-size frame
    draw_connections(input_frame, keypoints, EDGES, 0.4)
    draw_keypoints(input_frame, keypoints, 0.4)
    imgBytes = cv2.imencode('.png', input_frame)[1].tobytes()
    return imgBytes
- [1] - Import the tflite_runtime library and add the path to your model.
- [2] - Add in a dictionary that maps all 17 keypoints and their relative locations to one another. This will be used to render connections between keypoints and is borrowed directly from the Google Colab notebook provided with the Movenet model on Kaggle (see the keypoint ordering sketch after this list).
- [3] - Add these two drawing functions: one to draw visuals on the keypoints that meet a confidence threshold, and the other to form connections between those keypoints.
- [4] - This function takes in an input frame, sets up an interpreter, and executes the model on the image. A resize is required to scale the image from 640x480 to 192x192, and we expand the dimensions of the image so it can be used as an input to the model. We then invoke the interpreter to get the output results, in this case keypoints. The keypoints are fed into the two drawing functions along with the frame data, which is finally passed back to the UI as image data to be displayed.
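For reference, the 17 keypoints follow MoveNet's fixed ordering, which is what the indices in the EDGES dictionary refer to. A small helper along these lines can pull a single named keypoint out of the [1, 1, 17, 3] output; the KEYPOINT_INDEX dictionary and get_keypoint() function are purely illustrative and not part of the project code:
# MoveNet keypoint ordering (the indices referenced by the EDGES dictionary)
KEYPOINT_INDEX = {
    'nose': 0, 'left_eye': 1, 'right_eye': 2, 'left_ear': 3, 'right_ear': 4,
    'left_shoulder': 5, 'right_shoulder': 6, 'left_elbow': 7, 'right_elbow': 8,
    'left_wrist': 9, 'right_wrist': 10, 'left_hip': 11, 'right_hip': 12,
    'left_knee': 13, 'right_knee': 14, 'left_ankle': 15, 'right_ankle': 16,
}

def get_keypoint(keypoints, name):
    # keypoints is the [1, 1, 17, 3] model output; each entry is (y, x, confidence) with y, x normalized 0-1
    y, x, conf = keypoints[0][0][KEYPOINT_INDEX[name]]
    return y, x, conf
One other thing worth noting about run_tflite_model(): it creates a new interpreter and calls allocate_tensors() on every frame, which keeps the example simple. If you're after a higher frame rate, the interpreter could be created once at start-up and reused inside the loop.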
Now with the TFLite portion added, we need to update the UI to call the run_tflite_model() method and display the results live.
# Live recording/streaming
if recording:
    try:
        ret, frame = cap.read()
        if ret:  # Camera available
            # [5]
            tflite_model_imgbytes = run_tflite_model(frame)
            window["-IMAGE1-"].update(data=tflite_model_imgbytes)
    except Exception as e:
        print(f"caught {type(e)}: {e}")
        break
else:
    displayBlankImage()
This is the same portion of code that was previously used to render the video feed from our camera, with the exception of the code under [5].
- [5] - Replace the previous image encoding with the new output from the TFLite model. The output is already set up to be handled by PySimpleGUIWeb, so we just pass the new data into -IMAGE1-.
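As a quick aside, if you're curious how long each inference takes on the board, the time module that's already imported can be used to wrap the call. A hypothetical sketch of what the lines under [5] would become:
start = time.time()
tflite_model_imgbytes = run_tflite_model(frame)
print(f"inference + drawing took {time.time() - start:.3f} s")
window["-IMAGE1-"].update(data=tflite_model_imgbytes)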
Final code with posture detection model and interpreter added:
import time
import cv2
import PySimpleGUIWeb as sg
import numpy as np
import tflite_runtime.interpreter as tflite
model_path = '/home/PostureDetectionDemo/lite-model_movenet_singlepose_lightning_tflite_int8_4.tflite'
HEIGHT = 480
WIDTH = 640
FPS = 30
def displayBlankImage():
    # Blank white image; uint8 is the depth cv2.imencode expects
    img = np.full((HEIGHT, WIDTH), 255, dtype=np.uint8)
    imgbytes = cv2.imencode(".png", img)[1].tobytes()
    window["-IMAGE1-"].update(data=imgbytes)
# Maps bones to a matplotlib color name.
EDGES = {
    (0, 1): 'm',
    (0, 2): 'c',
    (1, 3): 'm',
    (2, 4): 'c',
    (0, 5): 'm',
    (0, 6): 'c',
    (5, 7): 'm',
    (7, 9): 'm',
    (6, 8): 'c',
    (8, 10): 'c',
    (5, 6): 'y',
    (5, 11): 'm',
    (6, 12): 'c',
    (11, 12): 'y',
    (11, 13): 'm',
    (13, 15): 'm',
    (12, 14): 'c',
    (14, 16): 'c'
}
def draw_connections(frame, keypoints, edges, confidence_threshold):
    y, x, c = frame.shape
    shaped = np.squeeze(np.multiply(keypoints, [y, x, 1]))
    for edge, color in edges.items():
        p1, p2 = edge
        y1, x1, c1 = shaped[p1]
        y2, x2, c2 = shaped[p2]
        if (c1 > confidence_threshold) & (c2 > confidence_threshold):
            cv2.line(frame, (int(x1), int(y1)), (int(x2), int(y2)), (0, 0, 255), 2)

def draw_keypoints(frame, keypoints, confidence_threshold):
    y, x, c = frame.shape
    shaped = np.squeeze(np.multiply(keypoints, [y, x, 1]))
    for kp in shaped:
        ky, kx, kp_conf = kp
        if kp_conf > confidence_threshold:
            cv2.circle(frame, (int(kx), int(ky)), 4, (0, 255, 0), -1)

def run_tflite_model(input_frame):
    interpreter = tflite.Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()
    # resize image to (192, 192) for model
    img = cv2.resize(input_frame, (192, 192))
    input_data = np.expand_dims(img, axis=0)
    interpreter.set_tensor(input_details[0]['index'], input_data)
    interpreter.invoke()
    # [1,1,17,3] data shape; last dimension contains the y,x,confidence values
    keypoints = interpreter.get_tensor(output_details[0]['index'])
    # Render Keypoints/Connections on the original full-size frame
    draw_connections(input_frame, keypoints, EDGES, 0.4)
    draw_keypoints(input_frame, keypoints, 0.4)
    imgBytes = cv2.imencode('.png', input_frame)[1].tobytes()
    return imgBytes
sg.theme('Black')
frame_layout1 = [
    [sg.Text("Video Output")],
    [sg.Image(filename='', key='-IMAGE1-')]
]
layout = [
    [sg.Text('MaaXBoard 8ULP Posture Detect Demo', size=(40, 1), justification='left', font='Helvetica 20')],
    [
        sg.Button('Record', size=(10, 1), font='Helvetica 14', key='-RECORD-'),
        sg.Button('Stop', size=(10, 1), font='Any 14', key='-STOP-'),
        sg.Button('Exit', size=(10, 1), font='Helvetica 14', key='-EXIT-'),
    ],
    [sg.Frame("TF Lite Model", frame_layout1, font="Any 14", title_color="blue")],
]
# create the window and show it without the plot
window = sg.Window('Demo Application',
                   layout, location=(800, 400), web_port=5555, web_start_browser=False)
recording = False
while True:
    event, values = window.read(timeout=500)
    # print(event, values)
    if event == '-EXIT-' or event == sg.WIN_CLOSED:
        break
    elif event == '-RECORD-':
        cap = cv2.VideoCapture(0)
        cap.set(cv2.CAP_PROP_FRAME_WIDTH, WIDTH)
        cap.set(cv2.CAP_PROP_FRAME_HEIGHT, HEIGHT)
        cap.set(cv2.CAP_PROP_FPS, FPS)
        recording = True
    elif event == '-STOP-':
        recording = False
        displayBlankImage()
        cap.release()
    if recording:
        try:
            ret, frame = cap.read()
            if ret:
                tflite_model_imgbytes = run_tflite_model(frame)
                window['-IMAGE1-'].update(data=tflite_model_imgbytes)
        except Exception as e:
            print(f'caught {type(e)}: {e}')
            break
    else:
        displayBlankImage()
cap.release()
window.close()
Posture Detection Result
With the new changes added to the previous code, let's re-run the application and begin recording.
We can see the model accurately calculating the position of the wrist, elbow, and facial points as expected. The model will also show lower-extremity keypoints if they are visible in the input frame.
Summary
In this project we showed how to incorporate an open-source posture detection model on the MaaXBoard 8ULP and use PySimpleGUIWeb to quickly build a web UI to render the results.
There are other ways of efficiently handling image resizing and visualizing keypoints on the image, but I hope this provided a way to quickly incorporate and exercise a TensorFlow Lite model for evaluation in your projects.