Check out our YouTube video of this project!
We were stuck inside on the first rainy days of SF's winter and decided to revisit an old project called "Shoot Your Shot."
We built this application for a client to demo their data streaming platform interactively with conference attendees in their booth.
We used powerful machine vision cameras to acquire images at high frame rates while maintaining high resolution.
The demo used a camera to track darts in flight toward the dartboard and estimated dart placement for scoring. We chose a perspective over the thrower's right shoulder since most people are right-handed.
At the same time, we wanted to analyze the thrower's pose to present information about body mechanics and game performance.
Again, we frame the shot to take advantage of a typical right-handed thrower's stance.
From these two camera perspectives, we derive many other views and statistics using computer vision.
In the end, our demo application looked like this:
Fun Fact: Forearm length doesn't vary all that much from person to person! In fact, the cubit is an ancient unit of distance based on this observation. Because we estimate body keypoints, the cubit is also the unit by which we measure distance to estimate dart velocity above!
To fully leverage these powerful cameras, we experimented with the NVIDIA TX2 and the new Coral Dev Board (EdgeTPU). We even created some resources around setting up OpenCV, ROS, and other useful tools for this new hardware.
For our home version, we ordered a classic Vogelpik dartboard from here.
This time, we ran processing with the Jetson Nano.
We've found that using multiple camera angles helps ensure adequate coverage for high-action sequences, so we used three additional webcams along with two GoPros AND a Nikon DSLR to document the events of our hackathon.
To manage all 5 USB cameras, we trigger them all simultaneously by broadcasting messages over MQTT at the push of a button.
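A broadcast trigger like this can be sketched with the paho-mqtt client. The topic name, payload schema, and broker host below are hypothetical stand-ins, not the values from our demo:

```python
import json
import time

TOPIC = "cameras/trigger"  # hypothetical topic name

def trigger_payload(session_id):
    """Build a JSON trigger message (hypothetical schema)."""
    return json.dumps({"session": session_id, "cmd": "capture", "t": time.time()})

def broadcast(payload, host="localhost"):
    """Publish one message; every subscribed camera node fires on receipt."""
    import paho.mqtt.client as mqtt  # deferred so the helper above runs without a broker
    client = mqtt.Client()
    client.connect(host)
    client.publish(TOPIC, payload, qos=1)
    client.disconnect()
```

Because MQTT is publish/subscribe, adding a sixth camera means subscribing one more node to the topic with no changes on the button side.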
Beginning with a tight frame around the dartboard, we made a qualitative review of each camera's supported resolutions and frame rates with tools like Cheese.
You can get a more detailed view of your camera's specs by running:
v4l2-ctl --list-formats-ext --device /dev/video0
For such a tightly focused shot, the primary visual cue we use to track darts is motion. Background subtraction lets us isolate the regions of the image containing a dart, and even helps us filter out shadows.
Next, by performing circle detection, we can determine where the board is relative to the framing of the shot while remaining robust to small changes in demo setup.
Then we can tune parameters in a blob detection module to track darts with acceptable precision/recall. Here, we accept more false positives to keep high recall. This works because we are using simple, lightweight processing and can budget additional compute to improve precision.
Next, by computing the distance between the dart and the center of the circle, we have an important clue for estimating the score of the dart throw.
After many trials, we can visualize the changing distribution of dart placements for a mesmerizing view of our play over time.
We save small image crops around the blob to review dart detection performance. Using a convenient file naming convention, we encode important information relating time and blob/bullseye positions.
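The original naming scheme isn't spelled out here, so the convention below is a hypothetical example of encoding a timestamp and the blob/bullseye pixel coordinates in a filename:

```python
import os

def crop_name(ts_ms, blob_xy, bull_xy):
    """Encode capture time and blob/bullseye pixel coords in a filename
    (hypothetical convention; the original scheme isn't specified)."""
    bx, by = blob_xy
    cx, cy = bull_xy
    return f"crop_{ts_ms}_{bx}x{by}_{cx}x{cy}.png"

def parse_crop_name(name):
    """Recover the encoded fields from a crop filename."""
    stem = os.path.splitext(name)[0]
    _, ts, blob, bull = stem.split("_")
    bx, by = map(int, blob.split("x"))
    cx, cy = map(int, bull.split("x"))
    return int(ts), (bx, by), (cx, cy)
```

Keeping the metadata in the filename means a directory listing alone is enough to rebuild the time series, with no sidecar database required.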
Reviewing these image crops in a jupyter notebook, we can consider heuristics relating dart distance from bullseye, absolute location, and local image features to reduce false positives.
Similarly, we can train a classifier to return a score based on this information.
To analyze the player's form, we use pose estimation to track body parts through a throwing session.
The original "Shoot Your Shot" used a popular pose estimation repo. By estimating joint angles and forearm length, we could track darts to estimate velocity and analyze throwing form.
Recently, the NVIDIA developer team released trt_pose, which uses TensorRT to accelerate pose estimation up to 22 FPS on the Nano. In our repo, we've included scripts to run the optimized model with a USB camera and the RealSense D415 camera.
Now that all the cameras can be synchronized and we can analyze the dart board as well as the player's throw, we shot many darts to gather lots of training data.
We experimented with Amazon Kinesis Video Streams on our Pis for acquiring video. Using S3 and DynamoDB, we were able to store numerical data and images.
With the small image crops around the blobs we detected using background subtraction, we can train a machine learning algorithm to score the dart throw.
For this particular game, scoring works so that a throw landing on the:
- largest red ring scores 5 points
- next smaller white ring scores 10 points
- next smaller blue ring scores 15 points
- smallest white ring scores 20 points
- smallest red ring scores 25 points
- bullseye scores 50 points
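Combined with the dart-to-center distance computed earlier, the rules above amount to a lookup over ring boundaries. The normalized radii below are guesses for illustration; a real version would calibrate them against the detected board radius:

```python
# Ring boundaries as fractions of the board radius (hypothetical values;
# calibrate against your own board), checked inside-out.
RINGS = [
    (0.08, 50),     # bullseye
    (0.20, 25),     # smallest red ring
    (0.35, 20),     # smallest white ring
    (0.55, 15),     # blue ring
    (0.78, 10),     # next smaller white ring
    (1.00, 5),      # largest red ring
]

def score_from_distance(r_norm):
    """Map a dart's normalized distance from the bullseye to a score."""
    for outer, points in RINGS:
        if r_norm <= outer:
            return points
    return 0  # off the board
```

This rule-based score is also a handy sanity check on whatever the learned classifier predicts.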
By grouping similar scoring image crops into the same directories, we could easily frame an image classification task.
Since we have additional context in the form of the dart's distance from the bullseye, we concatenate our HOG feature description vector and the normalized radial distance.
The resulting vector will encode both visual cues like color and texture as well as distance from the bullseye to help our model predict the contribution of the throw to our score.
With simple images and features, we turn to a lightweight and simple machine learning algorithm: gradient boosting, in particular, using the xgboost library.
Next, we see the processing is light enough for real-time scoring.
Adding more data and refining our baseline models, we can develop a more accurate application to track darts and count each throw's contribution to the overall score.
Here we see that the primary challenge in scoring reduces to deduplicating blob detections that produce different scores for the same dart.
To accomplish this, we introduce a couple of data structures to track space-time collisions between detected blobs. The additional bookkeeping also makes it easy to smooth the computed score by taking the median over these observations and estimates.
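One way to sketch that bookkeeping: detections close in both position and frame index are folded into one track, and each track reports the median of its scores. The class name and thresholds below are hypothetical:

```python
from statistics import median

class BlobTracker:
    """Group detections that collide in space-time and smooth their scores.
    (Illustrative sketch; distance/gap thresholds are hypothetical.)"""

    def __init__(self, max_dist=15, max_gap=5):
        self.max_dist = max_dist   # pixels
        self.max_gap = max_gap     # frames
        self.tracks = []           # each: {"x", "y", "frame", "scores"}

    def add(self, frame, x, y, score):
        for t in self.tracks:
            close = abs(t["x"] - x) <= self.max_dist and abs(t["y"] - y) <= self.max_dist
            recent = frame - t["frame"] <= self.max_gap
            if close and recent:           # same dart seen again
                t["scores"].append(score)
                t["frame"] = frame
                return
        self.tracks.append({"x": x, "y": y, "frame": frame, "scores": [score]})

    def scores(self):
        """One median-smoothed score per distinct dart."""
        return [median(t["scores"]) for t in self.tracks]
```

The median makes a single misclassified frame harmless: three observations of one dart scored 15, 20, 15 still settle on 15.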
Then, if we track multiple objects, we can total the score once 4 darts have been thrown and attribute it to the current thrower in a game.
Using the body key points collected, we could try to learn an association between a player's stance or movement and their score or other performance related stats.
Creating a simple application was the first step to gathering lots of training data. With lots of samples, many models can be trained to improve our game.
We've included a repo with the code we used to run this demo. Happy tinkering!