Check out our YouTube video of this project!
We were stuck inside on the first rainy days of SF's winter and decided to revisit an old project called "Shoot Your Shot."
We built this application for a client to demo their data streaming platform interactively with conference attendees in their booth.
We used powerful machine vision cameras to acquire images at high frame rates while maintaining high resolution.
The demo used a camera to track darts in flight toward the dartboard and estimated dart placement for scoring. We chose a perspective over the thrower's right shoulder since most people are right-handed.
At the same time, we wanted to analyze the thrower's pose to present information about body mechanics and game performance.
Again, we frame the shot to take advantage of a typical right-handed thrower's stance.
From these two camera perspectives, we derive many other views and statistics using computer vision.
In the end, our demo application looked like this:
Fun Fact: Forearm length doesn't vary all that much from person to person! In fact, the cubit is an ancient unit of distance based on this observation. Because we estimate body keypoints, the cubit is also the unit by which we measure distance to estimate dart velocity above!
To fully leverage these powerful cameras, we experimented with the NVIDIA TX2 and the new Coral Dev Board (EdgeTPU). We even created some resources around setting up OpenCV, ROS, and other useful tools for this new hardware.
For our home version, we ordered a classic Vogelpik dartboard from here.
This time, we ran processing with the Jetson Nano.
We've found that using multiple camera angles helps ensure adequate coverage for high-action sequences, so we used three additional webcams along with two GoPros AND a Nikon DSLR to document the events of our hackathon.
To manage all 5 USB cameras, we trigger them all simultaneously by broadcasting messages over MQTT at the push of a button.
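A broadcast trigger like this can be sketched with the paho-mqtt client. The topic name, payload schema, and broker host below are hypothetical stand-ins, not the values from our demo:

```python
import json
import time

TOPIC = "cameras/trigger"  # hypothetical topic name

def trigger_payload(session_id):
    """Build a JSON trigger message (hypothetical schema)."""
    return json.dumps({"session": session_id, "cmd": "capture", "t": time.time()})

def broadcast(payload, host="localhost"):
    """Publish one message; every subscribed camera node fires on receipt."""
    import paho.mqtt.client as mqtt  # deferred so the helper above runs without a broker
    client = mqtt.Client()
    client.connect(host)
    client.publish(TOPIC, payload, qos=1)
    client.disconnect()
```

Because MQTT is publish/subscribe, adding a sixth camera means subscribing one more node to the topic with no changes on the button side.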
Beginning with a tight frame around the dartboard, we made a qualitative review of each camera's supported resolutions and frame rates with tools like Cheese.
You can get a more detailed view of your camera's specs by running:
v4l2-ctl --list-formats-ext --device /dev/video0
For such a tightly focused shot, the primary visual cue we use to track darts is motion. Background subtraction lets us isolate the regions of the image containing a dart, and even helps us filter out shadows.
Next, by performing circle detection, we can determine where the board is relative to the framing of the shot while remaining robust to small changes in demo setup.
Then we can tune parameters in a blob detection module to track darts with acceptable precision/recall. Here, we accept more false positives to keep high recall. This works because we are using simple, lightweight processing and can budget additional compute to improve precision.
Next, by computing the distance between the dart and the center of the circle, we have an important clue for estimating the score of the dart throw.
After many trials, we can visualize the changing distribution of dart placements for a mesmerizing view of our play over time.
We save small image crops around the blob to review dart detection performance. Using a convenient file naming convention, we encode important information relating time and blob/bullseye positions.
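The original naming scheme isn't spelled out here, so the convention below is a hypothetical example of encoding a timestamp and the blob/bullseye pixel coordinates in a filename:

```python
import os

def crop_name(ts_ms, blob_xy, bull_xy):
    """Encode capture time and blob/bullseye pixel coords in a filename
    (hypothetical convention; the original scheme isn't specified)."""
    bx, by = blob_xy
    cx, cy = bull_xy
    return f"crop_{ts_ms}_{bx}x{by}_{cx}x{cy}.png"

def parse_crop_name(name):
    """Recover the encoded fields from a crop filename."""
    stem = os.path.splitext(name)[0]
    _, ts, blob, bull = stem.split("_")
    bx, by = map(int, blob.split("x"))
    cx, cy = map(int, bull.split("x"))
    return int(ts), (bx, by), (cx, cy)
```

Keeping the metadata in the filename means a directory listing alone is enough to rebuild the time series, with no sidecar database required.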
Reviewing these image crops in a jupyter notebook, we can consider heuristics relating dart distance from bullseye, absolute location, and local image features to reduce false positives.
Similarly, we can train a classifier to return a score based on this information.
To analyze the player's form, we use pose estimation to track body parts through a throwing session.
The original "Shoot Your Shot" used a popular pose estimation repo. By estimating joint angles and forearm length, we could track darts to estimate velocity and analyze throwing form.
Recently, the NVIDIA developer team released trt_pose, which uses TensorRT to accelerate pose estimation up to 22 FPS on the Nano. In our repo, we've included scripts to run the optimized model with a USB camera and the RealSense D415 camera.
Now that all the cameras can be synchronized and we can analyze the dart board as well as the player's throw, we shot many darts to gather lots of training data.
We experimented with Amazon Kinesis Video Streams on our Pis for acquiring video. Using S3 and DynamoDB, we were able to store numerical data and images.
With the small image crops around the blobs we detected using background subtraction, we can train a machine learning algorithm to score the dart throw.
For this particular game, scoring works so that a throw landing on the:
- largest red ring scores 5 points
- next smaller white ring scores 10 points
- next smaller blue ring scores 15 points
- smallest white ring scores 20 points
- smallest red ring scores 25 points
- bullseye scores 50 points
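Combined with the dart-to-center distance computed earlier, the rules above amount to a lookup over ring boundaries. The normalized radii below are guesses for illustration; a real version would calibrate them against the detected board radius:

```python
# Ring boundaries as fractions of the board radius (hypothetical values;
# calibrate against your own board), checked inside-out.
RINGS = [
    (0.08, 50),     # bullseye
    (0.20, 25),     # smallest red ring
    (0.35, 20),     # smallest white ring
    (0.55, 15),     # blue ring
    (0.78, 10),     # next smaller white ring
    (1.00, 5),      # largest red ring
]

def score_from_distance(r_norm):
    """Map a dart's normalized distance from the bullseye to a score."""
    for outer, points in RINGS:
        if r_norm <= outer:
            return points
    return 0  # off the board
```

This rule-based score is also a handy sanity check on whatever the learned classifier predicts.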
By grouping similar scoring image crops into the same directories, we could easily frame an image classification task.
Since we have additional context in the form of the dart's distance from the bullseye, we concatenate our HOG feature description vector and the normalized radial distance.
The resulting vector will encode both visual cues like color and texture as well as distance from the bullseye to help our model predict the contribution of the throw to our score.
With simple images and features, we turn to a lightweight and simple machine learning algorithm: gradient boosting, in particular, using the xgboost library.
Next, we see the processing is light enough for real-time scoring.
Adding more data and refining our baseline models, we can develop a more accurate application to track darts and count each throw's contribution to the overall score.
Here we see that the primary challenge in scoring reduces to deduplicating blob detections that produce different scores for the same dart.
To accomplish this, we introduce a couple of data structures to track space-time collisions between detected blobs. The additional bookkeeping also makes it easy to smooth the computed score by taking the median over these observations and estimates.
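One way to sketch that bookkeeping: detections close in both position and frame index are folded into one track, and each track reports the median of its scores. The class name and thresholds below are hypothetical:

```python
from statistics import median

class BlobTracker:
    """Group detections that collide in space-time and smooth their scores.
    (Illustrative sketch; distance/gap thresholds are hypothetical.)"""

    def __init__(self, max_dist=15, max_gap=5):
        self.max_dist = max_dist   # pixels
        self.max_gap = max_gap     # frames
        self.tracks = []           # each: {"x", "y", "frame", "scores"}

    def add(self, frame, x, y, score):
        for t in self.tracks:
            close = abs(t["x"] - x) <= self.max_dist and abs(t["y"] - y) <= self.max_dist
            recent = frame - t["frame"] <= self.max_gap
            if close and recent:           # same dart seen again
                t["scores"].append(score)
                t["frame"] = frame
                return
        self.tracks.append({"x": x, "y": y, "frame": frame, "scores": [score]})

    def scores(self):
        """One median-smoothed score per distinct dart."""
        return [median(t["scores"]) for t in self.tracks]
```

The median makes a single misclassified frame harmless: three observations of one dart scored 15, 20, 15 still settle on 15.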
Then, if we track multiple objects, we can total the score once 4 darts have been thrown and attribute it to the current thrower in a game.
Using the body key points collected, we could try to learn an association between a player's stance or movement and their score or other performance related stats.
Creating a simple application was the first step to gathering lots of training data. With lots of samples, many models can be trained to improve our game.
We've included a repo with the code we used to run this demo. Happy tinkering!