In part one of the Dobble Challenge tutorial, we learned how the math works in the card game Dobble (Spot It is the U.S. name), downloaded the dataset, installed the required libraries, and ran the script to look at the dataset and see some examples of how cards can be matched.
In Part Two, we will explore how to train your own Dobble-solving model using Keras, a deep learning framework built on TensorFlow 2. To improve our model's accuracy, we'll also use Keras's built-in data generator to augment the dataset.
Prerequisites
In part one, we set up everything that's required for this tutorial, including a bunch of Python libraries. Go back and do that if you haven't installed them yet.
Although not required, it's helpful to have a deck of Dobble or Spot-it cards to be able to test the model that you create, and optionally enhance the dataset with even more card images.
If you want to create your own dataset, it's also helpful (though optional) to have Matlab installed.
TRAIN THE MODEL
In the previous tutorial, you downloaded the Dobble card dataset. It contains 10 decks of 55 cards each (57 in the case of deck 1).
We'll be using this data to train our model.
Download or clone the source code if you haven't already, then open the file 'dobble_tutorial.py'. We'll be running this code, and it takes several minutes, so you might as well start it now. Run:
python dobble_tutorial.py
If you set up everything correctly, it should start running. It prints the decks in the dataset, and then it will output a model summary, like this:
If you open the file, you will see the model structure defined on lines 135-149. We're doing image classification, so we're using the tried-and-true method of sandwiching subsampling MaxPooling layers between convolutional layers, adding a dropout layer to correct for overfitting, and finally flattening and adding the dense/activation layers.
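For reference, here's a minimal sketch of that kind of architecture in Keras. The filter counts, input size, and class count below are illustrative assumptions, not the exact values from dobble_tutorial.py (check lines 135-149 for those):

# Sketch of a Conv / MaxPooling / Dropout / Dense classifier in Keras.
# All sizes below are illustrative; see dobble_tutorial.py for the real ones.
from tensorflow.keras import layers, models

nrows, ncols, nchannels = 224, 224, 3   # assumed input image size (RGB)
nclasses = 58                           # assumption: one class per unique card face

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu',
                  input_shape=(nrows, ncols, nchannels)),
    layers.MaxPooling2D((2, 2)),        # subsample after each conv block
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Dropout(0.5),                # combat overfitting
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(nclasses, activation='softmax'),  # one output per card
])
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.summary()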
Once the model has trained, another cool way to view it is with Netron, an open-source model viewer. The training script will output two files. As you can see from the code on line 182, the files are:
model.save_weights('dobble_model_weights.h5')
model.save('dobble_model.h5')
You can open both the model file and the weights file in Netron to visualize them. The model file includes the weights as well as the structure of the model, while the weights file only includes the weights. (You can also save just the structure in Keras by using the model.to_json() function.)
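If you want to try that, here's a small sketch of saving just the structure and restoring it later; the JSON filename is hypothetical:

# Sketch: saving and restoring only the model architecture (no weights).
# 'model' is the trained Keras model from dobble_tutorial.py.
from tensorflow.keras.models import model_from_json

with open('dobble_model_structure.json', 'w') as f:
    f.write(model.to_json())            # structure only, as JSON text

with open('dobble_model_structure.json') as f:
    rebuilt = model_from_json(f.read()) # rebuild the (untrained) model
rebuilt.load_weights('dobble_model_weights.h5')  # then load the saved weights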
Once training is done, the script tests the model on the test set and spits out an accuracy estimate:
./dobble_dataset/dobble_test01_cards : Test Accuracy = 0.9166666666666666
If you would like a more precise accuracy estimate, run:
python dobble_test.py
This script will give you a range of accuracies for our test set of 1,200 cards. Accuracy in machine learning is never just a single number: depending on which card images happen to be in a test set (or come up during real-life classification), a model might classify more or fewer of them correctly, so it will report a different accuracy percentage for every new test set it sees.
The true accuracy of the model is the distribution of accuracies over every possible card the model might see. There are infinite possibilities for different lighting, slightly different card rotations or angle the card is shown at, etc. For some of these images, the model will perform great! For others, not so much. Thanks to statistics, we can estimate the true accuracy using "error bounds" (aka margin of error).
For instance, we can say with 50% likelihood (this is known as the confidence level) that the true accuracy is between 86% and 97%. This range is known as the "error bound," or confidence interval. When presenting accuracies in papers and at conferences, people tend to use a 95% confidence level.
The more cards in your test set, the more probable that the observed accuracy is close to the true accuracy, so your error bounds will become narrower.
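As a rough illustration, here's one common way to compute such a bound, a normal-approximation binomial interval. This is just a sketch using the numbers above, not necessarily the exact calculation dobble_test.py performs:

# Sketch: normal-approximation (Wald) error bound for an observed accuracy.
import math

accuracy = 0.9167   # observed accuracy on the test set
n = 1200            # number of test cards
z = 1.96            # z-score for a 95% confidence level

margin = z * math.sqrt(accuracy * (1 - accuracy) / n)
print(f"True accuracy is likely between "
      f"{accuracy - margin:.3f} and {accuracy + margin:.3f}")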
Try running the script
Probably the most useful test of accuracy in our case is testing it live using the script dobble_detect_live.py, but first we need to make sure the imutils Python library is installed.
pip3 install imutils
Open dobble_detect_live.py. Depending on whether you're using your built-in laptop camera or a USB webcam, uncomment line 48 or line 49 to select the camera you want to use for the computer vision input:
#input_video = 0 # laptop camera
input_video = 1 # USB webcam
Now run the live detection application (it will take a while to initialize the video stream, be patient):
python dobble_detect_live.py
You'll see a window pop up with your camera image, as well as some sliders. You can place two or more cards in the camera frame and it will tell you which symbol the cards have in common. You'll notice that in live gameplay it makes a lot of mistakes (so it's easier to play against! Bug or feature?).
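For context, the final matching step is simple once the model has identified each card: look up each card's symbols and intersect the sets. The mapping below is a made-up stand-in for whatever lookup table the script actually uses:

# Sketch of the matching step. Card IDs and symbol names are illustrative.
card_symbols = {
    12: {"anchor", "apple", "bomb", "cactus", "candle", "carrot", "cheese", "clock"},
    37: {"anchor", "bird", "bottle", "clown", "daisy", "dice", "dragon", "eye"},
}

def common_symbol(card_a: int, card_b: int) -> str:
    """Every pair of Dobble cards shares exactly one symbol."""
    shared = card_symbols[card_a] & card_symbols[card_b]
    return shared.pop()

print(common_symbol(12, 37))  # -> "anchor"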
Most machine learning models use thousands of images for each category they would like to classify. Our dataset only has 10 images of each card, so our poor model is doing its very best given the sparse dataset! In order to improve the accuracy of our model, we'll have to augment our dataset.
AUGMENTING THE DATA
There are basically two ways to augment our dataset. One is to add your own images. The other is to generate totally new images from scratch.
1. Generating new images
Let's start with the most fun and easiest method: augmenting our dataset automatically. To generate entirely new images from scratch, you can use what's known as a generative adversarial network, or GAN.
Luckily, Keras has a built-in image generator class designed just for augmenting data! In fact, we're already using it in the training script, dobble_tutorial.py, on line 176. It currently randomly rotates each image in the dataset by 0-360 degrees, as well as rescaling the images. You can play with the values here to see if you get better accuracy.
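A minimal sketch of that kind of on-the-fly augmentation looks something like the following; the exact arguments and image size in dobble_tutorial.py may differ:

# Sketch of an on-the-fly augmentation setup like the one on line 176.
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import numpy as np

# Stand-in data: 8 random "card images" (the real script loads the dataset)
images = np.random.rand(8, 224, 224, 3).astype("float32")
labels = np.arange(8)

datagen = ImageDataGenerator(
    rotation_range=360,    # cards can appear at any angle
    rescale=1.0 / 255.0,   # scale pixel values to [0, 1]
)

# flow() yields freshly augmented batches; model.fit() can consume it directly
batch_images, batch_labels = next(datagen.flow(images, labels, batch_size=4))
print(batch_images.shape)  # (4, 224, 224, 3)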
Keras's built-in image generator generates images on the fly, so you can feed them into your model as you train it. However, I was unable to find a way to make it generate more images on the fly than are currently in your dataset (if you know how to do this, please let me know in the comments). What we really want to do is to generate an entirely new dataset with lots more images that will augment our current dataset, so I wrote a script to do just that!
Edit the augmentation scripts
Look in the dobble_dataset folder and select a card deck that you would like to augment, e.g. dobble_deck01_cards_57. Edit the variable deck on line 12 of save_augmented_images.py to be the name of the deck that you would like to augment.
deck = 'dobble_deck01_cards_57'
Note: You can currently only augment one deck at a time.
If you would like to generate more or fewer than 100 augmented images per card, edit the variable total_images_to_augment. With the default of 100, the script will generate 55 folders (57 for the first deck) with 100 images in each. Run:
python save_augmented_images.py
It takes about 30 minutes to generate 100 images per card (without using a GPU). If you restart it after it has already generated new folders, Windows will give you an error, e.g.
PermissionError: [WinError 5] Access is denied: 'dobble_dataset/dobble_deck03_cards_55/augmented/'
Running it a second time will fix the problem.
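For the curious, here's roughly what a script like save_augmented_images.py has to do under the hood. The file names, folder layout, and augmentation arguments below are assumptions for illustration, not the script's exact code:

# Sketch: write augmented copies of one card image to an 'augmented' subfolder.
import os
import numpy as np
from tensorflow.keras.preprocessing.image import (ImageDataGenerator,
                                                  load_img, img_to_array)

deck = 'dobble_deck01_cards_57'
card_image_path = f'dobble_dataset/{deck}/01/card01_01.tif'   # hypothetical file
out_dir = f'dobble_dataset/{deck}/augmented/01'               # hypothetical layout
os.makedirs(out_dir, exist_ok=True)

datagen = ImageDataGenerator(rotation_range=360, zoom_range=0.2,
                             width_shift_range=0.1, height_shift_range=0.1)

image = img_to_array(load_img(card_image_path))
image = np.expand_dims(image, axis=0)        # flow() expects a batch dimension

total_images_to_augment = 100
flow = datagen.flow(image, batch_size=1, save_to_dir=out_dir,
                    save_prefix='aug', save_format='png')
for _ in range(total_images_to_augment):
    next(flow)                               # each call writes one augmented image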
The more augmented data we have, the more likely training will catch the corner cases that will be encountered in the real world. You can augment just one deck, but to improve your model's accuracy, go back and generate even more training data by augmenting each of the 10 provided decks. It will take a while, but in the end it will produce more satisfying results:
deck = 'dobble_deck02_cards_55'
python save_augmented_images.py
Repeat the above as desired, ideally for all 10 of the provided training decks.
Edit the augmentations:
Keras's built-in ImageDataGenerator() class offers a number of techniques for augmenting the data. If you would like to change the amount or type of variation in each image, edit the arguments for the ImageDataGenerator object on line 37. Here are the augmentation techniques that the script currently makes use of, in pictures:
There are also a number of techniques I didn't use, mostly because they didn't make sense for our dataset, and/or because I was too lazy:
- Rotation - has already been done on line 177
- Horizontal and vertical flip - not good for our dataset (also currently being done on line 179?)
- featurewise_center
- samplewise_center
- featurewise_std_normalization
- zca_epsilon
- zca_whitening
- shear_range
- channel_shift_range
More details on each argument can be found in this tutorial from MachineLearningMastery and on the documentation page for the ImageDataGenerator class.
2. Adding your own dataset
This second technique for getting more data is entirely optional, as it's possible to get an accuracy over 99% using just the provided deck images if you augment them as described above. This step requires your own card deck. You can do it either with the Matlab script or right from the dobble_detect_live.py script.
Pro-Tip: Write the card number on the back of each card so you can easily remember which card is which as you take their glamour shots.
Using the Matlab script to augment the deck
I'll go over the Matlab script first. It currently has a few quirks, but we believe you can figure it out!
- Open the Matlab script: /capture/matlab/dobble_capture_cards_03.m
- Open Add-On Explorer to install the Webcam Support Package.
- Run "webcamlist" in Matlab console.
>> webcamlist
- You should get back a list of webcam names, e.g.:
ans =
2×1 cell array
{'HP HD Camera' }
{'HD Webcam C525'}
- Replace the webcam's name in the Matlab script with the name of your webcam.
- Replace the name of the directory to save to as well.
- Run the script. Point your webcam toward the dobble card you wish to take a photo of.
- To take a photo, make sure the edges of your card are fully inside the frame and wait until you see the blue square/red circle around the card.
- Click on the top edge of the image frame to take a photo (see image above). This is a bit finicky, so check your folders every now and then to make sure that the images are being taken correctly.
Using the Python script to capture new training data
By running the live detection script, we can utilize the frame capture feature built into this tool to capture image data out to a local image file in TIFF format:
python dobble_detect_live.py
Position your camera over your deck and snap a photo by pressing the 'w' key on your keyboard!
Pressing the 'w' key results in the application capturing the contents of the red bounding box as a single local .tif image within the local dobble_buddy/output/ folder.
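If you're curious how that kind of key-press capture works, here's a generic OpenCV sketch. The variable names and output path are illustrative, and this is not the actual dobble_detect_live.py code:

# Sketch of a 'w'-key frame capture loop with OpenCV.
import os
import cv2

input_video = 0                      # 0 = laptop camera, 1 = USB webcam
cap = cv2.VideoCapture(input_video)
os.makedirs('output', exist_ok=True)
frame_count = 0

while True:
    ok, frame = cap.read()
    if not ok:
        break
    # The real app computes the red bounding box around the detected card here;
    # this sketch simply saves the whole frame.
    cv2.imshow('dobble capture', frame)
    key = cv2.waitKey(1) & 0xFF
    if key == ord('w'):              # 'w' saves the current frame as a .tif
        frame_count += 1
        cv2.imwrite(f'output/capture_{frame_count:03d}.tif', frame)
    elif key == ord('q'):            # 'q' quits
        break

cap.release()
cv2.destroyAllWindows()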
If you do take more training set decks, please share them with us, as every little bit of training data helps!
RETRAIN THE MODEL WITH AUGMENTED DATA
Note: Depending on how powerful your computer is, training could take days. It took me over 36 hours the first time I ran it. If you would prefer not to spend a day and a half on training, you can also find the pretrained model here.
1. Open dobble_tutorial.py, uncomment the augmented card decks on lines 50-62, and comment out the regular card decks on lines 37-48.
#card_decks = [
# 'dobble_deck01_cards_57',
# 'dobble_deck02_cards_55',
# 'dobble_deck03_cards_55',
# 'dobble_deck04_cards_55',
# 'dobble_deck05_cards_55',
# 'dobble_deck06_cards_55',
# 'dobble_deck07_cards_55',
# 'dobble_deck08_cards_55',
# 'dobble_deck09_cards_55',
# 'dobble_deck10_cards_55'
# ]
# augmented card decks
card_decks = [
'dobble_deck01_cards_57-augmented',
'dobble_deck02_cards_55-augmented',
'dobble_deck03_cards_55-augmented',
'dobble_deck04_cards_55-augmented',
'dobble_deck05_cards_55-augmented',
'dobble_deck06_cards_55-augmented',
'dobble_deck07_cards_55-augmented',
'dobble_deck08_cards_55-augmented',
'dobble_deck09_cards_55-augmented',
'dobble_deck10_cards_55-augmented'
]
2. Rename the models on lines 183 and 184 so that you won't overwrite the models you previously trained. Who knows, you might want to go back to that one if this more accurate one is too hard to play against!
model.save_weights('dobble_model_weights-new.h5')
model.save('dobble_model-new.h5')
3. Train away by running the tutorial script!
python dobble_tutorial.py
During training, Keras helpfully gives an ETA, as well as the loss on the training set vs. the validation set. Showing the loss here helps us know whether our model is overfitting, i.e. it is too well adapted to the training data, so it won't perform as well on real-life data. If our training loss keeps going down but the validation loss starts going back up, we'll know that our model has overfit.
Thankfully, we can see that our validation loss is consistently better than our training loss. This is because we added a dropout layer with a rate of 0.5, which means that in each training pass half of the neurons in the network are dropped to help the model generalize better. That also explains why, on the first epoch, the training loss is roughly double the validation loss!
At the end of training, the script should spit out its final accuracy estimate again.
./dobble_dataset/dobble_test01_cards : Test Accuracy = 0.9960536700868192
Remember, it's just one datapoint in a distribution, but our accuracy is already looking much better!
TEST THE FINAL MODEL ON THE GAME
Ready to play Dobble against the AI you just trained?
If you saved your model under a different filename on lines 183 and 184 of dobble_tutorial.py (so that you didn't overwrite the previously trained models), be sure to update the model filename loaded by the dobble_detect_live.py application on line 88.
# Open dobble model
model = load_model('dobble_model-new.h5')
Run the live detect application:
python dobble_detect_live.py
Hold up two or more cards to the camera and see if you can correctly spot the matching symbol before the AI does!
Thanks for joining! In our next tutorials, we'll show how to deploy this model on specific hardware, like the MaaXBoard Mini and Ultra96 v2!
For running on MaaXBoard hardware, check out this next tutorial here:
https://www.hackster.io/aidventure/run-the-dobble-challenge-on-maaxboard-9cf300
For running on Ultra96 hardware, check out this next tutorial here:
https://www.hackster.io/aidventure/deploying-the-dobble-challenge-on-the-ultra96-v2-f5f78f
References
- The Dobble Dataset: the Dobble dataset, available on Kaggle.
- The Dobble Challenge: getting started with machine learning for the Dobble card game using TensorFlow.