Published May 27, 2020 © CC0

Ransom Note Letter Finder with OpenCV and Tesseract

Use OpenCV and Tesseract on MaaXBoard to find letters in magazine pages for your ransom note.

Intermediate | Full instructions provided | 3 hours | 1,045

Things used in this project

Hardware components

Tria Technologies MaaXBoard

Tria Technologies Avnet 5V/3A USB Type-C power supply

Tria Technologies MaaXBoard 7" MIPI Display

Tria Technologies MaaXBoard Camera Module

Software apps and online services

OpenCV

Story

INTRO

Have you ever needed to create a ransom note, but you just didn't have the time?

Ransom notes are an age-old medium. Not just for criminals, they're also perfect for when you need to leave a passive-aggressive note in your neighbor's mailbox, or for your roommate who keeps stealing your food from the communal fridge.

Unfortunately, it's time-consuming to search for each letter on a page full of text...until now!

Finally you can create all the ransom notes you need, because MaaXBoard running pytesseract and EAST quickly finds the letters for you! Busy dads and harried businesswomen will be ransoming in no time!

PREREQUISITES

See my previous tutorials to do the following:

1. Get MaaXBoard set up in headless mode (about an hour)

2. Install OpenCV (about 2 hours).

ASSEMBLE THE CAMERA AND SCREEN

If you have the MIPI-CSI Camera, plug it into the board, blue side facing the board's ethernet port.

If you're using the MIPI-CSI display, plug the cable into the board, blue side facing the edge of the board.

Other than that, you're basically done with the assembly. You just need to elevate the board so that the camera can see a full page of text. I just set mine on an old shoebox.

GET THE MIPI-DISPLAY TO WORK (optional)

By default, the MaaXBoard is configured for HDMI, so you'll have to edit uEnv.txt to get the display to work. Log in via remote desktop or ssh using the "root" login (default password is "avnet").

Edit uEnv.txt using geany:

geany /boot/uEnv.txt

Comment out the first line if it's uncommented:

# fdt_file=em-sbc-imx8m.dtb

Uncomment the third line:

fdt_file=em-sbc-imx8m-dcss-dsi.dtb

GIVE YOURSELF CAMERA PERMISSIONS

MaaXBoard's Linux version doesn't allow non-root users to access video devices, like MIPI-CSI or USB cameras. You'll have to grant your user permissions.

Run:

ls -l /dev/video*

This will tell you the ownership and the permissions. You should now be able to run sudo chmod for any cameras you have to grant permission to non-root users:

sudo chmod -R a+rwx /dev/video1
sudo chmod -R a+rwx /dev/video0

Note: This change will only be in effect until the MaaXBoard reboots. To make the change permanent, while you're still logged in as root, edit the /etc/rc.local file to include this line just before "exit 0":

sudo chmod -R a+rwx /dev/video0

Once you're finished, run:

sync
reboot
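
After the reboot, you can confirm that the permissions stuck without opening the camera at all. The following is a minimal sketch, not part of ransomnote.py; the function name camera_access is made up for illustration, and it only assumes that your cameras show up as /dev/video* as in the ls command above.

```python
import glob
import os


def camera_access(pattern="/dev/video*"):
    """Return (device, readable, writable) for each matching device node.

    camera_access is a hypothetical helper name, not part of the project code.
    """
    report = []
    for device in sorted(glob.glob(pattern)):
        # os.access checks the permissions of the user running this script.
        readable = os.access(device, os.R_OK)
        writable = os.access(device, os.W_OK)
        report.append((device, readable, writable))
    return report


if __name__ == "__main__":
    for device, readable, writable in camera_access():
        status = "ok" if readable and writable else "no access"
        print(f"{device}: {status}")
```

Run it as your non-root user. If a camera reports "no access", the chmod line in /etc/rc.local probably didn't run or points at a different device number.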

MOVE THE SOURCE CODE TO YOUR BOARD

Log in to your MaaXBoard as the ebv user that you created in the OpenCV tutorial. Copy the zipped folder with the code and images from your host computer. Don't forget the colon at the end:

scp ransomnote.zip ebv@[IP ADDRESS]:

Enter the password ("ebv" if you set up your user according to the tutorial) when prompted.

Or use wget or curl (may need to install):

wget https://hacksterio.s3.amazonaws.com/uploads/attachments/1130505/ransomnote.zip

Connect as your "ebv" user via remote desktop and open Terminator, the terminal client for XFCE4.

Unzip the files that we just copied.

unzip ransomnote.zip

INSTALL TOOLS

We'll be using the Tesseract library for Optical Character Recognition (OCR). OCR is the conversion of typed, handwritten, or printed text into machine-encoded text. Tesseract is free and open source, and it's very configurable.

Install Tesseract:

sudo apt install tesseract-ocr

Verify your Tesseract version:

tesseract -v

To use Tesseract with Python, we need to install pytesseract. We'll install a couple of other modules that we'll need while we're at it. Make sure you've activated the virtual environment that has OpenCV installed before installing:

workon cv
pip install pytesseract
pip install argparse
pip install pillow
pip install imutils

Test Tesseract on one of the test images:

cd ransomnote
tesseract test1.jpg stdout

You'll see the words that Tesseract recognized printed to the terminal.
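
The same check can be run from Python through pytesseract. This is a minimal sketch rather than the code in ransomnote.py: ocr_image and words_from_text are names I made up for illustration, and it assumes pytesseract and pillow are installed in the cv environment as described above.

```python
import sys


def ocr_image(path, lang="eng"):
    """Run Tesseract on an image file and return the raw text it recognized."""
    # Imported lazily so words_from_text() can be used without the OCR stack.
    from PIL import Image
    import pytesseract

    return pytesseract.image_to_string(Image.open(path), lang=lang)


def words_from_text(text):
    """Split raw OCR output into words, dropping blank lines and extra spaces."""
    return text.split()


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical script name): python ocr_check.py test1.jpg
    for word in words_from_text(ocr_image(sys.argv[1])):
        print(word)
```

Splitting on whitespace is the simplest way to get a word list. It keeps punctuation attached to words, which is fine for a quick sanity check.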

RUN THE APP ON MAAXBOARD

Run the program in your cv virtual environment:

python ransomnote.py

This should start your board's webcam. It will continue to take pictures until you press Ctrl-C to stop it.
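
If you're curious what a "keep taking pictures until Ctrl-C" loop looks like, here is a rough sketch. It is not copied from ransomnote.py: capture_frames, handle_frame and the --camera flag are hypothetical, and camera index 0 is an assumption that depends on which /dev/video device your camera landed on.

```python
import sys


def capture_frames(capture, handle_frame, max_frames=None):
    """Read frames until Ctrl-C, a failed read, or max_frames is reached.

    `capture` is anything with a read() method that returns (ok, frame),
    such as cv2.VideoCapture. Returns the number of frames handled.
    """
    count = 0
    try:
        while max_frames is None or count < max_frames:
            ok, frame = capture.read()
            if not ok:
                break  # camera unplugged or stream ended
            handle_frame(frame)
            count += 1
    except KeyboardInterrupt:
        pass  # Ctrl-C is the normal way to stop
    return count


if __name__ == "__main__" and "--camera" in sys.argv:
    import cv2  # only needed when a real camera is used

    camera = cv2.VideoCapture(0)  # assumed device index
    try:
        total = capture_frames(camera, lambda frame: print(frame.shape))
    finally:
        camera.release()
    print(f"captured {total} frames")
```

Passing the capture object in as an argument means the loop can be tried with a fake camera before pointing it at real hardware.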

HOW IT WORKS

Optical character recognition has been around since 1959. The first application was a banking application developed by Intelligent Machine Corporation that could only read one font in one size. Despite its age, the problem of optical character recognition is still not "solved." According to this article,

Deep learning models find it much more difficult to recognize digits and letters than to recognize much more challenging and elaborate objects such as dogs, cats or humans.

What do we do with this? Is the spoon an "O"?

Recognizing text from a newspaper or magazine is extra difficult because of the highly complex layout, variable font sizes and font types, narrow space between lines, narrow gutters between columns, poor quality ink and paper, and missing text (probably due to previous letters being cut out for ransom notes).

EAST (Efficient and Accurate Scene Text detector)

This script uses EAST to find the regions of the image that contain text. It's more accurate than YOLO or other single-shot or region-based detectors, but it's less computationally intensive than a sliding-window text detector. Another benefit of EAST is that it's available in OpenCV 4.

However, it only finds regions where text is likely to be - it doesn't actually recognize the text, which is why we need Tesseract.
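
To make that concrete, here is a sketch of how EAST is commonly run through OpenCV's dnn module and how its raw output turns into boxes. This is my illustration, not necessarily what ransomnote.py does. The model filename frozen_east_text_detection.pb, the 320x320 input size, the 0.5 threshold and the function names run_east and decode_east are all assumptions. The two layer names are the outputs of the widely used pre-trained EAST TensorFlow model.

```python
import math

# Output layers of the pre-trained EAST model: text scores and box geometry.
EAST_LAYERS = ["feature_fusion/Conv_7/Sigmoid", "feature_fusion/concat_3"]


def run_east(image, model_path="frozen_east_text_detection.pb", size=(320, 320)):
    """Forward a BGR image through EAST and return (scores, geometry).

    model_path is an assumed filename; width and height must be multiples of 32.
    """
    import cv2  # imported lazily so decode_east() works without OpenCV

    net = cv2.dnn.readNet(model_path)
    blob = cv2.dnn.blobFromImage(
        image, 1.0, size, (123.68, 116.78, 103.94), swapRB=True, crop=False
    )
    net.setInput(blob)
    scores, geometry = net.forward(EAST_LAYERS)
    return scores, geometry


def decode_east(scores, geometry, min_confidence=0.5):
    """Turn EAST output into (start_x, start_y, end_x, end_y, score) boxes.

    Coordinates are in the resized input image, not the original photo.
    """
    boxes = []
    rows, cols = len(scores[0][0]), len(scores[0][0][0])
    for y in range(rows):
        for x in range(cols):
            score = float(scores[0][0][y][x])
            if score < min_confidence:
                continue
            # Channels 0-3: distance to top, right, bottom, left edge; 4: angle.
            top, right, bottom, left, angle = (
                float(geometry[0][i][y][x]) for i in range(5)
            )
            cos_a, sin_a = math.cos(angle), math.sin(angle)
            # Each output cell corresponds to a 4x4 pixel patch of the input.
            offset_x, offset_y = x * 4.0, y * 4.0
            end_x = int(offset_x + cos_a * right + sin_a * bottom)
            end_y = int(offset_y - sin_a * right + cos_a * bottom)
            start_x = int(end_x - (right + left))
            start_y = int(end_y - (top + bottom))
            boxes.append((start_x, start_y, end_x, end_y, score))
    return boxes
```

EAST usually fires on many neighboring cells for the same word, so the decoded boxes overlap heavily. A non-maximum suppression step (imutils ships one in imutils.object_detection) is the usual way to thin them out, and the surviving boxes then need to be scaled back to the original image size.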

Tesseract

Tesseract was originally developed by HP between 1985 and 1994. It was open-sourced in 2005. Tesseract is good at recognizing text once it knows where it is. However, I found that it has trouble finding text regions on its own, which is why it needs to be used in conjunction with EAST.
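
One way to combine the two is to crop each detected region out of the image and hand only that crop to Tesseract. The sketch below assumes boxes in (start_x, start_y, end_x, end_y) form and an OpenCV image (a NumPy array). The names clamp_box and read_region and the 2-pixel padding are my own choices for illustration, not something taken from ransomnote.py.

```python
def clamp_box(box, width, height, pad=0):
    """Grow a box by `pad` pixels and clip it to the image bounds."""
    start_x, start_y, end_x, end_y = box
    return (
        max(0, start_x - pad),
        max(0, start_y - pad),
        min(width, end_x + pad),
        min(height, end_y + pad),
    )


def read_region(image, box, pad=2, config=""):
    """Crop one text region from an image array and OCR just that crop."""
    import pytesseract  # imported lazily so clamp_box() works without it

    height, width = image.shape[:2]
    x0, y0, x1, y1 = clamp_box(box, width, height, pad)
    if x1 <= x0 or y1 <= y0:
        return ""  # box lies entirely outside the image
    # NumPy images are indexed rows first, so y comes before x.
    region = image[y0:y1, x0:x1]
    return pytesseract.image_to_string(region, config=config).strip()
```

A little padding helps because detector boxes tend to hug the letters tightly, and Tesseract generally does better with some margin around the text.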

Tesseract v4 uses an LSTM-based engine to recognize words and letters. An LSTM is a type of RNN (recurrent neural network). Basically, recurrent neural networks have loops that allow context from part of an image to be associated with context from the rest of the image. This makes this type of neural network good for text recognition because text recognition involves a lot of context. For instance, characters like "l", "I" and "1" can look exactly alike, depending on the font. We need context about the word they appear in to tell them apart.

Similarly, we need context about what words appeared previously in a sentence in order to guess the word that comes....(you guessed it)

... next!

This is also why you have to select a language as part of the config settings.
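
To get a feel for why context and language matter, here is a deliberately tiny toy. It is NOT how Tesseract's LSTM works internally: it just swaps look-alike characters until the word matches a vocabulary. The CONFUSABLE table, the vocabulary and the function name resolve_with_context are all invented for this illustration.

```python
from itertools import product

# Characters that can look identical in some fonts (illustrative, not complete).
CONFUSABLE = {"l": "lI1", "I": "lI1", "1": "lI1", "O": "O0", "0": "O0"}


def resolve_with_context(raw_word, vocabulary):
    """Pick the first look-alike spelling of raw_word found in vocabulary.

    Falls back to raw_word unchanged when no spelling matches.
    """
    # For each character, list every character it could be mistaken for.
    options = [CONFUSABLE.get(ch, ch) for ch in raw_word]
    for candidate in product(*options):
        word = "".join(candidate)
        if word in vocabulary:
            return word
    return raw_word


if __name__ == "__main__":
    english = {"hello", "Illinois", "100"}
    for raw in ["he11o", "1OO", "I11inois", "xyz"]:
        print(raw, "->", resolve_with_context(raw, english))
```

Swap in a vocabulary from a different language and the same glyphs can resolve to different words, which is the intuition behind having to tell Tesseract which language it is reading.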

Tesseract Config

I encourage you to play with the config to see if you can get better results than I did.

Here are the settings you can play with:

Language (-l). Tesseract has been trained on many languages, from Amharic to Yiddish.

Engine Mode (--oem). Tesseract has several engine modes that differ in performance and speed. In order to run legacy mode, you'll have to download the trained data for whichever language you've chosen.

Page Segmentation Mode (--psm). This affects how Tesseract splits the image into lines of text and words. I found that 11 and 12 worked best for magazines.
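
With pytesseract, these settings are passed as a single config string. Here is a small sketch for experimenting with them. The helper names build_config and ocr_with_config are mine, and the defaults (English, LSTM engine, page segmentation mode 11) are just a starting point based on what worked for me, not the only sensible choice.

```python
def build_config(lang="eng", oem=1, psm=11):
    """Build a Tesseract config string such as '-l eng --oem 1 --psm 11'.

    oem: 0 legacy, 1 LSTM, 2 legacy + LSTM, 3 default (whatever is available).
    psm: 0 to 13; 11 is sparse text, 12 is sparse text with orientation detection.
    """
    if oem not in range(4):
        raise ValueError(f"--oem must be 0-3, got {oem}")
    if psm not in range(14):
        raise ValueError(f"--psm must be 0-13, got {psm}")
    return f"-l {lang} --oem {oem} --psm {psm}"


def ocr_with_config(image, **settings):
    """OCR an image (PIL image or NumPy array) with the chosen settings."""
    import pytesseract  # imported lazily so build_config() works without it

    return pytesseract.image_to_string(image, config=build_config(**settings))


if __name__ == "__main__":
    # Print the config strings you might want to compare on the same page.
    for psm in (11, 12):
        print(build_config(psm=psm))
```

Looping over a few psm values on the same test image and comparing the output side by side is a quick way to see which mode suits your magazine.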

Credits

Monica Houston

82 projects • 469 followers

I don't live on a boat anymore.
