Optical Character Recognition - OCR tools

Matrox Imaging Library - Reading a rotated string in the image … Reading vertical strings … Reading dot-formed characters … Using grammar to reduce the number of possibilities … Demystifying Fontless String Reader: “Fontless, but not Any-Font!”

Matrox MIL Optical Character Recognition

Image with rotated string (left) and processed to determine string angle (right).

How to read a vertical string: the string to read, the font with characters rotated by 90°, and the string to read rotated by 90°.

(PresseBox) (München, 08/27/2010)

Character strings are everywhere. We are constantly exposed to text in our books and newspapers and the labels on consumer products. It is not surprising, then, that many machine vision applications require some form of character recognition. Although it may seem quite simple to read characters, it actually requires highly-evolved algorithms that match our own reading abilities. To simplify the work for application developers, Matrox Imaging offers two easy-to-use character recognition tools that serve industry’s needs.
OCR (Optical Character Recognition) tool: Template-based OCR is a fast and robust grayscale tool that is able to read degraded text. It works best with strings of a known size and length as well as constant spacing. It is mostly used in semiconductor and identification applications such as IC and wafer marking.
String Reader tool: Feature-based String Reader is a flexible and robust tool that is able to read strings of unknown lengths and composed of characters from different fonts. It can handle scale and aspect ratio, and also tolerate some skew and rotation. String Reader works as long as each character can be locally segmented from the background and neighboring characters. It is used for reading product SKUs, lot/expiry codes and ANPR (Automatic Number Plate Recognition) applications.
This article offers some tips to improve character recognition performance using Matrox Imaging software.

Reading a rotated string in the image
Reading a rotated string can be a challenge. Not all OCR tools can handle rotation, and those that do can consume a significant amount of processing power when the angle is left unspecified. In this case, implementing a custom string localization algorithm can speed up the reading, or even make the read operation possible.

Here is a practical example of a custom string localization algorithm:
1. Perform morphological operations to merge characters (typically opening or closing operations, depending on the string’s foreground color).
2. Binarize the image.
3. Calculate the areas, diameters and angles (minimum Feret) and the positions (i.e., centers of gravity) of all the resulting blobs.
4. For each blob, calculate the diameter of the Feret perpendicular to the minimum Feret angle.
5. Isolate the blobs that are likely to be strings. A string should satisfy the following criteria:

The custom string localization algorithm returns the approximate positions of the rotated bounding boxes of the blobs that enclose possible strings. The next step is to rotate the original image, using the blob’s center of gravity as the center of rotation. In the rotated image, you can then read the string contained within the ROI derived from the blob’s bounding box. Be careful: the string can be upside down after this process, so a second reading at 180° may be necessary.
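As a minimal sketch of steps 3 and 4 above, the following plain Python function estimates the minimum Feret diameter, the perpendicular Feret diameter, the string angle and the center of gravity of one blob from its pixel coordinates. It is not MIL code. The name `feret_measurements`, the 1° angular step and the brute-force search are assumptions made for illustration, and the string angle is only meaningful if the merged blob is roughly rectangular.

```python
import math

def feret_measurements(points, step_deg=1.0):
    """Brute-force Feret measurements for one blob.

    points: iterable of (x, y) pixel coordinates belonging to the blob.
    Angles are expressed in the coordinate system of the points.
    """
    points = list(points)

    def extent(angle_deg):
        # Width of the blob's projection onto the direction angle_deg.
        c = math.cos(math.radians(angle_deg))
        s = math.sin(math.radians(angle_deg))
        proj = [x * c + y * s for x, y in points]
        return max(proj) - min(proj)

    # Test every direction in [0, 180) and keep the narrowest one.
    steps = int(round(180.0 / step_deg))
    min_angle = min((i * step_deg for i in range(steps)), key=extent)

    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return {
        "min_feret": extent(min_angle),            # roughly the string height
        "perp_feret": extent(min_angle + 90.0),    # roughly the string length
        "string_angle": (min_angle + 90.0) % 180.0,
        "centroid": (cx, cy),                      # center of rotation
    }

# Hypothetical test blob: a 100 x 10 bar rotated by 30 degrees.
c30, s30 = math.cos(math.radians(30)), math.sin(math.radians(30))
blob = [(u * c30 - v * s30, u * s30 + v * c30)
        for u in range(101) for v in range(11)]
m = feret_measurements(blob)
```

The ratio `perp_feret / min_feret` and the blob area are natural candidates for deciding whether a blob looks like a string, but any thresholds would have to be tuned for the application.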

Additional tips:
- If the characters are large enough, the string localization process can be performed in a subsampled image to speed up the process even further.
- If using the OCR tool, the rotation of the image can be specified in the context settings.
- A second read (at 180°) can be avoided by specifying an upside-down font. The resulting string can be manually inverted afterwards if necessary. Be careful, though: some characters are symmetrical and could give confusing results. For example, 909II6 could also be interpreted as 9II606!
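The ambiguity in the last tip can be checked in advance. The sketch below inverts a string as it would appear when turned by 180°: the character order is reversed and each character is replaced by its rotated counterpart. The table `ROTATED_180` is an assumption for illustration only, since the characters that remain readable upside down depend on the font.

```python
# Characters assumed to remain valid when rotated by 180 degrees.
# This table is illustrative; the real set depends on the font in use.
ROTATED_180 = {
    "0": "0", "6": "9", "8": "8", "9": "6",
    "H": "H", "I": "I", "N": "N", "O": "O",
    "S": "S", "X": "X", "Z": "Z", "M": "W", "W": "M",
}

def rotate_180(text):
    """Return text as seen upside down, or None if some character
    has no rotated counterpart in ROTATED_180."""
    try:
        return "".join(ROTATED_180[ch] for ch in reversed(text))
    except KeyError:
        return None

def is_ambiguous(text):
    """True if the upside-down view consists only of valid characters,
    so a single read cannot tell which way up the string is."""
    return rotate_180(text) is not None

flipped = rotate_180("909II6")
```

A string for which `is_ambiguous` returns `True` needs extra information, such as a grammar rule or a known orientation mark, to be read reliably.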

Reading vertical strings
Both Matrox Imaging tools are restricted to horizontal strings. However, it can be quite easy to read a vertical string. Simply rotate the characters in the font by 90° and adjust the characters’ baseline settings. Then perform the read on a version of the original image rotated by 90°.
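The image rotation itself is straightforward. As a minimal sketch, assuming the image is held as a row-major list of lists, a 90° counter-clockwise rotation turns a top-to-bottom string into a left-to-right one. The direction is an assumption made here; the font characters must be rotated the same way as the image.

```python
def rotate_90_ccw(image):
    """Rotate a row-major 2D list by 90 degrees counter-clockwise."""
    # Transposing turns columns into rows; reversing the row order
    # then moves the original top edge to the left edge.
    return [list(row) for row in zip(*image)][::-1]

# A vertical "string" of three cells becomes a single horizontal row.
vertical = [["A"], ["B"], ["C"]]
horizontal = rotate_90_ccw(vertical)
```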

Reading dot-formed characters
Even if reading dot-formed characters looks the same as reading solid-stroke characters to us, it is a completely different story for a computer. Reading tools are usually designed for solid-stroke characters, but dot-formed characters are made up of multiple small disconnected blobs. Furthermore, the printing process can create unpredictable non-linear distortions, so reading dot-formed text usually requires an unconventional approach.
For example, if we try to read the slightly stretched dot-formed E (Figure 3) using the OCR tool, we get a poor reading score because the dots of the model character are interleaved with those of the target character.
Applying simple pre-processing operations to dot-formed characters can turn them into solid-stroke characters (Figure 4). A simple morphological operation (typically opening or closing) merges the dots and effectively reconstructs the characters so that they can be easily read by Matrox Imaging’s tools.
While such pre-processing tricks can make it possible to read dot-formed characters, some limitations may surface which will have an impact on the success of the reading operation. For example, if one or more dots are missing from the character (Figure 5), it is possible that the character won’t be reconstructed completely.
When the characters’ spacing is equal to or smaller than the maximum allowable distance between the dots, the morphology operation can merge some characters together (Figure 6).
Depending on the severity and the quantity of the anomalies, the String Reader tool can handle these situations to a certain extent, as long as most of the string’s characters are reconstructed correctly. In contrast, if its capabilities are respected, the OCR tool remains largely unaffected because it relies on grayscale correlation, making it the right tool to use when these anomalies are expected.
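The dot-merging step, and the character-merging side effect described above, can be reproduced with a small binary closing written in plain Python. This is a sketch under stated assumptions, not the MIL implementation: the foreground is taken to be 1, the structuring element is a square of radius `r`, and the function names are hypothetical.

```python
def dilate(img, r):
    """Binary dilation with a (2r+1) x (2r+1) square.
    Pixels outside the image count as background."""
    h, w = len(img), len(img[0])
    return [[int(any(img[yy][xx]
                     for yy in range(max(0, y - r), min(h, y + r + 1))
                     for xx in range(max(0, x - r), min(w, x + r + 1))))
             for x in range(w)] for y in range(h)]

def erode(img, r):
    """Binary erosion with the same square.
    Pixels outside the image are ignored."""
    h, w = len(img), len(img[0])
    return [[int(all(img[yy][xx]
                     for yy in range(max(0, y - r), min(h, y + r + 1))
                     for xx in range(max(0, x - r), min(w, x + r + 1))))
             for x in range(w)] for y in range(h)]

def closing(img, r):
    """Closing = dilation followed by erosion; fills gaps up to 2*r wide."""
    return erode(dilate(img, r), r)

# One image row: two "characters" made of dots one pixel apart,
# separated from each other by a three-pixel gap.
dots = [[1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1]]
merged = closing(dots, 1)       # dots joined, characters still separate
over_merged = closing(dots, 2)  # the gap between characters is filled too
```

With `r = 1` the one-pixel gaps between dots are filled while the three-pixel gap between the characters survives. With `r = 2` that gap is no longer larger than the merge distance, so the two characters fuse into a single blob.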

Using grammar to reduce the number of possibilities
You can improve the robustness of the read operation by specifying grammar constraints. Specifying grammar rules is a good habit because the constraints considerably reduce the number of combinations to test during the read. They also enable the module to reject false matches more easily, especially for very similar characters like 2 and z, 1 and l, 0 and O, or 8 and B. Here’s an example of how grammar constraints can improve your application’s performance.

A Dutch license plate can have the following combinations of digits (D) and letters (L):
LL-DD-DD or DD-DD-LL or DD-LL-DD or LL-DD-LL or LL-LL-DD or DD-LL-LL or DD-LLL-D or LL-DDD-L or L-DDD-LL

This gives us over 200 million valid six-character strings out of the roughly two billion possible six-character alphanumeric strings, or approximately 10%. The string reading is more robust since the constraints rule out falsely matched characters. Suppose we are trying to read a plate with the following license number: 61-AD-54. In some fonts, the characters I and 1 are almost identical, so it wouldn’t be surprising to read 6I-AD-54. But with the grammar constraints, String Reader rejects this combination and replaces the I with a 1, the most likely character that satisfies the grammar constraint.
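The arithmetic and the correction can be sketched in a few lines of Python. This only illustrates the idea and says nothing about how String Reader works internally. The pattern list assumes every format has six characters, so the last entry is written as L-DDD-LL. The look-alike table and the function names are assumptions.

```python
GRAMMAR = [
    "LL-DD-DD", "DD-DD-LL", "DD-LL-DD",
    "LL-DD-LL", "LL-LL-DD", "DD-LL-LL",
    "DD-LLL-D", "LL-DDD-L", "L-DDD-LL",
]

# Look-alike pairs (font dependent, illustrative only).
CONFUSABLE = {"I": "1", "1": "I", "O": "0", "0": "O",
              "B": "8", "8": "B", "Z": "2", "2": "Z"}

def count_strings(patterns):
    """Number of distinct strings allowed by the grammar:
    26 choices per letter slot and 10 per digit slot."""
    return sum(26 ** p.count("L") * 10 ** p.count("D") for p in patterns)

def fits(ch, kind):
    """Does character ch satisfy the pattern symbol kind?"""
    if kind == "L":
        return "A" <= ch <= "Z"
    if kind == "D":
        return "0" <= ch <= "9"
    return ch == kind  # literal symbol such as "-"

def apply_grammar(text, patterns=GRAMMAR):
    """Return text, corrected with the fewest look-alike swaps so that it
    matches one of the patterns, or None if no pattern can be satisfied."""
    best = None
    for p in patterns:
        if len(p) != len(text):
            continue
        out, cost = [], 0
        for ch, kind in zip(text, p):
            if fits(ch, kind):
                out.append(ch)
            elif fits(CONFUSABLE.get(ch, ""), kind):
                out.append(CONFUSABLE[ch])
                cost += 1
            else:
                break  # this pattern cannot match
        else:
            if best is None or cost < best[0]:
                best = (cost, "".join(out))
    return best[1] if best else None

allowed = count_strings(GRAMMAR)       # strings permitted by the grammar
unconstrained = 36 ** 6                # any six letters or digits
corrected = apply_grammar("6I-AD-54")
```

Here only the DD-LL-DD pattern can accommodate the misread plate, and it does so by swapping the I for a 1.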

Demystifying Fontless String Reader: “Fontless, but not Any-Font!”
MIL 9 Processing Pack 1 introduced a new kind of String Reader context: a fontless context. Even though “fontless” might sound like the context can read strings in any font, that is not exactly the case.

Unlike standard String Reader contexts, the fontless context is not based on a fixed character representation. The fontless context is built on an abstract model of the character that comes from a machine learning process performed with a given set of fonts. In other words, the fontless context has been trained with several fonts. Font training introduces additional reading flexibility; with the fontless context, a read operation can recognize similar characters as opposed to just identical characters with a font-based context. Note that characters which are completely different from those used in the training are unlikely to be read. MIL 9 Processing Pack 1 includes three fontless contexts (two for ANPR applications and one for general purpose machine print reading), and application-specific fontless contexts can be trained by Matrox Imaging as needed.
Even though fontless String Reader gives good results, operations will usually give better general performance when the font is known in advance. If you have valuable information at hand to improve results, you’re always better off using it! But when defining a font is impractical or even impossible, a String Reader fontless context lets us tackle the application. A fontless context can also complement a font-based context, because it covers potential unknown fonts that may show up in the process.

Link to Matrox Imaging Library Overview:
http://www.rauscher.de/...

For more information please contact:

RAUSCHER
Johann-G.Gutenberg-Str. 20
D-82140 Olching

Phone +49 81 42 / 4 48 41-0
Fax +49 81 42 / 4 48 41-90

E-Mail: info@rauscher.de
www.rauscher.de