OCR Segmenter Tool Limitations
The following limitations apply to the OCR Segmenter tool:
- Backgrounds with strong textures, or with so much noise that the characters blend into the background, may not be handled well.
- The ROI must not contain strong features other than the line of characters (for example, no other lines of characters, no label edges, and so on). In some cases, this may require precise fixturing. The OCR Segmenter tool can sometimes be made to reject other strong features using the parameters ignoreBorderFragments and/or characterMaxHeight.
- Touching characters can be handled to some degree, but typically require you to adjust some parameters. Fixed-width touching characters can usually be handled by specifying the character width. Proportional fonts with touching characters are problematic: the OCR Segmenter tool may handle some cases correctly, but not all.
- Extraneous scratches or strokes (such as handwritten scribbles that pass through characters on a bank check) may not be handled correctly by the OCR Segmenter tool.
- Currently, no template-based segmenter is available. Such a segmenter may be able to handle touching characters and extraneous strokes, but it would have other limitations.
- Line rotation can be determined, but you may need to specify the rotation yourself for very short lines (for example, three or fewer characters) or for relatively short lines with a lot of line jitter, because the orientation of a short line is inherently uncertain.
- All characters in one line of characters must have the same rotation.
- All characters in one line of characters must have the same skew.
- For well-separated dot-matrix print (that is, where the dots are not touching), you are more likely to need to specify some parameters (such as expected character size and/or minimum inter-character gap) to get a good segmentation.
- Don't-care masks are not supported.
- Nonlinear client coordinates are not supported.
- Image size is limited only by physical memory (for example, images greater than 2 GB are supported on 64-bit systems).
- The stroke width must be greater than or equal to two pixels. For the definition of stroke width, see section Font Stroke Width.
- The minimum character size is 8x8 pixels for large (typically alphanumeric) characters and 2x2 pixels for small characters (such as periods).
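
The numeric limits above (stroke width, minimum character size, and the short-line rotation caveat) can be checked before an image is passed to the tool. The following is a minimal sketch in Python, not part of the OCR Segmenter API: the names LineSpec and check_limits, and all of their fields, are hypothetical and exist only for illustration. The thresholds are taken from the list above.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Thresholds taken from the limitations list above.
MIN_STROKE_WIDTH_PX = 2          # stroke width must be >= 2 pixels
MIN_LARGE_CHAR_PX = (8, 8)       # large (typically alphanumeric) characters
MIN_SMALL_CHAR_PX = (2, 2)       # small characters such as periods
SHORT_LINE_MAX_CHARS = 3         # "three or fewer characters" is a very short line


@dataclass
class LineSpec:
    """Hypothetical description of the line of characters to be segmented."""
    stroke_width_px: float
    large_char_size_px: Tuple[int, int]   # (width, height)
    small_char_size_px: Tuple[int, int]   # (width, height)
    expected_char_count: int
    rotation_specified: bool              # True if the user supplies the rotation


def check_limits(spec: LineSpec) -> List[str]:
    """Return a list of warnings; an empty list means no documented limit is violated."""
    warnings = []
    if spec.stroke_width_px < MIN_STROKE_WIDTH_PX:
        warnings.append("stroke width is below 2 pixels")
    w, h = spec.large_char_size_px
    if w < MIN_LARGE_CHAR_PX[0] or h < MIN_LARGE_CHAR_PX[1]:
        warnings.append("large characters are smaller than 8x8 pixels")
    w, h = spec.small_char_size_px
    if w < MIN_SMALL_CHAR_PX[0] or h < MIN_SMALL_CHAR_PX[1]:
        warnings.append("small characters are smaller than 2x2 pixels")
    if spec.expected_char_count <= SHORT_LINE_MAX_CHARS and not spec.rotation_specified:
        warnings.append("very short line: consider specifying the rotation")
    return warnings
```

A check like this only covers the limits that can be expressed as numbers. Conditions such as textured backgrounds, touching characters in proportional fonts, or extraneous strokes still have to be evaluated on real images.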