Is it possible to limit the set of characters that tesseract is looking for (e.g. search only for letters a-z)? That would improve my results greatly.
Limit characters tesseract is looking for
ocr tesseract
Best Answer
Create a config file (e.g. "letters") in the tessdata/configs directory, usually
/usr/share/tesseract/tessdata/configs
or
/usr/share/tesseract-ocr/tessdata/configs
And add this line to the config file:
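A minimal sketch of such a config file, assuming the line sets Tesseract's `tessedit_char_whitelist` variable to the allowed characters. The local `./configs` directory is a hypothetical stand-in so the snippet runs without root access; the real file belongs in one of the tessdata/configs directories listed above.

```shell
# Assumption: the config line sets tessedit_char_whitelist to the
# characters tesseract may output (here: lowercase a-z).
# ./configs is a placeholder; copy the file into tessdata/configs afterwards.
config_dir="./configs"
mkdir -p "$config_dir"

# One line: variable name, a space, then every allowed character.
printf 'tessedit_char_whitelist abcdefghijklmnopqrstuvwxyz\n' > "$config_dir/letters"

# Show what was written.
cat "$config_dir/letters"
```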
A range such as [a-z] might also work, but I have not tested that. Then call tesseract similar to this:
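A hedged sketch of the call, assuming the config file is named "letters" as above. `input.tif` and `output` are hypothetical names for the image and the output base name. Config file names are passed as trailing arguments after the output base.

```shell
# Hypothetical names: input.tif is the image to read, output is the base
# name of the result file (tesseract appends .txt), and letters is the
# config file placed in tessdata/configs.
if command -v tesseract >/dev/null 2>&1 && [ -f input.tif ]; then
  tesseract input.tif output letters
else
  # Nothing to run without the binary and an input image.
  echo "tesseract or input.tif not found; skipping" >&2
fi
```

In Tesseract versions that support the `-c` option, the same variable can also be set directly on the command line, e.g. `-c tessedit_char_whitelist=abcdefghijklmnopqrstuvwxyz`, without a config file.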
That limits tesseract to recognizing only the wanted characters.