Python – Make tesseract recognise numbers only

ocr, python, tesseract

I am trying to refine an OCR program I wrote to read the layout of a particular image. Right now, I would like it to recognise only the digits 0-9.

I tried to follow the solution from the question:

Limit characters tesseract is looking for

But I got stuck at the part where I have to call tesseract as:

tesseract input.tif output nobatch letters  

Where does this command go?

Best Answer

I had the same issue using Python with Tesseract 3, and I assume future readers may be doing the same.

From here: https://github.com/tesseract-ocr/tesseract/wiki/FAQ#how-do-i-recognize-only-digits

And here: https://github.com/madmaze/pytesseract/blob/27fed535bf1eb665ec991313841b177336b50f61/src/pytesseract.py#L91

I succeeded using:

pytesseract.image_to_string(someimage, config='outputbase digits')
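
For anyone wondering where that call fits in a script, here is a minimal sketch rather than a definitive recipe. It assumes pytesseract and the Tesseract binary are installed. The names DIGITS_CONFIG, WHITELIST_CONFIG, keep_digits and read_digits are made up for illustration. The whitelist variant is an assumption on my part for setups where the digits config file is not available, and it may behave differently depending on the Tesseract version.

```python
import re

# Config string from the answer above: asks Tesseract to load its "digits" config file.
DIGITS_CONFIG = "outputbase digits"

# Assumed alternative: set the character whitelist variable directly.
WHITELIST_CONFIG = "-c tessedit_char_whitelist=0123456789"


def keep_digits(text):
    """Post-filter: drop every character that is not 0-9 (spaces and newlines included)."""
    return re.sub(r"[^0-9]", "", text)


def read_digits(image, config=DIGITS_CONFIG):
    """Run OCR on an image (PIL image or file path) and return only the digits found."""
    # Imported here so keep_digits can be used without Tesseract installed.
    import pytesseract

    raw = pytesseract.image_to_string(image, config=config)
    return keep_digits(raw)
```

With Pillow installed, it could be used as read_digits(Image.open("input.tif")). The keep_digits step is only a safety net in case a stray character slips through despite the config.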
