Python 2.7 - Is there a character or file size limit for tesseract-ocr output?


I am using a Raspberry Pi 3B and Python to do image processing on an image captured with the Raspberry Pi camera. Here is the original image: https://drive.google.com/open?id=0bxm5mqbqj3wvmhb2vmdzddbyaxm

This is the image after rotating and cropping, which is given as input to Tesseract: https://drive.google.com/open?id=0bxm5mqbqj3wvvu5hm2t6afz5rue

After running tesseract-ocr, it recognizes about two-thirds of the image with reasonable accuracy but totally leaves out the last part. Is this due to a file size limitation, or is there some other reason?

This is the text after running Tesseract:

instmrnentntlun collective term measuring instruments used indicating. measuring , recording physical quantities.

the term instrumentation may refer simple direct on reading thermometers or. when using many sensors, may {mvmm complex industrial control system in such manufacturing 1 ry, a". . , transportation. lnstrutnentation can found in househo w .

a smoke detector or heating thermostat examples.

in cases sensor minor element of mechanism. digital cameras , wristwatches might technically meet loose definition of instrumentation because record and/or display sensed information. under circumstances neither called instrumentation, when used measure elapsed time of race , document winner @ finish line, both called instrumentation.

household

a simple example of instrumentation system a

mechanical thermostat, used control household fumace , control room temperature. typical unit senses temperature bi-metallic strip. displays temperature needle on free end of strip. activates furnace mercury switch. switch rotated strip, mercury makes physical (and electrical) contact between electrodes.

another example of instrumentation system home security system. such system consists of sensors (motion detection, switches detect door openings), simple algorithms detect intrusion, local control (arm/disarm) and

remote monitoring of system police can summoned. communication inherent part ofthe design.

automotive

If there were such a limitation, your image is nowhere near it :) I think the remaining text is skewed. Try manually deskewing the remaining text only, and leave the rest of the image untouched. See if there is a different result. – Mihai Ovidiu Drăgoi, Apr 6 at 10:12
    
Thanks for the reply. I found online that Tesseract can manage a skew angle of 10-15 degrees. That's why we never considered it a necessity to de-skew. We'll try it and let you know. – roshini, Apr 6 at 10:23
    
We cropped out that part and de-skewed it, and the output has improved greatly. It was off by 2 degrees. We need to include de-skewing in our pre-processing. Thanks a lot. – roshini, Apr 6 at 10:49
    
Added it as an answer. Glad it helped! – Mihai Ovidiu Drăgoi, Apr 6 at 10:57
Accepted answer (1 vote):

If there were such a limitation, your image is nowhere near it. I think the remaining text is skewed. Try manually deskewing the remaining text only, and leave the rest of the image untouched.

While Tesseract should work with higher skew angles, the fact that the skew varies per paragraph (in your example) might make it leave out the final one.
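Since the skew differs from paragraph to paragraph, it can help to estimate the angle separately for each cropped region before rotating it. The following is only a rough sketch of one common approach, a projection-profile search, and not something taken from the original pre-processing code. It assumes the region has already been thresholded and reduced to a list of (x, y) coordinates of dark pixels. The function names, the search range, and the step size are all illustrative choices, and `make_skewed_lines` is a hypothetical helper that only produces synthetic input for trying it out.

```python
import math


def make_skewed_lines(skew_deg, n_lines=3, line_gap=20, width=200):
    """Hypothetical helper: ink pixels of straight text lines tilted by skew_deg."""
    slope = math.tan(math.radians(skew_deg))
    return [(x, i * line_gap + x * slope)
            for i in range(n_lines) for x in range(width)]


def projection_score(points, angle_deg):
    """Score how well the ink lines up into horizontal rows after undoing angle_deg."""
    theta = math.radians(angle_deg)
    sin_t, cos_t = math.sin(theta), math.cos(theta)
    rows = {}
    for x, y in points:
        # Row index of this pixel once the region is rotated back by angle_deg.
        row = int(round(y * cos_t - x * sin_t))
        rows[row] = rows.get(row, 0) + 1
    # The sum of squared row counts is largest when ink sits in few rows,
    # which is what happens when the text lines are horizontal.
    return sum(count * count for count in rows.values())


def estimate_skew(points, max_angle=5.0, step=0.25):
    """Return the candidate angle (in degrees) that gives the sharpest row profile."""
    n = int(round(max_angle / step))
    candidates = [i * step for i in range(-n, n + 1)]
    return max(candidates, key=lambda a: projection_score(points, a))
```

On the synthetic input, `estimate_skew(make_skewed_lines(2.0))` picks the 2.0 degree candidate. On a real image the sign convention depends on whether y grows upward or downward, so it is worth checking the direction on one sample before rotating each region with an image library and passing the result to Tesseract.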
