OCR with tesseract: garbage output

Tue, 04/13/2010 - 08:37 - augustin

4-¤ cv ·—• g O Eq ."*§•**0""¤»·D. O ’“
¤¤;I¤; 4-» °¤cnE Eb D ¤ <¤5b¤.>.
$-4 ·¤·~¢W • O as
Oo 0 _o. _·O 0 ····-· _¤>*~=··»2°=$§»"’·~¤¤=&~~~¤2
C: :1 on s-4
E;] m..,§">mSE¤*—`§_'§.2u—'§.g§E~SE,-¤§§8 m=`?·*`°g
¤. ¤¤> =¤= own-· ¤.2.2E_¤ ¤¤=·*"=¤»¤>¤~·E¤¤-¤

What does it take for OCR to work?

Comments

#1

augustin - 04/13/2010 - 08:55

In French, why I need it:
http://3enjeux.overshoot.tv/billet/48

#2

augustin - 04/13/2010 - 09:06

Oh! I found out why!

The TIFF image was scanned in the wrong orientation. I had to crop it, rotate it, and flatten it, following the instructions given in the Ubuntu wiki:

The process to prepare them with GIMP is very simple:
Go to the Image→Mode menu and make sure the image is in RGB or Grayscale mode.
Select from the menu Tools→Color Tools→Threshold and choose an adequate threshold value.
Select from the menu Image→Mode→Indexed and from the options choose 1-bit and no dithering.
Save the image in TIFF format.
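
For anyone who wants to see what the rotate and threshold steps actually do to the pixels, here is a minimal sketch in plain Python. It is not what GIMP or tesseract run internally. The helper names (`rotate_180`, `threshold`, `prepare_for_ocr`), the cutoff of 128, and the assumption that "the wrong way around" means upside down (180 degrees) are all mine, chosen for illustration. The image is modelled as rows of grayscale values from 0 to 255.

```python
# Hypothetical helpers for illustration only: an image is a list of rows,
# each row a list of grayscale values in the range 0..255.

def rotate_180(pixels):
    """Return the image turned upside down (row order and column order reversed)."""
    return [row[::-1] for row in pixels[::-1]]


def threshold(pixels, cutoff=128):
    """Reduce to 1 bit per pixel: 1 (white) if value >= cutoff, else 0 (black)."""
    return [[1 if value >= cutoff else 0 for value in row] for row in pixels]


def prepare_for_ocr(pixels, cutoff=128, upside_down=False):
    """Optionally rotate the scan by 180 degrees, then apply the threshold."""
    if upside_down:
        pixels = rotate_180(pixels)
    return threshold(pixels, cutoff)


if __name__ == "__main__":
    # A tiny made-up "scan": light paper with two dark ink pixels.
    scan = [
        [250, 240, 30],
        [245, 20, 235],
    ]
    for row in prepare_for_ocr(scan, cutoff=128, upside_down=True):
        print(row)
```

On a real scan the cutoff has to be picked by eye, which is what "choose an adequate threshold value" amounts to in the GIMP dialog.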

Complete this site's documentation.

#3

augustin - 05/31/2010 - 12:31

Status:	active	» fixed

#4

robot - 06/14/2010 - 14:50

Status:	fixed	» closed
Related pages:	#10: OCR - optical character recognition

Automatically closed -- issue fixed for 2 weeks with no activity.

Project:	Linux software
Component:	Miscellaneous
Category:	support request
Priority:	normal
Assigned:	Unassigned
Status:	closed
Related pages:	#10: OCR - optical character recognition
