Mass processing TIFF images: GIMP scripts
Related pages: #10: OCR - optical character recognition; #59: Image manipulation libraries
I need to scan a whole book.
I'll scan each page, but afterwards I need to process each page in GIMP (crop, rotate, flatten, etc.).
Is there a way to script this so that it can be done for 200 files?
Does GIMP accept scripting?
Maybe if I find the proper settings for the scan, I won't need any post-processing after all!
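The batch step asked about above could be sketched roughly as follows. This is only a minimal sketch, not a GIMP script: it assumes ImageMagick's `convert` command is installed and on the PATH (newer ImageMagick releases call it `magick`), and the function names, folder layout and crop geometry are hypothetical placeholders.

```python
import subprocess
from pathlib import Path


def build_page_command(src, dst, crop=None, rotate=None):
    """Return the ImageMagick argument list for one scanned page.

    crop is an ImageMagick geometry string such as "1200x1800+50+40";
    rotate is an angle in degrees. Both are optional.
    """
    cmd = ["convert", str(src)]
    if rotate is not None:
        cmd += ["-rotate", str(rotate)]
    if crop is not None:
        # +repage resets the virtual canvas after cropping.
        cmd += ["-crop", crop, "+repage"]
    # -flatten merges any layers into a single image.
    cmd += ["-flatten", str(dst)]
    return cmd


def process_folder(in_dir, out_dir, crop=None, rotate=None, dry_run=True):
    """Build (and optionally run) one command per .tif file in in_dir."""
    in_dir, out_dir = Path(in_dir), Path(out_dir)
    commands = []
    for src in sorted(in_dir.glob("*.tif")):
        cmd = build_page_command(src, out_dir / src.name, crop, rotate)
        commands.append(cmd)
        if not dry_run:
            out_dir.mkdir(parents=True, exist_ok=True)
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical folders; with dry_run=True the commands are only printed.
    for command in process_folder("scans", "processed", crop="1200x1800+50+40", rotate=90):
        print(" ".join(command))
```

Running it with `dry_run=True` first makes it possible to check the generated commands on a few pages before committing to all 200 files.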
See the following issue for why image optimization matters:
#41: Image optimization before OCR
'Scanning' with a Digital Camera
As I delve much deeper into the Linux scanner mess, the digital camera approach looks more and more tempting...
If I can automate the preparation of the images for OCR, then it does not matter what the source of the pictures is...
2colorthresh --- Automatically thresholds an image to binary (b/w) format using an adaptive spatial subdivision color reduction technique.
autowhite --- Automatically adjusts the white balance of an image.
bcimage --- Changes the brightness, contrast and/or saturation of an image.
exposure --- Changes the exposure level of an image.
fuzzythresh --- Automatically thresholds an image to binary (b/w) format using the fuzzy c-means technique.
isodatathresh --- Automatically thresholds an image to binary (b/w) format using the isodata technique.
isonoise --- Reduces isolated noise in an image.
localthresh --- Automatically thresholds an image to binary (b/w) format using a moving window adaptive thresholding approach.
otsuthresh --- Automatically thresholds an image to binary (b/w) format using Otsu's between class variance technique.
redist --- Modifies an image so that its (grayscale) histogram has either a Gaussian or a uniform distribution.
sahoothresh --- Automatically thresholds an image to binary (b/w) format using Sahoo's entropy technique.
textcleaner --- Processes a scanned document of text to clean the text background.
trianglethresh --- Automatically thresholds an image to binary (b/w) format using the triangle technique.
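To make concrete what the thresholding scripts in this list do, here is a minimal pure-Python sketch of Otsu's between-class variance technique, the idea named in the otsuthresh entry. It is an illustration of the general method only, not the code of that script; the function names are made up for this example and it assumes 8-bit grayscale pixel values.

```python
def histogram_of(pixels, levels=256):
    """Count how many pixels have each gray level (0 .. levels-1)."""
    counts = [0] * levels
    for value in pixels:
        counts[value] += 1
    return counts


def otsu_threshold(histogram):
    """Return the gray level that maximizes the between-class variance.

    Pixels at or below the returned level form one class, pixels above
    it form the other.
    """
    total = sum(histogram)
    if total == 0:
        raise ValueError("empty histogram")
    sum_all = sum(level * count for level, count in enumerate(histogram))

    weight_bg = 0      # number of pixels at or below the current level
    sum_bg = 0.0       # sum of gray values of those pixels
    best_level, best_variance = 0, -1.0

    for level, count in enumerate(histogram):
        weight_bg += count
        if weight_bg == 0:
            continue               # no pixels in the lower class yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break                  # all pixels are in the lower class
        sum_bg += level * count
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        # Between-class variance, up to a constant factor of 1/total**2.
        variance = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if variance > best_variance:
            best_variance, best_level = variance, level
    return best_level


def binarize(pixels, threshold):
    """Map each pixel to 0 (at or below threshold) or 1 (above it)."""
    return [1 if value > threshold else 0 for value in pixels]


if __name__ == "__main__":
    # Toy "page": dark text pixels around 10-12, light paper around 200-210.
    page = [10, 12, 11, 200, 205, 210, 10, 208]
    level = otsu_threshold(histogram_of(page))
    print(level, binarize(page, level))
```

On a real scan the pixel list would come from an image library; the flat list here just keeps the example self-contained.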
I'm using Scan10005-cropped.jpg to test.
First things first: how do I programmatically convert a .jpg into a .tif?
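One possible answer, sketched under the assumption that ImageMagick's `convert` command is available (it picks the output format from the file extension). The helper names below are hypothetical, and the `run` parameter exists only so the command can be inspected without actually executing it.

```python
import subprocess
from pathlib import Path


def tiff_name(jpg_path):
    """Return the same path with its extension replaced by .tif."""
    return Path(jpg_path).with_suffix(".tif")


def jpg_to_tif(jpg_path, run=subprocess.run):
    """Convert one JPEG to TIFF by calling ImageMagick's convert.

    Assumes `convert` is on the PATH; raises CalledProcessError if the
    command fails (because of check=True).
    """
    dst = tiff_name(jpg_path)
    run(["convert", str(jpg_path), str(dst)], check=True)
    return dst


if __name__ == "__main__":
    # The test file mentioned above; the output lands next to it.
    print(jpg_to_tif("Scan10005-cropped.jpg"))
```

Looping `jpg_to_tif` over `Path(".").glob("*.jpg")` would extend this to a whole folder of scans.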
Easy OCR with ImageMagick and Tesseract-OCR
Howto: Use OCR to convert PDFs to text
Added scripts to wiki.
Automatically closed -- issue fixed for 2 weeks with no activity.