Mass processing TIFF images: GIMP scripts

Project: Linux software
Component: Documentation
Category: support request
Priority: normal
Assigned: Unassigned
Status: closed
Related pages: #10: OCR - optical character recognition, #59: Image manipulation libraries
Description

I need to scan a whole book.
I'll scan each page, but afterwards I need to process each page in GIMP (crop, rotate, flatten, etc.).

Is there a way to script this so that it can be done for 200 files?

Does GIMP support scripting?

Comments

#1

wiki.

#2

If I find the proper settings for the scan, I may not need any pre-processing after all!
I'll check.

#3

See the following issue for why image optimization is important:
#41: Image optimization before OCR

#4

'Scanning' with a Digital Camera
http://lists.alioth.debian.org/pipermail/sane-devel/2009-December/025634...

As I delve deeper into the Linux scanner mess, the digital camera approach looks more and more tempting...

If I can automate the preparation of the images for OCR, then the source of the pictures does not matter...

#5

#6

2colorthresh --- Automatically thresholds an image to binary (b/w) format using an adaptive spatial subdivision color reduction technique.

autowhite --- Automatically adjusts the white balance of an image.

bcimage --- Changes the brightness, contrast and/or saturation of an image.

exposure --- Changes the exposure level of an image.

fuzzythresh --- Automatically thresholds an image to binary (b/w) format using the fuzzy c-means technique.

isodatathresh --- Automatically thresholds an image to binary (b/w) format using the isodata technique.

isonoise --- Reduces isolated noise in an image.

localthresh --- Automatically thresholds an image to binary (b/w) format using a moving window adaptive thresholding approach.

otsuthresh --- Automatically thresholds an image to binary (b/w) format using Otsu's between class variance technique.

redist --- Modifies an image so that its (grayscale) histogram has either a Gaussian distribution or a uniform distribution.

sahoothresh --- Automatically thresholds an image to binary (b/w) format using Sahoo's entropy technique.

textcleaner --- Processes a scanned document of text to clean the text background.

trianglethresh --- Automatically thresholds an image to binary (b/w) format using the triangle technique.
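
To make the thresholding entries above more concrete, here is a minimal sketch of the general idea behind Otsu's between-class variance technique. It is not the code of the otsuthresh script itself; the function names are made up for illustration, and the input is assumed to be a flat list of 8-bit grayscale values.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes the between-class variance.

    Pixels with value <= t are treated as one class, the rest as the other.
    Assumes every value in `pixels` is an integer in range(levels).
    """
    # Build the grayscale histogram.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_low = 0     # number of pixels at or below the candidate threshold
    sum_low = 0   # sum of their intensities
    for t in range(levels):
        w_low += hist[t]
        sum_low += t * hist[t]
        if w_low == 0:
            continue          # no pixels in the lower class yet
        w_high = total - w_low
        if w_high == 0:
            break             # all pixels are in the lower class
        mean_low = sum_low / w_low
        mean_high = (sum_all - sum_low) / w_high
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w_low * w_high * (mean_low - mean_high) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


def binarize(pixels, threshold):
    """Map each pixel to white (255) if above the threshold, else black (0)."""
    return [255 if p > threshold else 0 for p in pixels]
```

For a scanned page, `pixels` would be the flattened grayscale image; how it is loaded is left out here because it depends on the image library in use.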

#7

I'm using Scan10005-cropped.jpg to test.

#8

First things first: how do I programmatically convert a .jpg into a .tif?
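
One possible approach, sketched on the assumption that ImageMagick's `convert` command is installed and on the PATH. The function names and the `scans` directory mentioned below are hypothetical, and this is not necessarily what the wiki scripts do.

```python
import subprocess
from pathlib import Path


def build_convert_commands(src_dir, pattern="*.jpg"):
    """Return one ImageMagick `convert` command per matching file in src_dir."""
    commands = []
    for src in sorted(Path(src_dir).glob(pattern)):
        dst = src.with_suffix(".tif")
        # `convert INPUT OUTPUT` chooses the output format from the extension.
        commands.append(["convert", str(src), str(dst)])
    return commands


def convert_all(src_dir):
    """Run every conversion in turn; raises on the first command that fails."""
    for cmd in build_convert_commands(src_dir):
        subprocess.run(cmd, check=True)
```

Calling `convert_all("scans")` would then write a .tif next to each .jpg in that directory. Building the command list separately from running it makes it easy to print and check the commands before touching 200 files.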

#9

Related pages: +59: Image manipulation libraries

wiki.

#10

Easy OCR with ImageMagick and Tesseract-OCR
http://ubuntuforums.org/showthread.php?t=1370827

#11

Howto: Use OCR to convert PDFs to text
http://ubuntuforums.org/showthread.php?t=882899

#12

Status: active » fixed

Added scripts to wiki.

#13

Status: fixed » closed
Related pages: -10: OCR - optical character recognition, -59: Image manipulation libraries

Automatically closed -- issue fixed for 2 weeks with no activity.