[Coco] Rainbow on Disc - OCR

Dean Leiber adit at 1stconnect.com
Sun Jun 12 03:46:00 EDT 2005


>I personally think that including the text in this manner is a higher
>priority than making an all inclusive PDF.  I am not saying we shouldn't
>do an all inclusive text enabled PDF, just that I think that it's lower on
>the list of priorities.

Well, I guess I'll put my 2 cents in here. Since I have older systems
(MacOS 8.6-9.2), it should probably be taken with a grain of salt. I have
all 3 major OCR programs that were available on the Mac (pre-OS X),
and they all leave a lot to be desired. These types of programs have
hopefully gotten better since then, but I found that they are highly
sensitive to point size, font, skew and, obviously, quality of scan. I'm
sure color will throw in a whole new dynamic as well. Some of these
programs may also have a limit on the DPI of the scan/TIFF
(something to keep in mind). Expect to have all kinds of recognition
problems with zero and the letter O, and with i, l, and the number 1.
Interestingly, the letters 'in' together often OCR as 'm'.
The fun is never-ending. Program listings will be very "fun" to proof. I just
thought I'd mention this since everyone seems to think OCRing is going to
be a painless process. It'll probably be a nice thing to have, but
remember that it will take time and effort, most likely more than
scanning/'PDFing', if you include the proofreading.
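
For what it's worth, some of the zero/O and 1/l/i mix-ups could be flagged automatically before a human proofreads. The following is only a rough sketch in Python, not anything taken from an OCR package: the name flag_suspects and the pattern are made up for illustration. It marks any word where a look-alike letter touches a digit, or a 0 or 1 touches a letter. It cannot catch 'in' read as 'm', and it will also flag legitimate names such as A1, so it narrows the proofreading down rather than replacing it.

```python
import re

# Hypothetical pattern: a look-alike letter (O, o, I, i, l) next to a digit,
# or a look-alike digit (0, 1) next to a letter.
SUSPECT = re.compile(
    r"(?<=\d)[OoIil]|[OoIil](?=\d)|(?<=[A-Za-z])[01]|[01](?=[A-Za-z])"
)

def flag_suspects(text):
    """Return (line_number, word) pairs that deserve a second look."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for word in line.split():
            if SUSPECT.search(word):
                hits.append((line_number, word))
    return hits

if __name__ == "__main__":
    sample = "10 PRINT A\n2O GOTO l0\n30 END"
    for line_number, word in flag_suspects(sample):
        print(f"line {line_number}: check {word!r}")
```

Run on the three-line sample, this reports the two damaged words on line 2 ('2O' and 'l0') and leaves the clean lines alone.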
