[Coco] Re: [Color Computer] Requesting .pdf help...

Michael Wayne Harwood michael at musicheadproductions.org
Mon Jul 18 08:51:03 EDT 2005


test2.pdf is a multipage .pdf file with 10 test pages.  To get this
quality I had to resize the image to 100ppi (782x1082 in this case) before
I published to .pdf.  A source image at 100ppi does not go through most OCR
engines very well.
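
As a rough illustration of why 100ppi is hard on OCR, here is a small
Python sketch that estimates how many pixels tall a line of type is at a
given resolution.  The 10-point body text is my assumption, not a measured
value from the magazine, and the function name is just for illustration.

```python
POINTS_PER_INCH = 72  # typographic points per inch


def glyph_height_px(point_size, ppi):
    """Approximate pixel height of type set at point_size, scanned at ppi."""
    return point_size * ppi / POINTS_PER_INCH


# Assumed 10-point body text (hypothetical; measure the real page to be sure).
for ppi in (100, 300):
    print(f"{ppi} ppi: {glyph_height_px(10, ppi):.1f} px per line of type")
```

Under that assumption the type is about 14 pixels tall at 100ppi versus
about 42 pixels at 300ppi, and the actual letter bodies are smaller still.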

Let me clarify my intentions a bit...   A lot has been said about the pros
and cons of the .djvu format vs the .pdf format, and I am not trying to
re-open this debate.  However, I am finding that I consistently get
higher quality with the .djvu format, along with extras like searchable
text, while I am not able to figure out how to get the same level of
quality or features in the .pdf format.

What I would like is for someone to assist me in finding a way to pass a
2550x3300 image through an OCR engine with the end product being a .pdf
with searchable text (not perfect, but useful) that ends up with an
average size of 160kb or less per page.  All of this needs to be done with
either open source or freeware tools rather than a $400 publishing or OCR
package.
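
To put the size target in perspective, the following Python sketch works
out how many bits per pixel a 160kb page allows at each image size.  It
assumes "160kb" means 160 kilobytes of 1024 bytes and that the source scan
is 24-bit color; both are assumptions, and the function names are only
illustrative.

```python
def bits_per_pixel(width_px, height_px, budget_bytes):
    """Average bits available per pixel if the page must fit in budget_bytes."""
    return budget_bytes * 8 / (width_px * height_px)


def compression_ratio(width_px, height_px, budget_bytes, bytes_per_pixel=3):
    """Ratio of raw image size (assumed 24-bit color) to the byte budget."""
    return width_px * height_px * bytes_per_pixel / budget_bytes


BUDGET = 160 * 1024  # assumption: 160kb = 160 * 1024 bytes

full = bits_per_pixel(2550, 3300, BUDGET)   # full-size scan
small = bits_per_pixel(782, 1082, BUDGET)   # the 100ppi resize
ratio = compression_ratio(2550, 3300, BUDGET)

print(f"2550x3300: {full:.2f} bits/pixel, about {ratio:.0f}:1 vs raw color")
print(f"782x1082:  {small:.2f} bits/pixel")
```

Under those assumptions the full-size page gets roughly 0.16 bits per
pixel (about 154:1 against raw color), while the 100ppi page gets about
1.5 bits per pixel.  That gap is the reason a plain full-color image at
2550x3300 will not fit the budget without a more aggressive encoding.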

If I am not able to figure out how to make .pdf jump through the same
hoops I can get .djvu to jump through, I am going to publish exclusively to
.djvu rather than have two formats that vary in quality and features.

Let me reiterate that I am NOT trying to re-open the .djvu vs .pdf debate
- please focus your replies on advice and practical examples of how to
accomplish these goals in the .pdf format.

Regards,
Michael Harwood



> A cover page is not a very good test page.  I mean that's a great .pdf,
> I blew it up to 800% before the effects of the DCT could be seen on
> screen here, but it's also 1.5 megs because it's the cover of the rag
> and all image, in full color, but there is nothing there that would
> make OCR'ing it worthwhile.  To see what it will do, take a random
> page from inside the magazine that's >60% plain text.  Page 2 of a
> multipage article would be a good test page.
>



