[Coco] Re: [Color Computer] Requesting .pdf help...
Gene Heskett
gene.heskett at verizon.net
Mon Jul 18 12:37:23 EDT 2005
On Monday 18 July 2005 11:19, Michael Wayne Harwood wrote:
>I have written scripts that utilize imagemagick to eliminate
> yellowing and page clutter by doing the following:
>
>1. I create a second copy of the scanned image as a monochrome image
> with the black-threshold set to reduce as much "noise" as possible
> while retaining most of the "good" data of the original scan.
>2. I use the monochrome image as a mask against the original image
> by making the black areas transparent and overlaying the mask on
> top of the original scan. This cleans up a lot of the "noise" in
> the image (paper grain, yellowing, speckling etc).
You might investigate a function called 'coring', where small changes
are removed by substituting a known good value for the pixels with the
dust spots etc. That way, if a color only changes 5-10% in
intensity, the change is removed, and that can reduce the size of the
compressed image. Some programs like the gimp or imagemagick may call
it despeckle too; the functions are very similar. The gimp gives you a
speckle size control so it pulls the small stuff only.
Your skid marks etc on the test2.pdf cover are big enough that you'd
need to grab a sample of the right color and brush over them though.
Tedious work, and it leads to carpal tunnel syndrome here, so I don't
do a lot of that.
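
For what it's worth, here is a rough sketch of the coring idea in plain
Python, working on a grid of grayscale values. The function name, the
255 "paper" value and the 10% tolerance are all my own assumptions for
illustration; this is not how the gimp or imagemagick do it internally.

```python
def core_pixels(pixels, background, tolerance=0.10, max_value=255):
    """Snap pixels that sit within `tolerance` (a fraction of full
    scale) of `background` to exactly `background`."""
    limit = tolerance * max_value
    return [
        [background if abs(p - background) <= limit else p for p in row]
        for row in pixels
    ]


# Hypothetical 3x4 grayscale patch: mostly off-white paper, a few dark
# ink pixels (12, 20, 8) that should survive untouched.
page = [
    [250, 243, 255, 12],
    [238, 251, 20, 8],
    [255, 231, 249, 247],
]
cleaned = core_pixels(page, background=255, tolerance=0.10)
```

Everything within 10% of the paper value ends up as one flat value,
which is the sort of thing a compressor handles well, while the dark
pixels are left alone.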
>3. I clean up the edges, reduce the colors in the image to 256,
> reduce the bit depth to 8, and create an LZW compressed .tif as a
> final result.
>
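
Steps 1 and 2 above could be sketched roughly like this in plain
Python on grayscale values. This is only my reading of the
description, not Michael's actual script: the function names, the
threshold of 128 and the flat paper value of 255 are assumptions, and
step 3 (color reduction and the LZW .tif) is left out.

```python
def ink_mask(pixels, black_threshold=128):
    """Step 1: monochrome copy. True marks pixels dark enough to be ink."""
    return [[p < black_threshold for p in row] for row in pixels]


def overlay_mask(pixels, mask, paper=255):
    """Step 2: let the original show through where the mask is black
    (ink) and cover everything else with a flat paper value."""
    return [
        [p if is_ink else paper for p, is_ink in zip(row, mask_row)]
        for row, mask_row in zip(pixels, mask)
    ]


# Hypothetical 2x3 scan: yellowed paper (200-240) around dark ink.
scan = [
    [240, 90, 30],
    [200, 228, 60],
]
mask = ink_mask(scan, black_threshold=128)
clean = overlay_mask(scan, mask)
```

The ink keeps its original shading while the yellowed background
collapses to a single value.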
>An image processed with the above "clean" script looks very nice as
> a .tif, but pretty grainy with tons of artifacts when compressed to
> an acceptable size using Adobe's Acrobat v6. I know the .pdf
> standard allows for a lot more options than are presented in the
> Acrobat GUI. Perhaps converting the .tif files to postscript and
> processing them using ghostscript would be a better option, but
> this will not help with the OCR aspect. I can use ghostscript's
> ps2ascii.ps filter to pull hidden text (including glyph location on
> the images) but I have no idea how to import the info back in.
Neither do I, but it might be suitable for scripting as that could
take a lot of the tedium out of it. I'm sure there is a way to
re-import the result after it's been checked for accuracy.
>Regards,
>Michael Harwood
>
>-----Original Message-----
>From: coco-bounces at maltedmedia.com
> [mailto:coco-bounces at maltedmedia.com] On Behalf Of James
> Diffendaffer
>Sent: Monday, July 18, 2005 8:56 AM
>To: ColorComputer at yahoogroups.com
>Subject: [Coco] Re: [Color Computer] Requesting .pdf help...
>
>The reason you are having to shrink the image down is because your
> scan has large areas of color that look the same to the human eye
> but are composed of thousands of pixels differing in color.
>
>Since djvu approximates the image (lossy) it isn't a problem.
>PDF is trying to create vectors to store all of them (non-lossy).
>If Acrobat has some sort of option that eliminates this you'll have
> better luck.
>
>If you want to stick with the large file but get better compression
> you could pass the images through some sort of image processing
> that eliminates the isolated pixels. It would probably make both
> programs compress better. Smoothing filters do that. It's how they
> remove moles and freckles from cover models so they appear to have
> perfect skin. But it causes indiscriminate fuzzing of the image
> and those photos are edited by hand... not automated. I'm not sure
> what would be a good alternative.
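
One candidate for removing isolated pixels is a small median filter.
To be clear, that is a technique I am swapping in here as an
illustration, not something named in this thread. A minimal sketch in
plain Python, assuming a grayscale grid and leaving the border pixels
untouched (the function name is hypothetical):

```python
from statistics import median


def median3x3(pixels):
    """Replace each interior pixel with the median of its 3x3
    neighbourhood; border pixels are copied through unchanged."""
    height, width = len(pixels), len(pixels[0])
    out = [row[:] for row in pixels]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = [
                pixels[y + dy][x + dx]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            out[y][x] = int(median(window))
    return out


# Hypothetical 5x5 patch of flat color with one isolated dark speck.
patch = [
    [200, 200, 200, 200, 200],
    [200, 200, 200, 200, 200],
    [200, 200, 40, 200, 200],
    [200, 200, 200, 200, 200],
    [200, 200, 200, 200, 200],
]
smoothed = median3x3(patch)
```

A lone speck never wins the vote in its 3x3 window, so it disappears
without averaging its value into the neighbours. Fine detail such as
thin strokes can still suffer, so it would need checking on real
scans.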
>
>The large scale document archival systems I've worked with all
> expect B&W source images (order forms, membership applications,
> etc...) so I'm not sure what would work well for color.
>
>Brought to you by the 6809, the 6803 and their cousins!
--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
99.35% setiathome rank, not too shabby for a WV hillbilly
Yahoo.com and AOL/TW attorneys please note, additions to the above
message by Gene Heskett are:
Copyright 2005 by Maurice Eugene Heskett, all rights reserved.