[Coco] Rainbow archives in DjVu

Jeff Teunissen deek at d2dc.net
Wed Mar 25 15:30:08 EDT 2009


Bill wrote:
> You are right. There doesn't happen to be a converter from PDF to DjVu does
> there? I've got Gigs of Rainbow Magazines in PDF format that I'd LOVE to
> convert.

I'm writing my own converter to turn the Rainbow PDFs into DjVu, but it only
works on Linux and needs a lot of hand-holding. For example, I'm about halfway
through processing the January 1991 issue right now, and four of its pages
will need manual tweaking to get right.

There are some problems I can do nothing about, and some I might be able to fix.

In the category of "I can't fix this" are most of the later "tabloid" editions
of The Rainbow. The scans of these issues seem to have been processed fairly
heavily (probably to make them readable), and while I can produce serviceable
DjVu files, the text quality isn't very good because of the amount of gray. My
OCR process will work fine on them, and if you switch to "Stencil" mode they
look pretty good, but in normal "Color" mode they're hard to read. I've done
the best I could, but either someone will need to reconstruct the text in a
paint program (a pain in the butt) or we will need to locate new scans.

In the same category are the scans in which only the pages with colored ink on
them were scanned in color, while the rest were done in pure black and white.
Even a black and white magazine page looks better when scanned in color,
because a pure black and white scan loses the page texture, the aging of the
paper, the gray half-toning the printer used for everything that wasn't text,
etc. I CAN still convert these, but color is almost free in DjVu, so there's
very little reason to skimp on it. Anyone have better scans of these? I would
MUCH rather have all pages in color at the highest resolution available.

In the "maybe I can fix this" category are the scans that were made in
horizontal "strips" for faster loading in a PDF viewer. These don't stitch
together very well with my system so far. I'm working on it, but there are
still problems, so I'm not uploading those anywhere yet.
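
For anyone curious what the stitching step involves, here is a minimal Python
sketch. It is not the converter's actual code. It assumes each strip has
already been decoded into a list of pixel rows and that the strips arrive in
top-to-bottom order; the name stitch_strips is made up for illustration. It
only stacks the rows and checks that every strip has the same width, which is
one of the things that can go wrong with strip-based scans.

```python
def stitch_strips(strips):
    """Stack horizontal strips (top to bottom) into one page image.

    Each strip is a list of rows, and each row is a list of pixel values.
    Raises ValueError if there is no pixel data or the widths disagree.
    """
    page = []
    width = None  # set from the first row we see
    for index, strip in enumerate(strips):
        for row in strip:
            if width is None:
                width = len(row)
            elif len(row) != width:
                raise ValueError(
                    f"strip {index} has a row of width {len(row)}, "
                    f"expected {width}"
                )
            page.append(list(row))  # copy so the input strips stay untouched
    if not page:
        raise ValueError("no pixel rows to stitch")
    return page
```

A real page would also need the strips checked for overlapping or missing
rows at the seams, which this sketch does not attempt.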



