[Coco] Re: Rainbow on Disc - OCR

Michael Wayne Harwood michael at musicheadproductions.org
Fri Jun 10 11:25:48 EDT 2005


John,

You make some excellent points!  Would you be willing to lead the charge
in investigating and organizing what would be required to move forward
with this?  I think that before we start scanning magazines en masse we
should look into the minimal requirements needed for a successful OCR project.


Regards,
Michael Harwood


> Actually, I don't think the OCR is a risk.
>
> Think of it this way: we have a bunch of scanned pages, I filter them to
> black and white, and run OCR on them. This is a batch operation so it
> doesn't take anyone much time.
>
> Proofreading work can be done by people who don't even have a scanner,
> so we have the possibility of bringing in many more volunteers. That
> means we're even more scalable than the scanning work. So we'd probably
> be done with OCR at about the same time as scanning work in general is
> done, so no work would be delayed by it.
>
> I really think it should be brought into scope, considering the clear
> utility of such a resource (grepable Rainbow, cool...) and the fact that
> it's not a hard thing to do (done it before on Thinking Forth).
>
> Just raw ASCII text. No doing a repub or anything seriously hard like
> that.
>
> There are probably a few of us who could take on this aspect of the
> project if you want to split the production work between OCR and
> scanning.
>
> -- John.
>




