[okfn-help] Re: Fwd: [okfn-discuss] Project updates: Open Shakespeare

Rufus Pollock rufus.pollock at okfn.org
Mon Apr 2 10:54:30 BST 2007

Moving this to okfn-help as this is not general interest but project 

Oldak Quill wrote:
> On 21/03/07, Rufus Pollock <rufus.pollock at okfn.org> wrote:
>> I've only been able to find the first two volumes via Google -- do you
>> know where I can find vols 3-8? I should also say that we are only
>> interested in 10 pages or so of that particular large work, so it might
>> be worth just doing it ourselves (or is there a way to suggest just a
>> small section to the distributed proofreading project?). We also need
>> to do the OCRing, as the text on wikisource is just the scan (from a
>> look around pgdp.net it seems you need to scan it before it gets to
>> them).
> http://www.pgdp.net/phpBB2/viewtopic.php?t=3915 - this is the forum
> thread concerning PGDP's Encyclopaedia Britannica "Uberproject".

Thanks for the link. The delay in my response is due to the time it took 
to sign up and confirm my account!

> I do know that the 8th volume (Dubois to Dyeing) is in the first
> proofreading stage. This means that the first 7 volumes have completed
> this stage, but not necessarily that they have been finally released.
> Texts which have only partially been proofread are still accessible
> and of a usable standard (you might need to do a little formatting).

Shakespeare is in vol. 24 :( (full details on 

I note, looking at the first entry in the pgdp thread, that there are 
some specialised sections (e.g. Condensation of Gases: 
http://www.pgdp.net/phpBB2/viewtopic.php?t=20120) which have been broken 
out of the main flow, so perhaps Shakespeare could fit in as one of them?

> If the texts you are looking for are in the first eight volumes, I can
> help you to find the exact articles by looking through project pages
> and whatnot. If it isn't in the first eight, the best way forward
> would be just to proofread the pages you need yourself. I'd also
> happily help with this.

That would be great. The first thing we need to do is OCR the TIFFs from 
wikisource. Do you have suggestions for open source OCR tools?

