[open-humanities] First steps with OpenPhilosophy.org

Lars Aronsson lars at aronsson.se
Wed Dec 21 15:47:03 UTC 2011


On 12/21/2011 02:15 PM, Jonathan Gray wrote:
> does Wikisource
> now support being able to have image / transcription in parallel?

Yes. The step from plain e-text (as in Project Gutenberg) to scanned
images with parallel transcription and proofreading is what I
introduced in Wikisource in 2005, instead of expanding Project
Runeberg beyond the Scandinavian languages. I described this here,
http://meta.wikimedia.org/wiki/User:LA2/Digitizing_books_with_MediaWiki

This is now the standard mode of operation in Wikisource.
It is handled by the ProofreadPage extension to the MediaWiki
software (which I didn't write). It uses colour coding, where red
means raw OCR text that has not yet been proofread, yellow means
proofread once and green means validated by a second user.
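
As a rough illustration of that colour coding, here is a minimal
Python sketch of the three states and how one could tally them. The
names PageStatus, advance and summarize are hypothetical and are not
part of ProofreadPage; the sketch only mirrors the red/yellow/green
scheme described above.

```python
from collections import Counter
from enum import Enum


class PageStatus(Enum):
    """The three colours described above (names are illustrative)."""
    NOT_PROOFREAD = "red"     # raw OCR text
    PROOFREAD = "yellow"      # proofread once
    VALIDATED = "green"       # validated by a second user


def advance(status):
    """Return the status after one more check of the page.

    This sketch does not verify that the second check comes from a
    different user; it only models the order of the states.
    """
    if status is PageStatus.NOT_PROOFREAD:
        return PageStatus.PROOFREAD
    return PageStatus.VALIDATED


def summarize(statuses):
    """Return (proofread + validated, validated only) page counts."""
    counts = Counter(statuses)
    validated = counts[PageStatus.VALIDATED]
    return counts[PageStatus.PROOFREAD] + validated, validated


pages = [
    PageStatus.NOT_PROOFREAD,
    PageStatus.PROOFREAD,
    PageStatus.PROOFREAD,
    PageStatus.VALIDATED,
]
print(summarize(pages))
```

Counting "proofread+validated" and "validated only" separately is
the same distinction the statistics below make.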

The French Wikisource contains 344,211 proofread+validated
pages and the English comes in second at 193,289 pages.
If you only count validated pages, however, the German Wikisource
is leading with 108,469 pages. Here are some statistics,
http://toolserver.org/~phe/statistics.php

The first book with scanned images that I put up, in October
2005, was a five-volume encyclopedia, and not all of it has been
proofread yet. You can try it out by clicking "edit" here,
http://en.wikisource.org/wiki/Page:LA2-NSRW-3-0552.jpg

Every scanned book page has such a wiki page in the "Page"
namespace, and for each book there is also an overview or
"Index" page, e.g.
http://en.wikisource.org/wiki/Index:The_New_Student%27s_Reference_Work
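
To make the two namespaces concrete, here is a small Python sketch
that builds URLs of the same shape as the two examples above. The
function name wiki_url and the BASE constant are hypothetical; the
only conventions assumed are the "Namespace:Title" prefix, underscores
in place of spaces, and percent-encoding of the apostrophe, all of
which are visible in the URLs quoted in this message.

```python
from urllib.parse import quote

# Base URL taken from the examples above.
BASE = "http://en.wikisource.org/wiki/"


def wiki_url(namespace, title):
    """Build a URL for a page in the given namespace.

    Spaces become underscores, and characters such as the apostrophe
    are percent-encoded; the colon after the namespace is kept as-is.
    """
    name = f"{namespace}:{title}".replace(" ", "_")
    return BASE + quote(name, safe=":")


# One "Page" entry per scanned image, one "Index" entry per book.
print(wiki_url("Page", "LA2-NSRW-3-0552.jpg"))
print(wiki_url("Index", "The New Student's Reference Work"))
```

The scan file name and the book title are the ones from the links
above; any other title would be handled the same way.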

These namespaces make it a bit more complicated than I would
have wished for. I tried to look at Scripto, but the only demo
I could find was the Papers of the War Department. Are there
other sites that use Scripto?


-- 
   Lars Aronsson (lars at aronsson.se)
   Aronsson Datateknik - http://aronsson.se

   Project Runeberg - free Nordic literature - http://runeberg.org/





