[humanities-dev] [open-humanities] Call for Participation: Open Source Indexing

Jonathan Gray jonathan.gray at okfn.org
Wed Apr 3 17:53:12 UTC 2013

This looks very interesting indeed! Wonder if there might be possible
synergies with our Crowdcrafting project (cc'ing Daniel, the main



On 2 April 2013 19:45, Ben Brumfield <benwbrum at gmail.com> wrote:

> (Extracted from http://opensourceindexing.org/, which has more details.
> Worth mentioning is that we're trying to operate under open data
> principles as much as Freedom Zero will allow.)
> The Challenge
> Historic documents often contain handwriting, old fonts, or other text
> formats that OCR software can't handle. We need humans--from
> volunteers to paid staff--to read the document images and transcribe
> what they see into databases which can be searched, analyzed, crawled,
> and used by researchers. Until now those efforts have required
> organizations either to outsource indexing to external partners or to
> cobble together their own off-line or on-site systems.
> Our goal is to build a tool that can be used by libraries, archives,
> museums, historical sites, genealogy and heritage societies to run
> their own indexing projects, under their own control.
> The Invitation
> We'd like to invite libraries, archives, and museums, as well as
> historical, genealogy, and heritage societies, to participate in the project. Right
> now we need advice and examples of indexing projects that real
> organizations would like to run. This would allow us to work with an
> eye on real data outside the UK parish registers and English census
> records which have been driving our development up to the present.
> What we need from you
> Project definitions including:
>   *  Sample image files (around 5 per project in the format you'd use
> for access copies),
>   *  A maximal spec for the data you'd like to collect,
>   *  A minimal set of required fields you need, and
>   *  A description of the material and goals of the project.
> In addition to example indexing project definitions, we need:
>   *  Funding to continue development. Our top priority is building a
> tool for our funders' indexing projects at FreeREG and FreeCEN.
> Building features outside of the needs common to those projects will
> require more funds.
>   *  Code contributions and help with design and programming.
>   *  Publicity and endorsement to spread the word about Open Source
> Indexing.
> The Tool
> We're basing our online indexing tool on Scribe, a tool developed by
> the Citizen Science Alliance from their Old Weather project and
> deployed by the Bodleian Library for What's the score at the Bodleian.
> More recently, Scribe has been customized by New York Public Library
> Labs for their Ensemble database of the performing arts.
> We're augmenting the Scribe transcription system by adding a database
> that allows users to search and view records created by the indexing
> tool. We're also adding support for offline/legacy transcripts
> imported via CSV files. Improvements to the data-entry UI and a system
> for reporting on indexing activity and managing volunteers will round
> out the effort. (See the data flow diagram.)
> The entire system will be released under an Apache license. (In fact,
> the source code under development already is.)
> For more information, contact
> Ben Brumfield
> benwbrum at gmail.com
> http://manuscripttranscription.blogspot.com/
> _______________________________________________
> open-humanities mailing list
> open-humanities at lists.okfn.org
> http://lists.okfn.org/mailman/listinfo/open-humanities
> Unsubscribe: http://lists.okfn.org/mailman/options/open-humanities

Jonathan Gray <http://jonathangray.org/> | @jwyg <http://twitter.com/jwyg>
Director of Policy and Ideas
The Open Knowledge Foundation <http://okfn.org/> |
Support our work: okfn.org/support