[open-humanities] [okfn-labs] Best practice for OCR workflows (re OED 1st edition project)

Tim McNamara paperless at timmcnamara.co.nz
Sat Aug 24 11:11:01 UTC 2013


I guess it all depends on making it easy to use. Lots of libraries (
http://rose-holley.blogspot.co.nz/2013/04/crowdsourcing-text-correction-and.html)
have been extremely successful at letting volunteers correct OCR mistakes.

The National Library of Australia's Trove (http://trove.nla.gov.au/newspaper)
has received over 100,000 corrections today! The NLA seems to be using ABBYY
FineReader as well (http://www.nla.gov.au/content/ocr-overview).

Veridian (http://veridiansoftware.com/) appears to be the tool of choice.


On 24 August 2013 22:28, Jonathan Gray <jonathan.gray at okfn.org> wrote:

> Regarding plans for an open version of the 1st edition of the OED, I
> thought some of you might be interested in this piece from Cory Doctorow
> yesterday:
>
>
> http://www.theguardian.com/technology/2013/aug/23/oxford-english-dictionary-future-digitally
>
> What do we need to move forward with an Open OED project [1]? It would be
> really cool if there were any way to break the dictionary down into entries
> that people could help to proofread and correct. Any thoughts on that
> front? Anyone else interested in helping?
>
> If there were any simple tasks that people could do, Adam said he could
> help publicise this to readers of the Public Domain Review.
>
> [1] https://github.com/okfn/oed
>
>
> On 24 June 2013 17:40, Tom Morris <tfmorris at gmail.com> wrote:
>
>>
>> On Mon, Jun 24, 2013 at 7:36 AM, Rufus Pollock <rufus.pollock at okfn.org> wrote:
>>
>>> On 21 June 2013 21:45, Tom Morris <tfmorris at gmail.com> wrote:
>>>
>>> Is there a way to get the ABBYY version direct from the Archive online,
>>> or would one need to ask them specially?
>>>
>>
>> The ABBYY version is one of the formats in the directory. Look for the
>> file that ends in _abbyy.gz. There's also a torrent containing all the
>> files if that's easier.
>>
>> Tom
>>
>>
>> _______________________________________________
>> okfn-labs mailing list
>> okfn-labs at lists.okfn.org
>> http://lists.okfn.org/mailman/listinfo/okfn-labs
>> Unsubscribe: http://lists.okfn.org/mailman/options/okfn-labs
>>
>>
>
>
> --
>
> Jonathan Gray
>
> Director of Policy and Ideas  |  @jwyg <https://twitter.com/jwyg>
>
> The Open Knowledge Foundation <http://okfn.org/>
>
> Empowering through Open Knowledge
>
> okfn.org  |  @okfn <http://twitter.com/OKFN>  |  OKF on Facebook <https://www.facebook.com/OKFNetwork>  |
> Blog <http://blog.okfn.org/>  |  Newsletter <http://okfn.org/about/newsletter>