[okfn-labs] Best practice for OCR workflows (re OED 1st edition project)

Tom Morris tfmorris at gmail.com
Thu Jul 11 17:48:48 UTC 2013

The Early Modern OCR Project (EMOP) has some interesting writeups of how they
use Tesseract for their work.  OED isn't as old (EMOP focuses on
1475-1800), but some of their notes may be useful in other contexts.


Sadly, one of the tools in their tool chain, Aletheia from
Salford/Manchester, is closed source, non-commercial only (despite being EU
funded research!), but EMOP plans to distribute their tools as open source
(not yet, but soon).


On Mon, Jun 24, 2013 at 11:40 AM, Tom Morris <tfmorris at gmail.com> wrote:

> On Mon, Jun 24, 2013 at 7:36 AM, Rufus Pollock <rufus.pollock at okfn.org> wrote:
>> On 21 June 2013 21:45, Tom Morris <tfmorris at gmail.com> wrote:
>> Is there a way to get the ABBYY version direct from the Archive online
>> or would one need to ask them specially?
> The ABBYY version is one of the formats in the directory.  Look for the
> file whose name ends in _abbyy.gz.  There's also a torrent containing all
> the files if that's easier.
> Tom
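
For anyone scripting this, here is a minimal sketch of picking the ABBYY
file out of a directory listing and decompressing a downloaded copy.  It
assumes the Archive's suffix is spelled _abbyy.gz and that the content is
UTF-8 text; the function names and the sample listing are hypothetical,
made up for illustration.

```python
import gzip

# Assumption: the ABBYY output in an Archive item directory is named
# "<item>_abbyy.gz". Pass a different suffix if your listing differs.
ABBYY_SUFFIX = "_abbyy.gz"


def find_abbyy_files(filenames, suffix=ABBYY_SUFFIX):
    """Return the names from a directory listing that end with the suffix."""
    return [name for name in filenames if name.endswith(suffix)]


def read_abbyy_text(path, encoding="utf-8"):
    """Decompress a local gzip file and return its contents as text."""
    with gzip.open(path, "rt", encoding=encoding) as handle:
        return handle.read()


if __name__ == "__main__":
    # Hypothetical listing; real item directories contain more formats.
    listing = ["item_djvu.txt", "item_abbyy.gz", "item.pdf"]
    print(find_abbyy_files(listing))
```

This only filters names and reads a file already on disk; downloading the
file (or the torrent) is left to whatever tool you prefer.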