[okfn-labs] Best practice for OCR workflows (re OED 1st edition project)
tfmorris at gmail.com
Thu Jul 11 17:48:48 UTC 2013
The Early Modern OCR Project (EMOP) has some interesting writeups of how they
use Tesseract for their work. OED isn't as old (EMOP focuses on
1475-1800), but some of their notes may be useful in other contexts.
Sadly, one of the tools in their tool chain, Aletheia from
Salford/Manchester, is closed source and non-commercial only (despite being
EU-funded research!), but EMOP plans to distribute their tools as open source
(not yet, but soon).
On Mon, Jun 24, 2013 at 11:40 AM, Tom Morris <tfmorris at gmail.com> wrote:
> On Mon, Jun 24, 2013 at 7:36 AM, Rufus Pollock <rufus.pollock at okfn.org> wrote:
>> On 21 June 2013 21:45, Tom Morris <tfmorris at gmail.com> wrote:
>> Is there a way to get the ABBYY version directly from the Archive online,
>> or would one need to ask them specially?
> The ABBYY version is one of the formats in the directory. Look for the
> file that ends in _abbyy.gz. There's also a torrent containing all the files
> if that's easier.