[okfn-discuss] Historical British Statutes
John Levin
john at technolalia.org
Tue Nov 1 13:24:06 UTC 2016
Dear all,
I have started a wildly ambitious project to get the entirety of English
/ British statutes online, in plain text and under a public domain mark.
http://statutes.org.uk/
https://github.com/Anterotesis/statutes
The idea is to OCR the numerous volumes of statutes digitized by
Google et al, correct the resultant alphanumeric soup into something
more useable, produce versions suitable for both humans and machines,
and organize the texts in such a way that they can be easily discovered,
searched and examined.
And without spending the rest of my life proofing these horrendously
turgid texts.
It is early days, of course, and so far I have simply been OCRing; nearly
100 volumes are up on GitHub, with more to come. The OCR is poor, but
tests with correcting and normalizing software have been encouraging.
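A minimal sketch, in Python, of what a normalizing pass of this kind might
look like. It is not the software referred to above; the function name
normalize_ocr and the handful of rules it applies are assumptions chosen
purely for illustration.

```python
import re

# The long s, common in pre-1800 printing and often preserved by OCR.
LONG_S = "\u017f"


def normalize_ocr(text: str) -> str:
    """Apply a few conservative clean-up rules to raw OCR output.

    Hypothetical helper for illustration; the rules are assumptions,
    not a description of any particular correction tool.
    """
    # Replace the long s with a modern s.
    text = text.replace(LONG_S, "s")
    # Rejoin words hyphenated across a line break: "Parlia-\nment" -> "Parliament".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Collapse runs of spaces and tabs, leaving line breaks alone.
    text = re.sub(r"[ \t]+", " ", text)
    # Drop spaces left dangling at the end of a line.
    text = re.sub(r" +\n", "\n", text)
    return text.strip()


if __name__ == "__main__":
    sample = "An Act of Parlia-\nment  pa\u017f\u017fed  \nin this Se\u017f\u017fion"
    print(normalize_ocr(sample))
```

Note that the hyphen rule would also join genuinely hyphenated words that
happen to fall at a line break, so a real pass would probably need a
dictionary check before rejoining.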
Any ideas, interest, etc, welcomed.
John
--
John Levin
http://www.anterotesis.com
http://twitter.com/anterotesis