[okfn-discuss] Historical British Statutes

John Levin john at technolalia.org
Tue Nov 1 13:24:06 UTC 2016


Dear all,

I have started a wildly ambitious project to get the entirety of English 
/ British statutes online, in plain text and under a public domain mark.

http://statutes.org.uk/
https://github.com/Anterotesis/statutes

The idea is to OCR the numerous volumes of statutes digitized by 
Google et al, correct the resultant alphanumeric soup into something 
more useable, produce versions suitable for both humans and machines, 
and organize the texts in such a way that they can be easily discovered, 
searched and examined.

And without spending the rest of my life proofing these horrendously 
turgid texts.

It is of course early days, and so far I have simply been OCRing; nearly 
100 volumes are up on GitHub, with more to come. The OCR is poor, of 
course, but tests with correcting and normalizing software have been 
encouraging.

Any ideas, interest, etc, welcomed.

John

-- 
John Levin
http://www.anterotesis.com
http://twitter.com/anterotesis

