[okfn-discuss] British Library data JSON wrangling
John Levin
john at technolalia.org
Wed Nov 9 11:14:24 UTC 2016
Works like a charm! Thank you *very* much!
Strongly suggest you add the csv file to data.bl.uk - it means people
can get a decent overview of what's contained within, and do simple
searches for material.
Thanks once again,
John
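
For anyone who cannot reach the gist, here is a minimal sketch of the kind of flattening described in the quoted message below. It is not Ben's script. It assumes the JSON is either a list of records or a dict whose values are records, and that nested fields can be named with dotted keys; the function names and that layout are illustrative guesses, not taken from the BL file.

```python
import csv
import json


def flatten(value, prefix=""):
    """Flatten nested dicts/lists into one dict with dotted keys."""
    flat = {}
    if isinstance(value, dict):
        for key, child in value.items():
            flat.update(flatten(child, f"{prefix}{key}."))
    elif isinstance(value, list):
        if all(not isinstance(item, (dict, list)) for item in value):
            # A list of plain values becomes a single "; "-joined cell.
            flat[prefix[:-1]] = "; ".join(str(item) for item in value)
        else:
            # A list containing nested structures gets numbered keys.
            for index, child in enumerate(value):
                flat.update(flatten(child, f"{prefix}{index}."))
    else:
        flat[prefix[:-1]] = "" if value is None else value
    return flat


def records_to_csv(records, out):
    """Write records to the open text file `out`; return the column names."""
    rows = [flatten(record) for record in records]
    # Union of all keys, in order of first appearance.
    fieldnames = list(dict.fromkeys(key for row in rows for key in row))
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return fieldnames


def json_file_to_csv(json_path, csv_path):
    """Load a JSON file and write a flattened UTF-8 CSV next to it."""
    with open(json_path, encoding="utf-8") as src:
        data = json.load(src)
    # Assumption: the top level is a list of records or a dict of them.
    records = list(data.values()) if isinstance(data, dict) else data
    with open(csv_path, "w", encoding="utf-8", newline="") as dst:
        return records_to_csv(records, dst)
```

Calling json_file_to_csv("books.json", "books.csv") (both file names are placeholders) would write one row per record. Records that lack a field get an empty cell rather than raising an error.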
On 08/11/2016 11:05, Ben O'Steen wrote:
> Sorry about that! The main request that I was trying to satisfy is the
> demand for "everything". https://data.bl.uk is meant for all the items
> that we cannot deliver through other online means or by shipping a
> harddrive.
>
> The JSON structure links together a number of services that have no easy
> process to gain machine-readable connections between them, mainly
> Flickr, the BL catalog and the access portal system that you can (should
> be able to) download the PDFs from.
>
> I have loaded the JSON file into OpenRefine (and, incidentally, it can
> be opened with Python's json module, which was also used to create
> it.)
>
> If you have Python installed, this script will create a flattened UTF-8
> encoded CSV file from the JSON, with most fields included:
>
> https://gist.github.com/benosteen/7dd20109bbdf7716218ba73279c70a3c
>
> I can add the resultant CSV to the item record if that would be useful?
>
>
> Ben
>
>
> On 8 November 2016 at 10:00, Ian Ibbotson <ian.ibbotson at k-int.com> wrote:
>
> I don't know that this will help, but I think those resources are
> also loaded into Jisc historical texts
> at https://historicaltexts.jisc.ac.uk -- for example
> https://historicaltexts.jisc.ac.uk/results?terms=A%20Gossip%20about%20Old%20Manchester.%20With%20illustrations
> I think that there is an elasticsearch index underpinning the
> collections in JHT -- You don't say what you would like to extract
> from the data, but someone at JHT might be able to help? Might be
> worth dropping a line to the JHT enquiries address, YMMV tho.
>
> best,
> Ian.
>
> Ian Ibbotson
> Director
> Knowledge Integration Ltd
> 35 Paradise Street, Sheffield. S3 8PZ
> T: 0114 273 8271
> M: 07968 794 630
> W: http://www.k-int.com
> Doodle: http://doodle.com/ianibbo
>
> On 8 November 2016 at 08:41, John Levin <john at technolalia.org> wrote:
>
> Dear list,
>
> The British Library has just launched
> https://data.bl.uk/
> with data sets including some 50,000 digitized books from 1510
> to 1946.
>
> Infuriatingly, there isn't a simple manifest of these books.
> There is an enormous (50 MB) JSON file
> https://data.bl.uk/digbks/db21.html
> which I've been trying to wrangle with little success.
>
> What's the best way of getting information out of this blob? Any
> help for a JSON newbie?
>
> TIA
>
> John
>
--
John Levin
http://www.anterotesis.com
http://twitter.com/anterotesis