[ckan-dev] Looking for a recent JSON dump from the Data Hub

Paul Miller paul.miller at cloudofdata.com
Tue Jul 31 12:21:37 UTC 2012


Good afternoon

I've been doing some work with the JSON dump at http://thedatahub.org/dump/. However, this is over a year old, and I'm now trying to get hold of a more current data set.

Mark Wainwright at the Open Knowledge Foundation suggested that I ask here. So… does anyone have a more current JSON dump, or know of an easy way for me to get myself one?

Many thanks

Paul

For those who want some background, or who can see a far better way to do what I'm trying to do, here are the details.

I'm looking at the occurrence of different licenses in the Data Hub data. Using the CKAN API, it's easy to see the list of permissible licenses: http://thedatahub.org/api/1/rest/licenses

It's then straightforward to step through the JSON dump, counting the occurrences of each value of license_id. In the year-old data set I've got, there are just over 2,000 records (half the number currently available in the Data Hub), and about a third of those have license values (mostly 'null', but also others like 'apache' and 'gpl-2.0' '3.0') that aren't part of the set of values reported by http://thedatahub.org/api/1/rest/licenses. It's therefore useful (to me) to be able to visually skim the file for these odd values, rather than simply querying the API for known terms…
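
The counting step described above might look roughly like the following Python sketch. It is only an illustration under stated assumptions: that the dump is either a single JSON array of dataset records or one JSON object per line, that each record carries a license_id key, and that the set of known license ids has already been collected from the licenses URL. The function names and the file name in the usage comment are hypothetical.

```python
import json
from collections import Counter


def load_records(path):
    """Load dataset records from a dump file.

    Assumption: the file is either one JSON array of records or
    one JSON object per line. Adjust if the real dump differs.
    """
    with open(path, encoding="utf-8") as fh:
        text = fh.read().strip()
    if text.startswith("["):
        return json.loads(text)
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def count_license_ids(records):
    """Count each license_id value; a missing or null value is counted as None."""
    return Counter(record.get("license_id") for record in records)


def unexpected_license_ids(counts, known_ids):
    """Keep only the values that are not in the known set of license ids."""
    return {lid: n for lid, n in counts.items() if lid not in known_ids}


# Hypothetical usage (file name and known ids are placeholders):
# records = load_records("datahub-dump.json")
# counts = count_license_ids(records)
# for lid, n in counts.most_common():
#     print(n, lid)
# print(unexpected_license_ids(counts, {"cc-by", "odc-pddl"}))
```

Printing the full count first keeps the odd values visible for skimming; the second function only narrows the list to values outside the known set.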

Any help gratefully received.


Dr Paul Miller
Cloud of Data
cloudofdata.com/contact


