[ckan-discuss] Publishing to CKAN the Southampton Way

Wed Mar 30 18:12:16 BST 2011

The API worked OK. It was a bit annoying not to be able to use the
standard URI for the license, but it was easily mapped to your ID.
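
For anyone wanting to repeat this, the license mapping step might look
roughly like the sketch below. It is not the Grinder code: the function
names, the URI-to-ID table, the example.org host and the /rest/package
path are all assumptions for illustration, so check them against the
CKAN instance you are targeting. It only builds the request; nothing is
sent.

```python
import json
import urllib.request

# Assumed mapping from standard license URIs to CKAN license IDs.
# Illustrative only; check the license list of the target CKAN instance.
LICENSE_URI_TO_CKAN_ID = {
    "http://creativecommons.org/publicdomain/zero/1.0/": "cc-zero",
    "http://reference.data.gov.uk/id/open-government-licence": "uk-ogl",
}


def to_ckan_package(name, title, license_uri):
    """Build a minimal package dict, swapping the license URI for a CKAN ID."""
    if license_uri not in LICENSE_URI_TO_CKAN_ID:
        raise ValueError("no CKAN license ID known for " + license_uri)
    return {
        "name": name,
        "title": title,
        "license_id": LICENSE_URI_TO_CKAN_ID[license_uri],
    }


def build_request(api_base, api_key, package):
    """Prepare (but do not send) a POST of the package JSON.

    The '/rest/package' path and the Authorization header are assumptions
    about the target API, not something confirmed in this thread.
    """
    return urllib.request.Request(
        api_base.rstrip("/") + "/rest/package",
        data=json.dumps(package).encode("utf-8"),
        headers={"Authorization": api_key, "Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    pkg = to_ckan_package(
        "places", "Places", "http://creativecommons.org/publicdomain/zero/1.0/"
    )
    req = build_request("http://ckan.example.org/api", "MY-API-KEY", pkg)
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Keeping the mapping in one table means an unmapped license fails loudly
rather than being published with the wrong ID.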

I think the technique is eminently repeatable, but a pull would make it
simpler.

What we have found is that our initial dataset entries may require an
overhaul. Some things are needlessly divided up, which is annoying
because we could get mileage out of the number of datasets.

I'm also considering what recommendations we should make to JISC/JANET
about a .ac.uk catalogue, for which a pull model may make sense.

On 30/03/11 14:48, Rufus Pollock wrote:
> On 10 March 2011 15:16, Christopher Gutteridge<cjg at ecs.soton.ac.uk>  wrote:
>> Rufus has asked me to drop into the list to share what we've done to
>> integrate data.southampton.ac.uk into CKAN.
> I think this is really exciting Chris and thanks for posting to the list.
>
>> Our datasets have URI of the form
>> http://id.southampton.ac.uk/dataset/places
>> and HTML pages
>> http://data.southampton.ac.uk/dataset/places.html
>> They also have the now-traditional range of linked data API options,
>> which aren't quite working yet for me.
>> http://data.southampton.ac.uk/dataset/places.rdf
>> http://data.southampton.ac.uk/dataset/places.xml
>> you know the kind of thing.
>>
>> They also have a couple of unusual export views
>> http://data.southampton.ac.uk/dataset/places.ckan.json
>> http://data.southampton.ac.uk/dataset/places.boilerplate.rdf
>>
>> The first of these provides the JSON to inject into the CKAN API to update
>> or create the dataset, the second generates the triples which get appended
>> to the dataset each time it's published. (License etc.) The scripts which do
>> this are actually entirely different from the ones which do places.html and
>> there's .htaccess jiggerypokery, but the result seems more elegant to my
>> sysprog eyes.
> Understood -- the point is you are always appending some standard set
> of attributes (e.g. your standard license) when exporting to CKAN.
>
>> And of course; our scripts are available to all:
>> https://github.com/cgutteridge/Grinder/tree/master/bin
> Thanks a lot. Do you have any feedback (positive or negative) on the
> API based on your experience using it?
>
> I'm also interested because it seems one could generalize this
> approach for other people publishing data -- another approach is a
> pull model in which CKAN harvests data from your end (given a
> harvesting endpoint ...)
>
> Rufus

-- 
Christopher Gutteridge -- http://id.ecs.soton.ac.uk/person/1248

/ Lead Developer, EPrints Project, http://eprints.org/
/ Web Projects Manager, ECS, University of Southampton, http://www.ecs.soton.ac.uk/
/ Webmaster, Web Science Trust, http://www.webscience.org/