[ckan-discuss] Fwd: Re: Getting Environment Agency Bathing Water Quality info register in LOD cloud

Thu Mar 1 15:21:26 GMT 2012

As per Rufus' request...

-------- Original Message --------
Subject: 	Re: Getting Environment Agency Bathing Water Quality info register in 
LOD cloud
Date: 	Thu, 1 Mar 2012 15:14:07 +0000
From: 	Rufus Pollock <rufus.pollock at okfn.org>
Reply-To: 	rufus.pollock at okfn.org
To: 	Stuart Williams <skw at epimorphics.com>
CC: 	Sean Hammond <sean.hammond at okfn.org>, support at ckan.org, Alex Coley 
<alex.coley at environment-agency.gov.uk>

Just a comment: would it be possible to move this discussion to
ckan-discuss. support at ckan.org is usually reserved for technical
support for the CKAN software :-) (Plus this is an interesting thread
that others would benefit from seeing).

If you are happy to move it there would you mind resending your
original email and then this follow up ...

Rufus

On 1 March 2012 15:11, Stuart Williams <skw at epimorphics.com> wrote:
>  On 01/03/2012 14:33, Sean Hammond wrote:
>>
>>  Hi Stuart,
>>
>>>  The guidance at:
>>>
>>>  http://www4.wiwiss.fu-berlin.de/lodcloud/ckan/validator/validate.php
>>>
>>>  leads rapidly to a need to register the dataset(s) at
>>>  http://thedatahub.org/ so I've started that process, but have
>>>  probably 'botched' things as the site is a little opaque. I've
>>>  created a user ID (skwlilac) and so far made an attempt to register
>>>  the first of all of these data sets
>>>  (/dataset/environment-data-gov-uk-bathing-water-quality). There is
>>>  a common SPARQL endpoint for all of them which I could expose,
>>>  though there are link-following URIs that reach out to both source
>>>  CSVs as well as static dumpfiles generated for each subset in the
>>>  aggregate.
>>>
>>>  I can find no way to delete the record that I've started and start
>>>  again! I'd be grateful if someone could look at what I've done so
>>>  far and let me know whether or not I'm headed in the right
>>>  direction. If I have really 'botched' it, please could you delete
>>>  the record I've created and advise on how I should go forward.
>>
>>  I think you're headed in the right direction. You've created your
>>  dataset and added the first resource link to it, specifying the
>>  application/rdf+xml format. Unfortunately the link you've entered for
>>  the resource is to the HTML page and not to the RDF+XML file; perhaps
>>  you could edit the resource to link directly to the RDF?
>>
>>  http://environment.data.gov.uk/data/bathing-water-quality.xml
>
>  OK... I could make it point to the .rdf (the .xml is a linked-data-api
>  format .xml, not itself RDF).
>
>  The undecorated URI content-negotiates... if you have application/rdf+xml in
>  the request header AND don't include text/html then rdf/xml is what you
>  should get.
>
>  We give an active bias in favour of HTML if it is one of the accept options
>  - some WebKit-based browsers accept multiple types and have screwed-up 'q'
>  values.
>
>  But a trailing .rdf will force application/rdf+xml (though it is not quite
>  right for naming the dataset in the abstract - I'd much rather it be the
>  undecorated URI that propagates into the world as the primary identifier).
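
The negotiation rule Stuart describes above could be modelled roughly as
follows. This is only a sketch of the behaviour as stated in the message,
not the server's actual code: the function names are hypothetical, the
undecorated URI is assumed to be Sean's link without the .xml suffix, and
the fallback when neither type is offered is a guess.

```python
from urllib.request import Request

RDF_XML = "application/rdf+xml"
HTML = "text/html"

# Assumed undecorated dataset URI (Sean's link minus the ".xml" suffix).
DATASET_URI = "http://environment.data.gov.uk/data/bathing-water-quality"


def accepted_types(accept_header):
    """Return the bare media types in an Accept header, ignoring q-values."""
    return {
        part.split(";")[0].strip().lower()
        for part in accept_header.split(",")
        if part.strip()
    }


def negotiate(uri, accept_header):
    """Pick a response type following the rule described in the message."""
    # A trailing .rdf forces RDF/XML whatever the Accept header says.
    if uri.endswith(".rdf"):
        return RDF_XML
    types = accepted_types(accept_header)
    # HTML wins whenever it is offered at all, because q-values sent by
    # some browsers cannot be trusted.
    if HTML in types:
        return HTML
    if RDF_XML in types:
        return RDF_XML
    # Assumption: the message does not say what happens otherwise.
    return HTML


def rdf_request(uri):
    """Build (without sending) a request that asks only for RDF/XML."""
    return Request(uri, headers={"Accept": RDF_XML})
```

A client that wants RDF from the undecorated URI would therefore send an
Accept header like the one built by rdf_request(DATASET_URI), taking care
not to list text/html alongside it.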
>
>
>>  It looks like your data is also available in csv, json, xml, text and
>>  ttl formats. You could add further resource links to your dataset for
>>  each of those files.
>
>
>  These are all served out of a triple store - the data at the dataset entry
>  point (i.e. the URI above) is mostly VoID about it, with subset references
>  to its substructure.
>
>
>>  Finally, your environment.data.gov.uk page also has a link to a SPARQL
>>  endpoint which you could also add to your thedatahub.org dataset as a
>>  resource link (specify the format as "api/sparql").
>
>  The SPARQL endpoint is an aggregate endpoint for a composite triplestore
>  with all the datasets, reference data and vocabularies in a default graph.
>
>  I/we really don't want folks trying to dump the dataset via the obvious
>  SPARQL query - I/we'd much rather they followed the VoID to the
>  N-Triples/Turtle dumps and let the filesystem take the load rather than a
>  query engine. But as it is exposed anyway, we may as well give a pointer -
>  if it becomes a DoS problem for us we can moderate its use.
>
>  Many thanks
>
>  Stuart
>  --
>  Epimorphics Ltd                        www.epimorphics.com
>  Court Lodge, 105 High Street, Portishead, Bristol BS20 6PT
>  Tel: 01275 399069
>
>  Epimorphics Ltd. is a limited company registered in England (number 7016688)
>  Registered address: Court Lodge, 105 High Street, Portishead, Bristol BS20
>  6PT, UK
>

-- 
Co-Founder, Open Knowledge Foundation
Promoting Open Knowledge in a Digital Age
http://www.okfn.org/ - http://blog.okfn.org/