[ckan-discuss] RDF for data catalogues (was: Re: Universal distributed open government data catalog?)

Martín Álvarez Espinar martin.alvarez at fundacionctic.org
Mon Feb 22 09:56:01 GMT 2010


Hello,

We have thought about representing the list of all available public 
catalogues in RDF. With voiD and SCOVO in mind, we tried to represent 
Catalogues, which are composed of Datasets. These catalogues are 
described using FOAF and Dublin Core metadata and are available from our 
SPARQL endpoint [1].

This is an example of a proposed catalogue represented using this 
vocabulary [2]:

:data.gov a cat:Catalog ;
    dcterms:identifier "data.gov" ;
    foaf:homepage <http://www.data.gov/catalog> ;
    rdfs:label "US Federal Government Catalog" ;
    dcterms:description "The purpose of Data.gov is...." ;
    dcterms:language "en" ;
    dcterms:issued "2009-05-21"^^xsd:date ;
    dcterms:license <http://www.data.gov/datapolicy> ;
    dcterms:spatial <http://sws.geonames.org/6252001/> .

Every catalogue is enriched with Linked Data. In this example, the 
"dcterms:spatial" property links to the area covered by data.gov 
(the USA, http://sws.geonames.org/6252001/).
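As a rough illustration of how these links could be retrieved from the SPARQL endpoint [1], here is a minimal Python sketch. It only builds the request URL, using the standard "query" parameter of the SPARQL protocol. The query itself is an assumption: it asks for anything that has both "dcterms:spatial" and "rdfs:label", since the namespace URI behind the "cat:" prefix is not given in this message. The helper names build_query_url and query_from_url are made up for the example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Endpoint [1] from this message.
ENDPOINT = "http://data.fundacionctic.org/sparql"

# Assumed query: every resource that has a spatial link and a label.
QUERY = """\
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?catalogue ?label ?place
WHERE {
  ?catalogue dcterms:spatial ?place ;
             rdfs:label ?label .
}
"""


def build_query_url(endpoint: str, query: str) -> str:
    """Return a GET URL carrying the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


def query_from_url(url: str) -> str:
    """Recover the SPARQL text from a URL, to check the encoding."""
    return parse_qs(urlsplit(url).query)["query"][0]


url = build_query_url(ENDPOINT, QUERY)
print(url)
```

The resulting URL can be fetched with any HTTP client. Which result formats the endpoint returns is not stated here, so the sketch leaves the request itself out.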

Using this information from Geonames, we can build some representations 
of these catalogues, such as simple listings [3] or maps [4].
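A simple listing of this kind could be sketched as follows, grouping catalogues by the Geonames resource they point to. This is only an illustration under assumptions, not the code behind [3] or [4]: the dictionary layout and the function names geonames_id and group_by_place are hypothetical, and the only record used is the data.gov example from this message.

```python
from collections import defaultdict


def geonames_id(uri: str) -> str:
    """Return the last path segment of a Geonames URI, e.g. '6252001'."""
    return uri.rstrip("/").rsplit("/", 1)[-1]


def group_by_place(catalogues):
    """Map each Geonames id to the labels of the catalogues covering it."""
    groups = defaultdict(list)
    for catalogue in catalogues:
        groups[geonames_id(catalogue["spatial"])].append(catalogue["label"])
    return dict(groups)


# Assumed record layout; the values come from the data.gov example.
catalogues = [
    {
        "label": "US Federal Government Catalog",
        "spatial": "http://sws.geonames.org/6252001/",
    },
]

listing = group_by_place(catalogues)
for place, labels in listing.items():
    print(place, ", ".join(labels))
```

For a map, the same grouping key could be used to look up coordinates in Geonames, which this sketch does not attempt.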

Best regards,

Martin

[1] http://data.fundacionctic.org/sparql
[2] http://data.fundacionctic.org/vocab/catalog/datasets.html.en
[3] http://datos.fundacionctic.org/sandbox/catalog/index.html.en
[4] http://datos.fundacionctic.org/sandbox/catalog/map


Rufus Pollock wrote:
> On 17 February 2010 22:56, Peter Krantz <peter.krantz at gmail.com> wrote:
>   
>> On Tue, Feb 2, 2010 at 18:59, Ed Summers <ehs at pobox.com> wrote:
>>     
>>> My personal opinion is that a key ingredient to making this happen is
>>> to publish dataset availability and metadata using a syndicated feed
>>> (Atom and/or RSS).
>>>       
>> I have implemented the RDF metadata on opengov.se now. All data is in
>> Swedish but you get the idea if you look at an individual dataset:
>>
>> http://www.opengov.se/data/42/
>>
>> ...and its RDF representation (based on Dublin Core terms):
>>
>> http://www.opengov.se/data/42/rdf/
>>     
>
> Great stuff Peter. For comparison, here's an example of what you get
> from ckan.net + semantic.ckan.net:
>
> http://pastie.org/830693
>
> At the moment we redirect into semantic.ckan.net from ckan.net via a
> rel=alternate link and a 303 on the Accept header, e.g. try out:
>
> curl -L -H "Accept: application/rdf+xml" \
>   http://ckan.net/package/2000-us-census-rdf
>
> semantic.ckan.net also provides a human readable version of the data:
>
> <http://semantic.ckan.net/data/2000-us-census-rdf>
>
> We've thought quite a bit about integrating directly into ckan.net
> (hence the /data/ rather than /package/ on semantic.ckan.net) but the
> issue here is that we want to use a proper triple store for the data
> so you can query via sparql (currently
> http://semantic.ckan.net/sparql). Thus we've gone for the separate but
> related model for the present.
>
> Maybe it would be worth getting together for half an hour on Skype and
> EtherPad to work on hammering out a shared ontology here? I also know
> the people from DERI (Richard Cyganiak especially) are working on this
> so we should talk with them.
>
>   
>> I have also made sure an Atom feed contains all datasets (with a link
>> element to the RDF representations in each entry element) here:
>>
>> http://www.opengov.se/feeds/data/
>>     
>
> Great. Like you, we should add an RDF link to our Atom feed (which, as
> I've already mentioned, can be found at
> http://www.ckan.net/revision/list?format=atom&days=30)
>
>   
>> Please note that the feed contains datasets that are not (yet) open.
>> Some may have a commercial license and may not be available on the
>> web.
>>     
>
> That's also true for us ;)
>
> Regards,
>
> Rufus
>   
-- 

Martín Álvarez Espinar
CTIC-Centro Tecnológico

Parque Científico y Tecnológico de Gijón
c/ Ada Byron, 39 Edificio Centros Tecnológicos
33203 Gijón - Asturias - España
Tel.: +34 984 29 12 12
Fax: +34 984 39 06 12
E-mail: martin.alvarez at fundacionctic.org
http://www.fundacionctic.org
Privacy policy: http://www.fundacionctic.org/privacidad



