[ckan-discuss] Datacatalog metadata?!

Antti Poikola antti.poikola at gmail.com
Tue Mar 30 14:18:30 BST 2010


Hello,

Simple questions from a non-programmer again.

Which metadata fields should be required, and which optional, when 
cataloguing public sector datasets?

I have been chasing this question for some time now. Is there any 
place where the data model/structure of CKAN, semantic CKAN, 
opengov.se, data.gov.uk, data.gov or any other data catalog is documented?

The best I have found so far is this link: 
http://data.gov.uk/dataset/data_gov_uk-datasets and these points 
(collected from two emails) from Peter Krantz:

---CLIP---

Here is an example entry for a dataset expressed in RDF: 
http://pastie.org/827944

I have tried to use DC terms for the basics. The VoID vocabulary is 
explicitly for semweb data, and this has to be more generic (to be able 
to provide info about a published spreadsheet, for example). What do you 
think?
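
To make the idea of "DC terms for the basics" concrete, here is a 
minimal sketch in Python that serializes a dataset description as 
RDF/XML using Dublin Core terms. It is not the pastie entry linked 
above: the function name dataset_to_rdf, the example.org URL and the 
choice of fields (title, publisher, format) are assumptions made only 
for illustration.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for RDF and Dublin Core terms.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCT = "http://purl.org/dc/terms/"
ET.register_namespace("rdf", RDF)
ET.register_namespace("dcterms", DCT)


def dataset_to_rdf(url, fields):
    """Build an RDF/XML string describing one dataset.

    `fields` maps Dublin Core term names (e.g. "title") to plain-text
    values. Which terms a catalog requires is NOT decided here; that
    is exactly the open question of this thread.
    """
    root = ET.Element(f"{{{RDF}}}RDF")
    desc = ET.SubElement(
        root, f"{{{RDF}}}Description", {f"{{{RDF}}}about": url}
    )
    for name, value in fields.items():
        ET.SubElement(desc, f"{{{DCT}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical dataset: a published spreadsheet, not semweb data.
example_rdf = dataset_to_rdf(
    "http://example.org/data/1/",
    {
        "title": "Example spreadsheet",
        "publisher": "Example Agency",
        "format": "text/csv",
    },
)
print(example_rdf)
```

Because the terms are generic, the same record shape works for a 
spreadsheet, a PDF or a SPARQL endpoint; only the values change.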

The next step is to provide an Atom feed where each entry element 
embeds some of this data and provides a link element to the RDF: 
<link rel="alternate" href="<url-to-rdf-data>" type="application/rdf+xml" />

I have a patch for the opengov-catalog project ready.

I have implemented the RDF metadata on opengov.se now. All data is in 
Swedish, but you get the idea if you look at an individual dataset: 
http://www.opengov.se/data/42/

...and its RDF representation (based on Dublin Core terms): 
http://www.opengov.se/data/42/rdf/

I have also made sure an Atom feed contains all datasets (with a link 
element to the RDF representations in each entry element) here: 
http://www.opengov.se/feeds/data/

Please note that the feed contains datasets that are not (yet) open. 
Some may have a commercial license and may not be available on the web.
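
As a rough sketch of how a consumer could use such a feed, the Python 
below walks the entries of an Atom document and collects the href of 
each link element with rel="alternate" and type 
"application/rdf+xml". The sample feed, its example.org URLs and the 
function name rdf_links are assumptions for illustration; the real 
feed at opengov.se may be structured differently.

```python
import xml.etree.ElementTree as ET

# Atom 1.0 namespace in ElementTree's {uri}tag notation.
ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical feed with two entries; only the first links to RDF.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Hypothetical dataset feed</title>
  <entry>
    <title>Dataset one</title>
    <id>http://example.org/data/1/</id>
    <link rel="alternate" href="http://example.org/data/1/rdf/"
          type="application/rdf+xml" />
  </entry>
  <entry>
    <title>Dataset two</title>
    <id>http://example.org/data/2/</id>
    <link rel="alternate" href="http://example.org/data/2/"
          type="text/html" />
  </entry>
</feed>"""


def rdf_links(feed_xml):
    """Return (entry title, RDF URL) pairs found in an Atom feed."""
    root = ET.fromstring(feed_xml)
    found = []
    for entry in root.findall(f"{ATOM}entry"):
        title = entry.findtext(f"{ATOM}title", default="")
        for link in entry.findall(f"{ATOM}link"):
            is_alternate = link.get("rel") == "alternate"
            is_rdf = link.get("type") == "application/rdf+xml"
            if is_alternate and is_rdf:
                found.append((title, link.get("href")))
    return found


for title, url in rdf_links(SAMPLE_FEED):
    print(title, url)
```

An aggregator could fetch each collected URL and read the Dublin Core 
terms from it, so the feed acts as an index and the RDF documents 
carry the actual metadata.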


