[ckan-discuss] CKAN - Google Refine integration

William Waites ww at styx.org
Wed Apr 27 11:01:23 BST 2011


* [2011-04-26 13:24:45 +0100] Monika Solanki <monika.solanki at gmail.com> wrote:

] An important part of the last step IMHO is the provision of voID files 
] for the new RDF datasets. I assume the datasets will become a part of the 
] LOD cloud at some point, so there is mileage in having their voID 
] descriptions besides the package metadata provided by CKAN at 
] http://semantic.ckan.net.

So this is a bit tricky. As Richard points out, semantic.ckan.net
already has some understanding of the CKAN conventions for describing
RDF datasets. In fact it understands just enough to make the diagrams:
it knows about size (triples) and linksets, and it also knows about
SPARQL endpoints, to make Pierre-Yves's work easier. It could be taught
about vocabularies, which would have some use in e.g. the LLD
sub-cloud, and maybe about example resources, but this is the beginning
of a slippery slope. The voiD description that it generates will never
be as rich and accurate as one purposefully written by the dataset
author. Not only is the CKAN data model lossy in this respect,
requiring special cases in the translation code, but it is also
subject to change in the medium term, and it requires one or more
parallel metadata standards (e.g. the conventions documented in the
wiki).
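
To make the lossiness concrete, here is a rough Python sketch of the
kind of translation described above. It is not the actual
semantic.ckan.net code. The package layout (a "triples" extra,
"links:<target>" extras, a resource with format "api/sparql"), the
base URI and the linkset URI pattern are all assumptions made for
illustration; only the voiD terms themselves come from the vocabulary.

```python
VOID = "http://rdfs.org/ns/void#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def package_to_void(package, base="http://ckan.example/package/"):
    """Translate a CKAN-like package dict into (s, p, o) tuples.

    `base` and the dict layout are hypothetical; literals and URIs
    are not distinguished, to keep the sketch short.
    """
    name = package["name"]
    ds = base + name
    triples = [(ds, RDF_TYPE, VOID + "Dataset")]
    extras = package.get("extras", {})

    # Size: only a bare triple count survives the CKAN model.
    if "triples" in extras:
        triples.append((ds, VOID + "triples", int(extras["triples"])))

    # Linksets: one special case per "links:<target>" extra.
    for key, value in sorted(extras.items()):
        if key.startswith("links:"):
            target_name = key[len("links:"):]
            linkset = ds + "/links/" + target_name
            triples += [
                (linkset, RDF_TYPE, VOID + "Linkset"),
                (linkset, VOID + "subjectsTarget", ds),
                (linkset, VOID + "objectsTarget", base + target_name),
                (linkset, VOID + "triples", int(value)),
            ]

    # SPARQL endpoints: recognised purely by a resource format string.
    for res in package.get("resources", []):
        if res.get("format") == "api/sparql":
            triples.append((ds, VOID + "sparqlEndpoint", res["url"]))

    return triples
```

Anything the author knows but that has no slot in the package
(which predicate a linkset uses, example resources, vocabularies)
simply cannot come out the other end without yet another convention.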

What would not be difficult is to teach it about a "voiD description"
resource type, so that it could fetch such descriptions from the
dataset authors where available and simply incorporate them. However
this raises the question of what is used for the dataset URI. If the
authors don't use the CKAN one, then we have to play games with sameAs,
not only on the dataset URI itself but also on its link targets;
otherwise making the diagrams becomes more difficult (to say nothing
of more advanced uses as yet unthought of). Still, this is where I
think we need to go. The dataset authors know their data best and
having them describe it is ideal. And they can do this without relying
on any central infrastructure. These descriptions then get indexed by
aggregators and search engines so that people can find the data that
they are interested in.
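
As an illustration of the sameAs games mentioned above, the following
Python sketch collapses owl:sameAs aliases so that both dataset URIs
and link targets end up on one preferred URI per dataset. It is only
one possible approach (a small union-find over plain tuples). The
function names and the "http://ckan.example/package/" prefix are made
up for the example.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def canonical_map(triples, prefer):
    """Map every URI in a sameAs statement to one representative.

    URIs starting with `prefer` (a hypothetical CKAN prefix) win when
    an equivalence class contains one.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for s, p, o in triples:
        if p != OWL_SAME_AS:
            continue
        rs, ro = find(s), find(o)
        if rs == ro:
            continue
        # Keep a preferred URI as the root of the merged class.
        if ro.startswith(prefer) and not rs.startswith(prefer):
            parent[rs] = ro
        else:
            parent[ro] = rs

    return {uri: find(uri) for uri in list(parent)}


def rewrite(triples, mapping):
    """Replace subjects and objects by their representative and drop
    the sameAs statements themselves."""
    return [
        (mapping.get(s, s), p, mapping.get(o, o))
        for s, p, o in triples
        if p != OWL_SAME_AS
    ]
```

After rewriting, a diagram generator only has to deal with one URI
per dataset, whichever URI the author chose in their own description.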

One thing that might be helpful here is a hosted version of the voiD
editor. That way the dataset authors could describe their datasets,
and if there were some sort of flexible permission scheme, others
could even add annotations, etc., in a wiki-like style. Of course this
starts recreating the editing functions of CKAN, but with a perhaps
more coherent data model tailored to RDF datasets.

Cheers,
-w
-- 
William Waites                <mailto:ww at styx.org>
http://river.styx.org/ww/        <sip:ww at styx.org>
F4B3 39BF E775 CF42 0BAB  3DF0 BE40 A6DF B06F FD45