[ckan-discuss] Questions for "RDF/XML" of CKAN page

Pablo Mendes pablomendes at gmail.com
Fri Jul 15 16:47:10 BST 2011


David,
Thanks for the prompt answer.

We are in the process of cataloging all datasets that follow the Linked Data
principles (linkeddata.org). Beyond what is already in data.gov.uk, we also
collect quality indicators such as provenance, accessibility,
interpretability and availability:
http://www4.wiwiss.fu-berlin.de/lodcloud/ckan/validator/levels.html

We could have pulled them into our own in-house database, but we thought it
would be a much more useful effort if we contributed our metadata to CKAN
(which also allows dataset owners to add any further information they might
have, and potential users to rate the content somehow).

We currently have some crowdsourcing going on:
http://lists.w3.org/Archives/Public/public-lod/2011Jul/0059.html
And in parallel I am trying to ingest other catalogs to have the info all in
one place.

As a result of this effort, we will release a new version of the LOD cloud
diagram:
http://www4.wiwiss.fu-berlin.de/lodcloud/state/

We could discuss better ways to do these things (e.g. we could keep the
quality metadata as voID files on our own servers, and the federated CKAN
search could attach them to the search results on the fly). However, we need
a short-term solution.

We'd like to get this nice overview of how open data is interconnected on
the Web by next week. And, to the best of my knowledge, pulling the
metadata for these catalogs into CKAN would be the best (temporary)
solution.
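
To make the short-term option a bit more concrete, here is a rough sketch of
the client-side step: pick out the packages that have an RDF resource and
prepare them for re-submission to another CKAN. It assumes the package
records look like the CKAN REST JSON served by catalogue.data.gov.uk (see
the URL David quotes below). The helper names, the "imported_from" extras
key and the list of instance-specific fields are my own assumptions, not
part of CKAN, and the sample record is made up. Fetching and uploading are
deliberately left out.

```python
import json

# Fields assumed to be assigned by the source CKAN instance. This list is a
# guess; check it against the actual JSON before relying on it.
INSTANCE_FIELDS = {"id", "revision_id", "metadata_created", "metadata_modified"}


def has_rdf_resource(package):
    """Return True if any resource declares a format containing 'rdf'."""
    resources = package.get("resources") or []
    return any("rdf" in (res.get("format") or "").lower() for res in resources)


def to_import_payload(package, source_url):
    """Copy a package dict, dropping instance-specific ids and noting its origin."""
    payload = {k: v for k, v in package.items() if k not in INSTANCE_FIELDS}
    # Resource ids belong to the source instance too, so drop them as well.
    payload["resources"] = [
        {k: v for k, v in res.items() if k not in ("id", "package_id")}
        for res in (package.get("resources") or [])
    ]
    extras = dict(package.get("extras") or {})
    extras["imported_from"] = source_url  # hypothetical key, not a CKAN convention
    payload["extras"] = extras
    return payload


if __name__ == "__main__":
    # Made-up record, for illustration only.
    sample = {
        "id": "00000000-0000-0000-0000-000000000000",
        "name": "example-dataset",
        "title": "Example dataset",
        "resources": [
            {"id": "r1", "url": "http://example.org/data.rdf", "format": "RDF"},
        ],
        "extras": {},
    }
    if has_rdf_resource(sample):
        payload = to_import_payload(sample, "http://catalogue.data.gov.uk")
        print(json.dumps(payload, indent=2, sort_keys=True))
```

Keeping the origin in the extras would at least let us tell imported records
apart later, although it does not answer the question of edits at both ends.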

What do you think?

Cheers,
Pablo

On Fri, Jul 15, 2011 at 5:23 PM, David Read <david.read at okfn.org> wrote:

> On 15 July 2011 15:39, Pablo Mendes <pablomendes at gmail.com> wrote:
> > David,
> > very helpful message for all of us! I have a similar need. I want to pull
> > all of the metadata in data.gov.uk about datasets that contain RDF into
> > CKAN.net. There are about 160 of them.
> > http://data.gov.uk/search/apachesolr_search?filters=sm_resource_formats%3ARDF
> > Would the same advice apply?
>
> The data.gov.uk datasets are in CKAN format already. e.g.
>
> http://catalogue.data.gov.uk/api/rest/package/ni-115-substance-misuse-by-young-people
> so it would be relatively simple to pull data.gov.uk datasets into
> another CKAN, such as thedatahub.org. We've been considering this, but
> not sure if the 7000 of those would flood thedatahub.org somewhat. And
> since you want to stay in step, how do you deal with edits at both
> ends?
>
> Our current thinking is to aim at:
> a) allowing the public to edit/correct metadata on data.gov.uk
> (with moderation)
> b) using federated search across all CKAN instances
>
> I'd be interested to hear why you suggested pulling the RDF datasets
> from data.gov.uk into thedatahub.org/ckan.net - do let us know your
> thoughts.
>
> David
>
> > Cheers,
> > Pablo
> > On Fri, Jul 15, 2011 at 4:16 PM, David Read <david.read at okfn.org> wrote:
> >>
> >> On 15 July 2011 12:18, Tetsuro TOYODA <toyoda at base.riken.jp> wrote:
> >> > Dear David,
> >> >
> >> > Thank you for giving us your quick reply.
> >> >
> >> > http://biolod.org is the download site of biological databases funded by
> >> > the National Database Integration Project of Japan.
> >> >
> >> > RDF files are already generated and supplied from the site.
> >> >
> >> > Could you advise us on how to register the data files, including the
> >> > RDF files, in CKAN?
> >>
> >> Tetsuro
> >> Basically we need to write a little code that transforms the biolod
> >> RDF metadata to the CKAN JSON format.
> >>
> >> To do a one-off import, you should use the CKAN API [1]. You could
> >> do it in Python and use the ckanclient library [2].
> >>
> >> Or to do a daily automatic import you would write a harvester. Derive
> >> from the base class [3]
> >> and write a custom "_create_or_update_package" method.
> >>
> >> This way has the advantage that the original RDF is stored alongside
> >> the CKAN version. We'd be happy to install this harvester in
> >> thedatahub.org or another CKAN instance.
> >>
> >> David
> >>
> >> [1] http://packages.python.org/ckan/api.html
> >> [2] https://bitbucket.org/okfn/ckanclient
> >> [3] https://bitbucket.org/okfn/ckanext-harvest/src/61844c8d2374/ckanext/harvest/harvesters/base.py
> >>
> >> >
> >> > Best regards,
> >> >
> >> > Tetsuro Toyoda.
> >> >
> >> > -----Original Message-----
> >> > From: d.t.read at gmail.com [mailto:d.t.read at gmail.com] On Behalf Of David Read
> >> > Sent: Friday, July 15, 2011 7:52 PM
> >> > To: Koro NISHIKATA
> >> > Cc: ckan-discuss at lists.okfn.org
> >> > Subject: Re: [ckan-discuss] Questions for "RDF/XML" of CKAN page
> >> >
> >> > 2011/7/15 Koro NISHIKATA <koro at base.riken.jp>:
> >> >> Dear all of CKAN discuss mailing list,
> >> >>
> >> >> Hello, nice to meet you.
> >> >> I am a new member of this mailing list, "Koro NISHIKATA".
> >> >
> >> > Welcome Koro!
> >> >
> >> >> I have a question about "the Data Hub API / datapkg" section of CKAN
> >> >> page.
> >> >>  http://ckan.net/package/biolod-pdb
> >> >>
> >> >> Can a user register data as "RDF/XML"?
> >> >> Or is it produced automatically after registering or editing?
> >> >
> >> > CKAN accepts data in a number of ways:
> >> > * the Web page form
> >> > * API (JSON format)
> >> > * harvesters (INSPIRE, custom formats, etc)
> >> >
> >> > So you are correct - we don't accept RDF/XML. This format is one of
> >> > the exported formats, derived from the raw data.
> >> >
> >> > David
> >> >
> >> >>
> >> >> [RDF/XML]
> >> >> http://semantic.ckan.net/record//be684dad-b24a-4f59-b8e2-969229200c63.rdf
> >> >>
> >> >> Best regards,
> >> >> Koro NISHIKATA
> >> >> E-mail: koro at base.riken.jp
> >> >>
> >
> >
>