[ckan-discuss] Data Registry Aggregator Experiment

William Waites ww at styx.org
Wed Mar 30 13:23:53 BST 2011


* [2011-03-30 12:30:51 +0100] David Read <david.read at okfn.org> wrote:

] My point was - let's try not to duplicate effort here. We've got three
] sorts of CKAN aggregators already - it sounds like this needs a
] discussion before too many more are written!

In this case actually some duplicated effort is
warranted. Synchronising metadata in a distributed system is a hard
problem. To solve it we will need to try implementing it a few
different ways and see what works best, and then throw out the others
or repurpose them. I don't think this will be solved by up-front
design work and minimising duplicated effort.

However, I do think that we, in the context of LOD2, should take stock
of the different aggregation/federation techniques that have been
looked at or tried out, and see what state they are in, what
features they have (in design or in fact), what their scaling
properties are like, and so on. I don't think this should be a
discussion so much as a collaboratively produced state-of-the-art
report, which, incidentally, is part of one of our deliverables for
LOD2.

] We already have a 303 redirect in the CKAN core, with ckan.net setup
] for http://semantic.ckan.net/package/ . I'm kind of wondering if it
] would not be better to have the RDF produced in a CKAN extension and
] therefore served by CKAN as an alternative format for the package to
] JSON? This solves the problem of keeping up to date, purged & deleted
] packages, permissions etc. which you have with semantic.ckan.net I
] guess? These are not currently major issues, but are the sort of
] things which crop up when you structure the RDF creation outside the
] CKAN framework. Or is there serious value in keeping it separate?
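
For readers unfamiliar with the mechanism under discussion, here is a
minimal sketch of a 303 redirect driven by the Accept header. It is not
CKAN's actual code. The function name `negotiate`, the list of RDF media
types, and the assumption that the redirect target is the
semantic.ckan.net base URL followed by the package name are all
illustrative, and q-values in the Accept header are ignored.

```python
# Hypothetical sketch of Accept-based 303 redirection; not CKAN's implementation.

# Media types treated as "the client wants RDF" (an assumed, incomplete list).
RDF_TYPES = ("application/rdf+xml", "text/turtle", "text/n3")

# Base URL taken from the message; appending the package name is an assumption.
SEMANTIC_BASE = "http://semantic.ckan.net/package/"


def negotiate(package_name, accept_header):
    """Return (status, location).

    303 plus a redirect target when the client asks for an RDF media type,
    otherwise (200, None), meaning "serve the normal representation".
    Quality values (q=...) are deliberately ignored to keep the sketch short.
    """
    accepted = [
        part.split(";")[0].strip().lower()
        for part in accept_header.split(",")
    ]
    if any(media_type in RDF_TYPES for media_type in accepted):
        return 303, SEMANTIC_BASE + package_name
    return 200, None
```

With this structure the RDF lives in a separate service, which is where
the questions about stale, purged, or deleted packages come from.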

I favour a far more loosely coupled approach. The "open world" nature
of RDF means that we encourage others to contribute augmented package
metadata. For example, this might be a group curator that contributes
some other statements based on the meaning of some extras that they
use and agree upon within that group.
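
To make this concrete, here is a rough sketch that models RDF statements
as plain Python tuples. Everything in it is an assumption made for
illustration: the package dictionary, the example URI, the
`geographic_coverage` extra, and the choice of predicates. The point is
only that a curator's statements are added alongside the centrally
produced ones and nothing is overwritten.

```python
# Hypothetical sketch: triples are plain (subject, predicate, object) tuples.
# No real RDF library is used; the predicates are illustrative choices.

def central_triples(package):
    """Statements a central service can make without knowing what extras mean."""
    subject = package["uri"]
    triples = {(subject, "dct:title", package["title"])}
    for tag in package.get("tags", []):
        triples.add((subject, "dcat:keyword", tag))
    return triples


def curator_triples(package, extras_mapping):
    """Statements from a group curator who knows what certain extras mean.

    extras_mapping maps an extras key to the predicate the group agreed on.
    """
    subject = package["uri"]
    extras = package.get("extras", {})
    return {
        (subject, predicate, extras[key])
        for key, predicate in extras_mapping.items()
        if key in extras
    }


def merge(*graphs):
    """Open-world merge: the union of all statements from all sources."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


# Made-up package record and group convention, for illustration only.
package = {
    "uri": "http://semantic.ckan.net/package/example-dataset",
    "title": "Example dataset",
    "tags": ["transport", "scotland"],
    "extras": {"geographic_coverage": "Scotland", "internal_ref": "X1"},
}
group_mapping = {"geographic_coverage": "dct:spatial"}

merged = merge(central_triples(package), curator_triples(package, group_mapping))
```

The central code never needs to learn what `geographic_coverage` means;
the group that uses it supplies that knowledge, in whatever language
they like.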

We cannot just produce RDF centrally because we will never have enough
information about the meanings of extras and tags and such and how
they should be represented.

If we aren't producing them centrally then it means other people are
doing it. If other people are doing it then we cannot impose our
development environment and practices on them. They might like to
write software in PHP or Perl or Ruby or even Go.

This is not something that has been solved by this experiment: exactly
how to accept community-contributed augmentations of metadata
and make them available is still an open question. But if we move to
centralise the production, it will make solving this problem
harder.

Cheers,
-w
-- 
William Waites                <mailto:ww at styx.org>
http://river.styx.org/ww/        <sip:ww at styx.org>
F4B3 39BF E775 CF42 0BAB  3DF0 BE40 A6DF B06F FD45


