[ckan-discuss] Harvesting Dublin Core documents
John Bywater
john.bywater at appropriatesoftware.net
Wed Nov 24 19:26:41 GMT 2010
Hi Will,
Just a quick thought about aggregation behaviour.
William Waites wrote:
> it has even be suggested that this be the native
> interchange format between CKAN instances for aggregation.
>
I'm not exactly sure what is meant by "the native interchange
format", but I was wondering whether it would benefit CKAN users for
CKAN to support aggregating heterogeneous metadata formats.
That is, CKAN could ingest a document in one of many supported formats,
write a package with values it reads from the document, and then keep a
copy of the ingested document.
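As a rough sketch of that ingest step, assuming a simple Dublin Core XML document: the function name, the choice of package fields, and the "source_format" / "source_document" keys are all made up for illustration and are not existing CKAN code.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"


def ingest_dublin_core(document):
    """Write a package dict from a Dublin Core XML string, keeping the original."""
    root = ET.fromstring(document)

    def first(tag):
        # Return the text of the first matching DC element, or None.
        element = root.find(f".//{{{DC_NS}}}{tag}")
        if element is None or element.text is None:
            return None
        return element.text.strip()

    return {
        "title": first("title"),
        "notes": first("description"),
        "author": first("creator"),
        # Hypothetical fields: the ingested document is stored verbatim
        # next to the values that were read from it.
        "source_format": "dublin-core",
        "source_document": document,
    }


# Invented example document, for illustration only.
EXAMPLE = """<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Rainfall 2009</dc:title>
  <dc:creator>Example Publisher</dc:creator>
  <dc:description>Monthly rainfall totals.</dc:description>
</metadata>"""

package = ingest_dublin_core(EXAMPLE)
```

The point is only that the package carries its source document along with it, so nothing the mapping ignores is thrown away.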
When aggregating, if a package has been written from an ingested
document, server-CKAN could present the metadata records for aggregation
in their original form, and client-CKAN could write a package as if
the document had been ingested for the first time. That would pretty much
guarantee lossless transmission through a chain, if there ever was such
a thing.
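To illustrate why passing the original document along keeps a chain lossless, here is a minimal sketch. It uses a toy "key: value" format in place of a real metadata format, and every name in it (PARSERS, ingest, present_for_aggregation, relay) is hypothetical.

```python
def parse_keyvalue(document):
    """Toy stand-in for a real metadata parser: one 'key: value' per line."""
    fields = {}
    for line in document.splitlines():
        if ": " in line:
            key, value = line.split(": ", 1)
            fields[key.strip()] = value.strip()
    # Only some fields map onto the package; the rest would be lost
    # if the package alone were passed on.
    return {"title": fields.get("title"), "notes": fields.get("description")}


# Hypothetical registry of supported formats.
PARSERS = {"keyvalue": parse_keyvalue}


def ingest(source_format, document):
    """Write a package from a document and keep a copy of the document."""
    package = PARSERS[source_format](document)
    package["source_format"] = source_format
    package["source_document"] = document
    return package


def present_for_aggregation(package):
    """Server side: hand out the record in its original form."""
    return package["source_format"], package["source_document"]


def relay(package):
    """Client side: ingest the record as if for the first time."""
    source_format, document = present_for_aggregation(package)
    return ingest(source_format, document)


DOCUMENT = (
    "title: Rainfall 2009\n"
    "description: Monthly rainfall totals.\n"
    "rights: example-only"
)

first = ingest("keyvalue", DOCUMENT)
third = relay(relay(first))  # two further hops down the chain
```

Here the "rights" line is never mapped into the package, yet it still reaches the third instance because each hop re-reads the untouched original.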
For a locally edited package (which would have no ingested document), I
would think that directly passing the native JSON format would be the
simplest thing to do. Version differences
are already supported via the versioning of the API. You could even put
the aggregator outside the API, reading from one API and writing to
another. (I should admit, I don't yet see what would prompt me to do
that with DCat?)
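An aggregator sitting outside both instances could look roughly like the following. This is only a sketch: InMemoryCkan is an invented stand-in so the example runs on its own, and its methods are not real CKAN API calls. A real aggregator would make HTTP requests against the versioned API instead.

```python
import json


class InMemoryCkan:
    """Hypothetical stand-in for a CKAN instance reached through its API."""

    def __init__(self, packages=None):
        self.packages = dict(packages or {})

    def list_names(self):
        return sorted(self.packages)

    def get_json(self, name):
        # Packages leave the instance as native JSON.
        return json.dumps(self.packages[name])

    def put_json(self, payload):
        # Packages arrive as native JSON and are stored by name.
        package = json.loads(payload)
        self.packages[package["name"]] = package


def aggregate(source, target):
    """Read every package from one instance and write it to another."""
    copied = []
    for name in source.list_names():
        target.put_json(source.get_json(name))
        copied.append(name)
    return copied


# Invented example data, for illustration only.
source = InMemoryCkan({
    "rainfall-2009": {
        "name": "rainfall-2009",
        "title": "Rainfall 2009",
        "tags": ["weather"],
    },
})
target = InMemoryCkan()
copied = aggregate(source, target)
```

Because the aggregator only talks to the two interfaces, neither instance needs to know about the other.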
Of course, we could (somewhere) add more support for presenting CKAN
packages in different formats. We could present everything as one format
or another, but it would be tricky to have a lossless homogenisation. So
that might not be the way to do aggregation.
As I said, I'm not sure what is meant by "the native interchange
format", and I'm certainly not the world's expert on DCat, but I hope
these considerations were at least interesting to read. :-)
Best wishes,
John.