[ckan-discuss] FW: Re: Provenance of datasets on CKAN

William Waites ww at styx.org
Fri Feb 18 11:21:44 GMT 2011


Forwarding my message from yesterday regarding a mechanism for
ensuring provenance of datasets...

Responding to this, James Gardner said it would be nice to have a
mechanism for linking code snippets that show how things may have
been processed. I agree, and the place to reference them is from
the "Process" box in the ascii art diagram I sent to the list the
other day, with the caveat that a specific version of the code
snippet has to be referenced.

Cheers,
-w

----- Forwarded message from William Waites <ww at styx.org> -----

Date: Thu, 17 Feb 2011 12:07:50 +0100
From: William Waites <ww at styx.org>
Subject: Re: Provenance of datasets on CKAN
To: Jonathan Gray <jonathan.gray at okfn.org>
Cc: Friedrich Lindenberg <friedrich at pudo.org>,
	Rufus Pollock <rufus.pollock at okfn.org>,
	James Gardner <james at 3aims.com>, ckan-team at okfn.org
Message-ID: <20110217110750.GB11303 at styx.org>
User-Agent: Mutt/1.4.2.3i

short answer: it doesn't

longer answer: if there are links to source metadata, a web page, or a dataset stored externally, the interested reader can evaluate these and their location. a bogus or contradictory package could be identified in much the same way that bogus material on wikipedia is identified.

yet longer answer: it is possible to use package extras or resource extras to hold e.g. pgp signatures of data. to establish some idea of provenance you have to check the signatures and evaluate the trust of the signing key. this requires explicit measures taken by the publisher.
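
a rough sketch of the extras idea, assuming extras behave like a flat key/value mapping. the key names (sha256, pgp_signature, pgp_key_fingerprint) and both helper functions are made up for illustration and are not an existing ckan convention. producing and checking the signature itself is left to an external tool such as gpg.

```python
import hashlib


def attach_signature(package, data, armored_signature, key_fingerprint):
    """Return a copy of `package` with hypothetical provenance extras added.

    `package` is a plain dict standing in for a package record; the extras
    key names used here are assumptions, not an established convention.
    """
    extras = dict(package.get("extras", {}))
    # Digest of the data, so a reader can tell which bytes were signed.
    extras["sha256"] = hashlib.sha256(data).hexdigest()
    # Detached signature produced elsewhere (e.g. by gpg), stored as text.
    extras["pgp_signature"] = armored_signature
    # Normalise the fingerprint: drop spaces, upper-case the hex digits.
    extras["pgp_key_fingerprint"] = key_fingerprint.replace(" ", "").upper()
    result = dict(package)
    result["extras"] = extras
    return result


def digest_matches(package, data):
    """Check only that `data` hashes to the digest recorded in the extras."""
    recorded = package.get("extras", {}).get("sha256")
    return recorded == hashlib.sha256(data).hexdigest()
```

a matching digest only shows that the bytes are the ones the extras describe. it says nothing about provenance until the signature has been verified and the signing key evaluated for trust, as described above.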

and when we start talking about derived datasets, even if the derivation is a simple format transformation, we need a way to talk about the process involved, the source data used, who performed the operation and when... this is not well modelled in ckan at present and needs fleshing out. there are some discussions ongoing on the list that touch on these ideas but do not explicitly address them.
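
to make the four pieces concrete (process, source data, who, when), here is one possible shape for such a record. this is only a sketch: the Derivation class, its field names and record_derivation are hypothetical and do not correspond to anything ckan provides today.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class Derivation:
    """Hypothetical record describing how one dataset was derived."""
    process: str       # what was done, e.g. a reference to the code used
    sources: tuple     # identifiers of the source datasets
    agent: str         # who performed the operation
    performed_at: str  # when it was performed, as an ISO 8601 timestamp


def record_derivation(process, sources, agent, when=None):
    """Build a Derivation, defaulting `when` to the current time in UTC."""
    if when is None:
        when = datetime.now(timezone.utc)
    return Derivation(process, tuple(sources), agent, when.isoformat())
```

asdict(record) turns such a record into a plain dict, which could then be serialised and kept alongside the derived package. where exactly it should live is the part that still needs fleshing out.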

-w
-- 
William Waites                <mailto:ww at styx.org>
http://river.styx.org/ww/        <sip:ww at styx.org>
F4B3 39BF E775 CF42 0BAB  3DF0 BE40 A6DF B06F FD45

----- End forwarded message -----


