[ckan-dev] The future of resources.

David Raznick david.raznick at okfn.org
Mon Jan 24 09:37:04 UTC 2011


Hello

I have almost completed the work for ticket http://ckan.org/ticket/826.  See
https://bitbucket.org/kindly/ckan/changeset/5769ff6ac34f.  (do not pull yet)

I do not think anyone will be happy with it, but regardless I think it
should stand.  Hopefully it will displease everyone in equal measure.

It adds a config option that lets you add more resource fields, e.g. you
could add an alternative_url field.  To the developer/form designer the new
fields act as if they were normal database fields, so they are attributes on
an object in the same way url, description and hash are.  You can search on
them too using the sql backend.  Here is a list of the pros and cons I can
think of.

Pros.

 * Simple
 * The smallest change I can think of to complete the ticket, and give
clients the custom extra fields they need in the short term.
 * The status quo has not changed concerning resources.

Cons.

 * It will make relational purists unhappy as it uses a json field (in fact
just a dict serialised to json).
 * Will make nosql advocates unhappy as this is exactly the thing nosql
stores are designed to do.
 * Will make semantic web advocates unhappy as it adds nothing to (and
possibly even muddies) the classification of these resources.
 * Will make wiki style collaboration enthusiasts unhappy as it does not
give the flexibility they need.
 * It makes me unhappy for all the above reasons.

I still think the pros outweigh the cons.
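
To make the idea concrete, here is a minimal sketch in Python of how such
fields could behave.  It is not the code in the changeset: the
EXTRA_RESOURCE_FIELDS name, the Resource class and to_row() are all made up
for illustration, and the real config option may look quite different.  The
only things taken from above are the core fields (url, description, hash),
the alternative_url example, and the fact that the extras end up as one dict
stored as json.

```python
import json

# Hypothetical stand-in for the config option; the real option name may differ.
EXTRA_RESOURCE_FIELDS = ["alternative_url"]


class Resource:
    """Toy resource: core fields are plain attributes, extras share one dict."""

    def __init__(self, url="", description="", hash="", **extras):
        self.url = url
        self.description = description
        self.hash = hash
        # In the database this dict would be stored in a single json column.
        self.extras = {}
        for name, value in extras.items():
            if name not in EXTRA_RESOURCE_FIELDS:
                raise TypeError("unknown resource field: %s" % name)
            self.extras[name] = value

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so core fields
        # never reach this point.
        if name in EXTRA_RESOURCE_FIELDS:
            return self.__dict__.get("extras", {}).get(name)
        raise AttributeError(name)

    def __setattr__(self, name, value):
        # Configured extra fields are routed into the extras dict.
        if name in EXTRA_RESOURCE_FIELDS:
            self.extras[name] = value
        else:
            object.__setattr__(self, name, value)

    def to_row(self):
        """Return what would be written to the table (illustrative only)."""
        return {
            "url": self.url,
            "description": self.description,
            "hash": self.hash,
            "extras": json.dumps(self.extras, sort_keys=True),
        }


resource = Resource(
    url="http://example.org/data.csv",
    alternative_url="http://mirror.example.org/data.csv",
)
```

With this, resource.alternative_url reads just like resource.url, while
to_row() shows that only the core fields get real columns and everything else
is packed into the one json value.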

So on to the topic of what we need to do with resources in the long run.
Here are my opinions.

 *  Resources should be made first class citizens in ckan.  For the simple
reason that essentially THEY ARE THE DATA.  I have added a ticket
http://ckan.org/ticket/922  that outlines this.
 *  They should at least have their own form.  We cannot squeeze all the
information we need to describe them properly into a small table in
packages.
 *  There should be a means of versioning and dating them properly amongst
each other, e.g. a way of saying this is the latest version of the csv file
and it was from this date on this topic.  I think manual versioning of
resources is better here than versioning against a package (the package's
version should just pick up the latest resource version).
 *  We should give people the option of duplicating them.
 *  We should be providing tools, access, guidance and lookups to
ontologies, to help people classify the data/resources properly.
 *  We should give tools beyond just previewing the data, to actually help
people semantically analyse and convert the data itself.  Stuffing in a link
to a random excel file is not that great.
 *  Potentially provide basic visualisations of the resource.
 *  Potentially develop, host and encourage data
cleaning/augmenting/clustering tools (like google refine), to help people
get their data into a good state.
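
As a rough illustration of the versioning point in the list above, here is
one way "the package's version just picks up the latest resource version"
could look.  This is only a sketch under my own assumptions: ResourceVersion,
latest() and package_version() are hypothetical names, and nothing like this
exists in ckan yet.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class ResourceVersion:
    url: str
    version: int  # set by hand by whoever adds the resource
    dated: date   # the date the data itself refers to


def latest(resources: List[ResourceVersion]) -> Optional[ResourceVersion]:
    """Return the resource with the highest manual version, or None if empty."""
    return max(resources, key=lambda r: r.version, default=None)


def package_version(resources: List[ResourceVersion]) -> Optional[int]:
    """The package does no versioning of its own; it reports the newest resource."""
    newest = latest(resources)
    return newest.version if newest is not None else None


csv_files = [
    ResourceVersion("http://example.org/spend-2009.csv", 1, date(2009, 12, 31)),
    ResourceVersion("http://example.org/spend-2010.csv", 2, date(2010, 12, 31)),
]
```

Here package_version(csv_files) follows whichever resource was given the
highest version by hand, which is the manual scheme argued for above.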

So the work on them is not nearly over...

Regards

David