[ckan-discuss] 262 New Zealand, 70 Australian datasets happily in CKAN

Rufus Pollock rufus.pollock at okfn.org
Mon Jun 14 09:17:16 BST 2010

On 13 June 2010 03:46, Tim McNamara <paperless at timmcnamara.co.nz> wrote:
> Hi all,
> Just a heads up that every dataset from http://data.australia.gov.au &
> http://data.govt.nz is now also found at CKAN

Amazing work Tim. BTW, if you'd be happy to, we'd love to store a copy
of your scripts in our ckanext repo under, say, "nz" and "australia":


(If you happen to be using Mercurial, the easiest way may be to fork
our mirror on Bitbucket: <http://bitbucket.org/okfn/ckanext>, add your
changes, and then we'll pull.)

> There are few things that I would like to do some finishing touches on, but
> will postpone that work until next weekend:
> New Zealand datasets are not populated with tags by my script. New Zealand's
> tagging system seems overly complicated. Using rel='tag' in links would have
> been an easier approach.
> Generate format-xls/format-pdf/etc tags depending on the file types.

Even better, you could add this format info (if you haven't already) to
the resources themselves: they now have a format attribute, and we're
sort of deprecating the use of tags for specifying formats in favour of
these dedicated fields.
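
As a rough sketch of what that could look like in an import script (the
helper names guess_format and add_formats are made up for illustration,
and I'm assuming each resource is a plain dict with "url" and "format"
keys, which may not match what your script actually builds):

```python
import os
from urllib.parse import urlparse


def guess_format(url):
    """Return the lowercased file extension of the URL path, or '' if there is none."""
    # Only look at the path so query strings like ?download=1 are ignored.
    path = urlparse(url).path
    extension = os.path.splitext(path)[1]
    return extension.lstrip(".").lower()


def add_formats(resources):
    """Fill in a missing 'format' on each resource dict from its 'url'."""
    for resource in resources:
        # Never overwrite a format that was already set by hand.
        if not resource.get("format"):
            resource["format"] = guess_format(resource.get("url", ""))
    return resources
```

Anything without a recognisable extension ends up with an empty format,
so it can still be filled in by hand later.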

> Licencing information is not yet being sent to CKAN in the format it wants.
> I have included the original text, but getting a machine to guess what the
> licence is and then match it to the id codes of licences that are already
> accepted seems like a task for when I need to procrastinate. Basically, the
> full licence details appear in the details section down the bottom of each
> page, but not in the info box on the top right.

The question of how we deal with licenses effectively going forward is
an interesting one. Several people have already suggested a dedicated
free text field for license info in addition to the enumeration.
Personally, I feel if it is free text you may as well put that info in
the notes ...

> None of the files have hashes. I am reluctant to add my own hashes to the
> downloaded files, because I can't ensure their authenticity as a third
> party.

Very reasonable, though at this point I think it will need to be
3rd parties who add them -- or even a bot we run nightly. (Perhaps we
then add a suitable notice on the site.) One reason this is useful is
not just checking authenticity but being able to tell when files have
been updated (with the name kept the same).
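
A minimal sketch of the change-detection part, assuming the bot has
already downloaded the file to a local path and kept the hash it
recorded last time (file_hash and has_changed are hypothetical names,
and SHA-1 is just an example choice of algorithm):

```python
import hashlib


def file_hash(path, algorithm="sha1", chunk_size=65536):
    """Return the hex digest of the file at path, read in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def has_changed(path, recorded_hash, algorithm="sha1"):
    """Return True if the file's current hash differs from the recorded one."""
    return file_hash(path, algorithm) != recorded_hash
```

A hash computed this way only says the bytes differ from what we saw
before; it says nothing about whether the publisher intended the change.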

