[ckan-discuss] NZ CKAN Instance - Ideas

David Read david.read at okfn.org
Tue Jun 15 10:45:59 BST 2010


Many thanks for sending this feedback on CKAN as a whole from your
experience in NZ. It is incredibly useful and I think we need to sift
through this carefully. Here's my take - please do feed back and we
can work on some concrete improvements.

Groups - Specifying department and sub-department can easily be
achieved in the 'extra' fields, and that's what we do with other
countries' government data. But it's interesting that you thought the
'groups' option would be the way to do this, and it suggests to me that
we should do more to expose ways to browse and search by department and
sub-department in the UI.
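
As a rough illustration of the 'extra' fields approach, here is a small
Python sketch of browsing packages by department and sub-department. It
is only a sketch under assumptions: the field names `department` and
`sub-department`, the sample records and the function name are made up
for this example and are not taken from a real CKAN instance or its API.

```python
from collections import defaultdict

# Hypothetical package records; the 'extras' keys are illustrative only.
packages = [
    {"name": "school-rolls",
     "extras": {"department": "Ministry of Education",
                "sub-department": "Schools"}},
    {"name": "building-consents",
     "extras": {"department": "Department of Building and Housing"}},
    {"name": "misc-data"},
]


def browse_by_department(packages):
    """Group package names by department, then by sub-department."""
    tree = defaultdict(lambda: defaultdict(list))
    for pkg in packages:
        extras = pkg.get("extras", {})
        # Packages without the extra fields land under "Unknown".
        dept = extras.get("department", "Unknown")
        sub = extras.get("sub-department", "Unknown")
        tree[dept][sub].append(pkg["name"])
    # Convert to plain dicts so the result is easy to print or serialise.
    return {dept: dict(subs) for dept, subs in tree.items()}


for dept, subs in browse_by_department(packages).items():
    print(dept, subs)
```

A UI could walk the same two-level structure to render a 'browse by
department' page without needing nested groups.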

Theming - We have recently introduced ways to customise CKAN's visual
templates - see http://wiki.okfn.org/ckan/doc/theme

List of new packages / RSS feed - The 'Recently Changed' section on
the home page goes a long way to showing new and recently edited
packages, but we're talking about developing this for the home page
into simple lists of 'new', 'updated' and 'hot' packages. (We have an
Atom feed giving much of this information already, linked from the
'recently changed' page.)

Jargon ('package') - We see the concept of data packaging (the 'Debian
of data' concept) as a core advantage of CKAN, so perhaps we need to
explain that. Maybe we should rename from 'Packages' to 'Data
Packages'? Or have a side-bar 'What is a Package of data?' linking to

New package form - Yes, I agree we can add more explanation text. And
we're talking about having some customisations for different
sorts/sources of data. We have already made it easy for an instance to
heavily customise the form (see http://ca.ckan.net/package/new for
example). But ckan.net being central, it needs to cater for all sorts
of data and help the user. It would be great to have you involved in
designing this going forward.

Link to uses - A package has many associations. Of course plenty can
already be added as resources: mirrors of data, scraped versions,
derived data, SPARQL endpoints added on. You're quite right that a
'use' of the data would be good to link to. A property data aggregation
site is one sort; people are also plotting geo data on a map, visualising
it in other ways, combining it with other datasets, writing news articles
about the data, commenting on the methods etc. The data.gov.uk site is
trying this out with per-package comments and a wiki, but I don't see
much use of these particular features so far. Indeed, their email list
has seemed by far the most effective way to pool interesting and
related information.
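
To make the 'use' idea concrete, here is a rough Python sketch of what
such a record might hold, using the fields suggested in the original
message (title, link, description, contact, optional screenshot, and
one or more linked datasets). The class name, function name and sample
values are assumptions for illustration, not an existing CKAN model.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Use:
    """A site or project that makes use of one or more datasets."""
    title: str
    link: str
    description: str
    contact: str
    datasets: List[str]               # names of the packages being used
    screenshot: Optional[str] = None  # optional URL of a screenshot


def uses_by_dataset(uses: List[Use]) -> Dict[str, List[str]]:
    """Index use titles by dataset name, e.g. for a per-department report."""
    index: Dict[str, List[str]] = {}
    for use in uses:
        for name in use.datasets:
            index.setdefault(name, []).append(use.title)
    return index


# Made-up sample data; the dataset names are hypothetical.
uses = [
    Use(title="Property information site",
        link="http://example.org/property",
        description="Aggregates datasets for property information",
        contact="someone@example.org",
        datasets=["school-zones", "building-consents"]),
]
print(uses_by_dataset(uses))
```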

Cost of datasets - I'm not sure we've thought much about this - we
seem more focussed on open data. I'm not sure that advertising costs
of data on CKAN will drive them down. In the UK, the meme of Freedom
of Information, lobbying by Tim Berners-Lee and new economists seem to
be the most effective.

Search engine optimisation - excellent - we're keen to improve on
this. I've created a ticket to collect these ideas and get it done:
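
As a starting point for the slugged-URL suggestion, here is a minimal
Python sketch that turns a package title into a URL slug and appends it
to the package URL. The function names and the URL layout are
assumptions made for illustration (the layout follows the example in
the original message); this is not how CKAN builds its URLs today.

```python
import re
import unicodedata


def slugify(title):
    """Lower-case a title and join its words with hyphens."""
    # Fold accented characters down to their closest ASCII form.
    ascii_title = (unicodedata.normalize("NFKD", title)
                   .encode("ascii", "ignore")
                   .decode("ascii"))
    # Replace every run of non-alphanumeric characters with one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_title.lower())
    return slug.strip("-")


def package_url(name, title, base="http://www.ckan.net/package"):
    """Build a hypothetical slugged URL: <base>/<name>/<slug-of-title>."""
    return "{}/{}/{}".format(base, name, slugify(title))


print(package_url(
    "ascoe",
    "Atmospheric Chemistry Studies in the Oceanic Environment"))
```

Keeping the short package name in the path means existing links such as
/package/ascoe could still be resolved, with the slug added purely for
readability and search engines.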

Harvesting other Catalogues - We've worked a great deal on tools to do
this, with the API, getdata scripts, spreadsheet importer, changeset
mechanism etc. Several batches of metadata from other sources have
gone in. Going forward we need to work out which metadata to target
when importing, how best to synchronise changes and how to feed back
corrections.

Data migration of NZ data - is this all done by Tim McNamara now or do
we need to discuss this more?


On 12 June 2010 00:38, Glen Barnes <glen at opengovt.org.nz> wrote:
> Hi All,
> I had a great chat with Jonathan Gray last week about moving the NZ Open
> Data Catalogue to the CKAN platform. He suggested I post to the list
> outlining some of the ideas and thinking behind where I want to see things
> go in terms of the platform so it meets our needs. Below is a bit of a
> rambling set of things that I have - By no means exhaustive but it may spur
> some discussion. I would be interested in feedback both for and against
> anything below.
> Open Data Catalogue Background
> I had been interested in open data for a while as I was always hamstrung
> when trying to work on projects by either the lack of data, the price or an
> inability to actually find the data I wanted. My main interest is around
> using open data to improve efficiencies for companies and individuals. For
> example my day job is working for Zoodle.co.nz where we try and aggregate
> many datasets into a single site for property information. This saves the
> user time as they do not have to visit the Ministry of Education, Department
> of Building and Housing and the local council website to find the
> information they are after (Well that is the theory - some of the datasets
> we want are not available or way too expensive right now to give away to our
> users for free).
> Last year a lot of people had been talking about setting up some form of
> catalogue but nothing had been done to date or the discussions seemed to be
> based around standards and metadata, etc. My thought was that if we got
> something up at least it would spark some interest. On that note I spent
> about 20 hours putting together something in WordPress and
> launched http://cat.open.org.nz/. It has served its initial purpose to get
> people interested and spark some interest from within government (we now
> have a data.govt.nz and we continue to work with them to improve things).
> Now it's time to step things up a notch and make the catalogue more useful.
> WordPress being, well WordPress, means it is not exactly the right platform
> to use especially if you are not a PHP programmer as we need some pretty
> specific things to make it a better catalogue. I originally looked at the
> Sunlight Foundation's http://nationaldatacatalog.com/ code base as it was
> Ruby/Rails and I have a little bit of experience with that and the design
> was quite clean. I've now realised I can't really do this alone and the
> example sites coming out from CKAN are starting to look really nice (an
> important part of the end user experience in my mind).
> So given the above here are my ideas around what I would like to see in the
> CKAN codebase going forward:
> Nested Groups
> At the moment the 'groups' functionality only has one level
> (http://www.ckan.net/group/). I was thinking that we would use the groups
> feature to split out our datasets into departments like we have already
> http://cat.open.org.nz/category/official_source/. I've built the ODC on
> the premise that it covers every organisation covered by the Official
> Information Act and the Local Government Official Information Act. We have
> these nested in this way:
> - Central Government
> -- Public Service Department
> -- Crown Agents, Autonomous Crown Entities, Crown Entity Companies, Trusts
> -- DHBs
> -- Crown Research Institutes
> -- Reserve Bank of New Zealand
> -- Non Public Service Departments
> -- Office of Parliament
> -- Education Institutions and Wananga
> -- State-Owned Enterprises
> - Local Government
> -- City and District Councils
> -- Regional Councils and Territorial Authorities
> Themes
> I would like to be able to do some (basic?) theming of the site. Things I
> would like to do:
> - Custom design of the home page. I prefer the more basic 'web app' style of
> home pages with less data, cleaner interface.
> - Ability to pull in an RSS feed on the homepage for news items. I don't
> think the catalogue has to add blog features but it would be good if we
> could pull in articles from our main blog to display on the home page.
> - Naming of items - I don't think the public knows what a package is. So
> using terms like "dataset" throughout. Also things like "Register a new
> package" would probably be best worded as "Add a new dataset".
> - Submitting new datasets
> I like the adding new dataset page here - http://www.ckan.net/package/new.
> It may be a little bit intimidating for a few people as the form is quite
> long. Maybe we can look at redesigning the form at some point to add some
> instructions, look at how we select a license, change some of the wording to
> be human ("Add row to table" -> "Add another download format"). Also for
> different types of datasets we will have different metadata and it would be
> good to have the form adapt to the different formats.
> Linking to uses
> One of the key things that people from within government want is concrete
> examples of where this data is being used. I'm really keen on having the
> ability for people to add links to sites where the data is being used
> (commercial and non-profit). The user could add a title, link, description,
> contact and optional screenshot (we could auto populate the screenshot by
> pulling from the linked site). The user could link this site to one or more
> datasets. Building up these case studies makes it a great one stop shop for
> searching for uses of the information. We could then generate a pretty nice
> report for each department showing what datasets they have, etc.
> Getting Access to Datasets
> One of the key problems I have run into is a) where do we get the data from
> and b) how much does it cost. A lot of government data is available but not
> published anywhere online and quite often has a cost associated with it. I'm
> really keen to get this on the catalogue so people can see where councils
> are charging. During my latest attempt to get some council data it was going
> to cost me $40,000/year and when I made an LGOIA request to see how much
> they make off this dataset it was only $30K/year total.
> I'm guessing that we can do this using custom fields but on a catalogue by
> catalogue basis we will want to think about the metadata we want to collect
> and format the add dataset form accordingly. Again I guess this is some form
> of config/theming issue.
> We've been pretty successful at SEO without even really trying
> (see http://www.google.co.nz/search?client=safari&rls=en&q=auckland+google+transit+feed&ie=UTF-8&oe=UTF-8&redir_esc=&ei=dsYSTOzJLs2eceuZiI8I
> as an example). This to me is key. If we are to make data available it has
> to be findable which is the main reason for a catalogue. There are probably
> things we should be doing on CKAN like using slugged urls
> (http://www.ckan.net/package/ascoe
> -> http://www.ckan.net/package/ascoe/atmospheric-chemistry-studies-in-the-oceanic-environment),
> setting the H1 tag correctly ("Atmospheric Chemistry Studies in the Oceanic
> Environment" on the example above). Some basic SEO 101 on page
> optimisations.
> Harvesting other Catalogues
> At the moment we don't harvest data from any other catalogues but I do want
> to start by getting access to the data.govt.nz dataset (they used ours as a
> base for theirs when they set it up) and using these external catalogues as
> the canonical versions if that makes sense (augmented by local information
> that they may not want to share like contact names and pricing).
> Data Migration
> As mentioned above the catalogue is in WordPress right now so it will have
> to be migrated to the CKAN format. The database is WordPress formatted with
> some custom form plugins so it is readable but hooking up the tables takes a
> little bit of work trying to work out the right keys to join on. When we get
> to the migration step I can give people a copy. I don't want to publish this
> anywhere on the net as it does have email addresses of people in some of the
> tables. Let me know via a direct email if you want to have a look at it.
> Thanks,
> Glen Barnes
> New Zealand Open Data Catalogue
