[ckan-discuss] NZ CKAN Instance - Ideas

Jonathan Gray jonathan.gray at okfn.org
Tue Jun 15 11:23:53 BST 2010

Amazing -- thanks for your email Glen, and your comments David!

Do I understand correctly that there is material on cat.open.org.nz that
is not on the official NZ portal, and hence that a separate import is
worth doing?


On Tue, Jun 15, 2010 at 11:45 AM, David Read <david.read at okfn.org> wrote:
> Glen,
> Many thanks for sending this feedback on CKAN as a whole from your
> experience in NZ. It is incredibly useful and I think we need to sift
> through this carefully. Here's my take - please do feed back and we
> can work on some concrete improvements.
> Groups - Specifying department and sub-department can easily be
> achieved in the 'extra' fields, and that's what we do with other
> countries' government data. But it's interesting that you thought the
> 'groups' option would be the way to do this, and suggests to me that we
> should do more to expose ways to browse and search by department and
> sub-department in the UI.
> Theming - We have recently introduced ways to customise CKAN's visual
> templates - see http://wiki.okfn.org/ckan/doc/theme
> List of new packages / RSS feed - The 'Recently Changed' section on
> the home page goes a long way to showing new and recently edited
> packages, but we're talking about developing this for the home page
> into simple lists of 'new', 'updated' and 'hot' packages. (We have an
> Atom feed giving much of this information already, linked from the
> 'recently changed' page.)
> Jargon ('package') - We see a core advantage of CKAN is the concept of
> data packaging (the 'Debian of data' concept), so perhaps we need to
> explain that. Maybe we should rename from 'Packages' to 'Data
> Packages'? Or have a side-bar 'What is a Package of data?' linking to
> explanations.
> New package form - Yes I agree we can add more explanation text. And
> we're talking about having some customisations for different
> sorts/sources of data. We have already made it easy for an instance to
> heavily customise the form (see http://ca.ckan.net/package/new for
> example). But ckan.net being central, it needs to cater for all sorts
> of data and help the user. It would be great to have you involved in
> designing this going forward.
> Link to uses - A package has many associations. Of course plenty can
> already be added as resources: mirrors of data, scraped versions,
> derived data, SPARQL endpoints added on. You're quite right that a
> 'use' of the data would be good to link to. A property data aggregation
> site is one sort; people are also plotting geo data on a map, visualising
> it in other ways, combining it with other datasets, writing news articles
> about the data, commenting on the methods etc. The data.gov.uk site is
> trying this out with per-package comments and wiki and I don't see
> much use of these particular features so far. Indeed their email list
> has seemed by far the most effective way to pool interesting and
> related information.
> Cost of datasets - I'm not sure we've thought much about this - we
> seem more focussed on open data. I'm not sure that advertising costs
> of data on CKAN will drive them down. In the UK, the meme of Freedom
> of Information, lobbying by Tim Berners-Lee and new economists seem to
> be the most effective.
> Search engine optimisation - excellent - we're keen to improve on
> this. I've created a ticket to collect these ideas and get it done:
> http://knowledgeforge.net/ckan/trac/ticket/350
> Harvesting other Catalogues - We've worked a great deal on tools to do
> this, with the API, getdata scripts, spreadsheet importer, changeset
> mechanism etc. Several batches of metadata from other sources have
> gone in. Going forward we need to work out metadata to target
> importing, how best to synchronise changes and feed back corrections.
> Data migration of NZ data - is this all done by Tim McNamara now or do
> we need to discuss this more?
> David
> On 12 June 2010 00:38, Glen Barnes <glen at opengovt.org.nz> wrote:
>> Hi All,
>> I had a great chat with Jonathan Gray last week about moving the NZ Open
>> Data Catalogue to the CKAN platform. He suggested I post to the list
>> outlining some of the ideas and thinking behind where I want to see things
>> go in terms of the platform so it meets our needs. Below is a bit of a
>> rambling set of things that I have - by no means exhaustive but it may spur
>> some discussion. I would be interested in feedback both for and against
>> anything below.
>> Open Data Catalogue Background
>> I had been interested in open data for a while as I was always hamstrung
>> when trying to work on projects by either the lack of data, the price or an
>> inability to actually find the data I wanted. My main interest is around
>> using open data to improve efficiencies for companies and individuals. For
>> example my day job is working for Zoodle.co.nz where we try and aggregate
>> many datasets into a single site for property information. This saves the
>> user time as they do not have to visit the Ministry of Education, Department
>> of Building and Housing and the local council website to find the
>> information they are after (Well that is the theory - some of the datasets
>> we want are not available or way too expensive right now to give away to our
>> users for free).
>> Last year a lot of people had been talking about setting up some form of
>> catalogue but nothing had been done to date or the discussions seemed to be
>> based around standards and metadata, etc. My thought was that if we got
>> something up at least it would spark some interest. On that note I spent
>> about 20 hours putting together something in WordPress and
>> launched http://cat.open.org.nz/. It has served its initial purpose to get
>> people interested and spark some interest from within government (we now
>> have a data.govt.nz and we continue to work with them to improve things).
>> Now it's time to step things up a notch and make the catalogue more useful.
>> WordPress being, well WordPress, means it is not exactly the right platform
>> to use especially if you are not a PHP programmer as we need some pretty
>> specific things to make it a better catalogue. I originally looked at the
>> Sunlight Foundation's http://nationaldatacatalog.com/ code base as it was
>> Ruby/Rails and I have a little bit of experience with that and the design
>> was quite clean. I've now realised I can't really do this alone and the
>> example sites coming out from CKAN are starting to look really nice (an
>> important part of the end user experience in my mind).
>> So given the above here are my ideas around what I would like to see in the
>> CKAN codebase going forward:
>> Nested Groups
>> At the moment the 'groups' functionality only has one level
>> (http://www.ckan.net/group/). I was thinking that we would use the groups
>> feature to split out our datasets into departments like we have already
>> http://cat.open.org.nz/category/official_source/. I've built the ODC on
>> the premise that it covers every organisation covered by the Official
>> Information Act and the Local Government Official Information Act. We have
>> these nested in this way:
>> - Central Government
>> -- Public Service Department
>> -- Crown Agents, Autonomous Crown Entities, Crown Entity Companies, Trusts
>> -- DHBs
>> -- Crown Research Institutes
>> -- Reserve Bank of New Zealand
>> -- Non Public Service Departments
>> -- Office of Parliament
>> -- Education Institutions and Wananga
>> -- State-Owned Enterprises
>> - Local Government
>> -- City and District Councils
>> -- Regional Councils and Territorial Authorities
>> Themes
>> I would like to be able to do some (basic?) theming of the site. Things I
>> would like to do:
>> - Custom design of the home page. I prefer the more basic 'web app' style of
>> home pages with less data, cleaner interface.
>> - Ability to pull in an RSS feed on the homepage for news items. I don't
>> think the catalogue has to add blog features but it would be good if we
>> could pull in articles from our main blog to display on the home page.
>> - Naming of items - I don't think the public knows what a package is. So
>> using terms like "dataset" throughout. Also things like "Register a new
>> package" would probably be best worded as "Add a new dataset".
>> Submitting new datasets
>> I like the adding new dataset page here - http://www.ckan.net/package/new.
>> It may be a little bit intimidating for a few people as the form is quite
>> long. Maybe we can look at redesigning the form at some point to add some
>> instructions, look at how we select a license, change some of the wording to
>> be human ("Add row to table" -> "Add another download format"). Also for
>> different types of datasets we will have different metadata and it would be
>> good to have the form adapt to the different formats.
>> Linking to uses
>> One of the key things that people from within government want is concrete
>> examples of where this data is being used. I'm really keen on having the
>> ability for people to add links to sites where the data is being used
>> (commercial and non-profit). The user could add a title, link, description,
>> contact and optional screenshot (we could auto populate the screenshot by
>> pulling from the linked site). The user could link this site to one or more
>> datasets. Building up these case studies makes it a great one stop shop for
>> searching for uses of the information. We could then generate a pretty nice
>> report for each department showing what datasets they have, etc.
>> Getting Access to Datasets
>> One of the key problems I have run into is a) where do we get the data from
>> and b) how much does it cost. A lot of government data is available but not
>> published anywhere online and quite often has a cost associated with it. I'm
>> really keen to get this on the catalogue so people can see where councils
>> are charging. During my latest attempt to get some council data it was going
>> to cost me $40,000/year and when I made an LGOIA request to see how much
>> they make off this dataset it was only $30K/year total.
>> I'm guessing that we can do this using custom fields but on a catalogue by
>> catalogue basis we will want to think about the metadata we want to collect
>> and format the add dataset form accordingly. Again I guess this is some form
>> of config/theming issue.
>> SEO
>> We've been pretty successful at SEO without even really trying
>> (see http://www.google.co.nz/search?client=safari&rls=en&q=auckland+google+transit+feed&ie=UTF-8&oe=UTF-8&redir_esc=&ei=dsYSTOzJLs2eceuZiI8I
>> as an example). This to me is key. If we are to make data available it has
>> to be findable which is the main reason for a catalogue. There are probably
>> things we should be doing on CKAN like using slugged urls
>> (http://www.ckan.net/package/ascoe
>> -> http://www.ckan.net/package/ascoe/atmospheric-chemistry-studies-in-the-oceanic-environment),
>> setting the H1 tag correctly ("Atmospheric Chemistry Studies in the Oceanic
>> Environment" on the example above). Some basic SEO 101 on page
>> optimisations.
>> Harvesting other Catalogues
>> At the moment we don't harvest data from any other catalogues but I do want
>> to start by getting access to the data.govt.nz dataset (they used ours as a
>> base for theirs when they set it up), and using these external catalogues as
>> the canonical versions if that makes sense (augmented by local information
>> that they may not want to share like contact names and pricing).
>> Data Migration
>> As mentioned above the catalogue is in WordPress right now so it will have
>> to be migrated to the CKAN format. The database is WordPress formatted with
>> some custom form plugins so it is readable but hooking up the tables takes a
>> little bit of work trying to work out the right keys to join on. When we get
>> to the migration step I can give people a copy. I don't want to publish this
>> anywhere on the net as it does have email addresses of people in some of the
>> tables. Let me know via a direct email if you want to have a look at it.
>> Thanks,
>> Glen Barnes
>> New Zealand Open Data Catalogue
>> _______________________________________________
>> ckan-discuss mailing list
>> ckan-discuss at lists.okfn.org
>> http://lists.okfn.org/mailman/listinfo/ckan-discuss

Jonathan Gray

Community Coordinator
The Open Knowledge Foundation

