[open-development] Data catalogue from Sunlight Labs - possible platform for intl aid data registry?

Joe Pringle jpringle at forumone.com
Wed Feb 24 21:34:20 UTC 2010


Great stuff Jonathan -

Obviously, I am a little bit out of the loop on what has been happening on
this front :). Either way, on a related topic we also happen to be doing a
totally separate project right now focused on developing a data catalogue,
and as part of that are mapping out a metadata schema.  Based on this, we
could probably contribute to your question below about what metadata a
catalogue of data sets should include, and might have some ideas about the
user interface as well, which will be important as these repositories
scale to include thousands of data sets.

Joe

-----Original Message-----
From: okfn.jonathan.gray at googlemail.com
[mailto:okfn.jonathan.gray at googlemail.com] On Behalf Of Jonathan Gray
Sent: Wednesday, February 24, 2010 10:53 AM
To: Joe Pringle
Cc: open-development; simon at devinit.org; Cormac Nolan; Rufus Pollock
Subject: Re: Data catalogue from Sunlight Labs - possible platform for
intl aid data registry?

We have been speaking with Sunlight about compatibility with CKAN, the
Open Knowledge Foundation's fully open source registry of open data
that is used in data.gov.uk. We are currently working on rolling it out
for data catalogues in several other countries around the world -
including France, Germany, Canada, and Norway.

CKAN has been in active development for more than 4 years and currently
offers an array of features (some quite sophisticated) for finding and
working with open data. These include:

  * Free/Open-Source software, written in Python
  * Domain Model: Data and content "packages" with a standard set of
core metadata and support for adding unlimited arbitrary additional
metadata
    * All package data is automatically versioned in a wiki-like manner
    * Tagging of packages
    * Groups for controlled categorization of packages
    * Ratings
    * Unlimited associated package resources ('download urls') with
additional metadata (format, description etc)
  * Web user interface (WUI)
    * Package adding, editing, listing etc
    * Wiki features such as "Recent Changes", edit histories, purging
of changes etc
    * User management and user home pages
  * API: full JSON-based API (with Python client)
    * RDF version also available
    * CKAN is easy to use as your "catalogue" backend
  * Search: Full searchability (including full-text support) via API and
WUI
  * Access control: fine-grained access control for packages and groups
  * Additional interfaces:
    * Excel importer (upload dataset/package information direct from a
spreadsheet)
    * Fully featured command line client (datapkg)
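
To make the domain model above a bit more concrete, here is a rough
sketch in Python of a "package" with core metadata, tags, groups,
resources and arbitrary extra metadata, serialised to JSON. The class
and field names are illustrative assumptions only; they are not CKAN's
actual schema or API.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Resource:
    # One downloadable resource attached to a package.
    # Field names are hypothetical, chosen to mirror the list above.
    url: str
    format: str = ""
    description: str = ""


@dataclass
class Package:
    # Core metadata (hypothetical field names, not CKAN's real schema).
    name: str
    title: str
    license: str = ""
    tags: list = field(default_factory=list)
    groups: list = field(default_factory=list)
    resources: list = field(default_factory=list)
    # Arbitrary additional metadata as free-form key/value pairs.
    extras: dict = field(default_factory=dict)

    def to_json(self):
        # asdict() recurses into the nested Resource objects.
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Made-up example record, for illustration only.
pkg = Package(
    name="example-aid-flows",
    title="Example aid flows dataset",
    tags=["aid", "example"],
    groups=["international-development"],
    resources=[Resource(url="http://example.org/aid.csv", format="csv")],
    extras={"coverage": "2000-2009"},
)
print(pkg.to_json())
```

Keeping the extras as a plain key/value mapping is one simple way to
allow unlimited additional metadata without changing the core fields.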

Further details are at:

  http://knowledgeforge.net/ckan/trac
  http://ckan.net/

We are currently working on federation between different instances.
There is also Drupal integration to support straightforward re-theming
and front end customisation. We're also working with the Semantic
Web/Linking Open Data community to improve support for semantic web
technologies.

For a recent meeting we held to discuss different catalogue projects
around the world, see:


  http://blog.okfn.org/2010/02/09/interested-in-making-an-open-data-catalogue-virtual-meeting-on-11th-february-2010/

I think our big questions for the development community would be:

  * What metadata does the registry need to include?
  * How can we build a community of people around this? (We're
particularly thinking about part of this in relation to the Aid
Information Challenge.)
  * What front end/user features would people like to see?

We've started an international development group at:

  http://ckan.net/group/international-development

If anyone's interested in contributing to any of this we'd *love* to
hear from you!

All the best,

Jonathan


On Wed, Feb 24, 2010 at 4:01 PM, Joe Pringle <jpringle at forumone.com>
wrote:
> Hi Everyone -
>
> I was in a meeting yesterday with the folks at Sunlight Labs, which is a
> driving force behind the open data movement in the US, and they are
> releasing a National Data Catalogue (http://nationaldatacatalog.com/ -
> note it is still in beta) to provide a more robust alternative to
> Data.gov.  Their goal is to open up the data submission beyond just US
> gov agencies, build communities around data, make it more user friendly,
> AND perhaps most important, make it a platform that third parties and
> developers can more easily build apps on and convert data among
> different formats.
>
> The reason this is interesting is they are sharing the source code, and
> are interested in other groups leveraging it for similar efforts.  It is
> built on the Ruby on Rails platform, and presumably we could modify it
> to support an international registry of international development
> datasets (e.g. change the metadata schema, interface, etc).
>
> I know there might already be efforts on this underway, but this is a
> $500k platform from one of the most capable open source development
> shops I am aware of working on this issue.
>
> Would there be any interest in a conversation with these guys about
> if/how we might be able to build on this?
>
> If so let me know and I can help facilitate.
>
> Joe
>
>
> --------------------------------------
> Joe Pringle
> Forum One Communications
> 703-894-4330
> www.forumone.com
>



--
Jonathan Gray

Community Coordinator
The Open Knowledge Foundation
http://blog.okfn.org

http://twitter.com/jwyg
http://identi.ca/jwyg



