[okfn-help] International development data on CKAN

David Read david.read at okfn.org
Wed Dec 9 12:14:34 UTC 2009


Ben,

I'd be interested in what Rufus has to say on this, because this is
quite fundamental to where CKAN is going, but here are my thoughts.

The idea of the metadata in CKAN is to help users find datasets and to
help linking of datasets. I think many of the fields you have here are
useful in going into CKAN, and perhaps others are best left associated
with the data. Here are some examples, based on your example dataset.

I can envision someone finding it because they were looking for some
data to do with development finance, or to do with Cambodia, or just
recent data, for example.

Browsing metadata in CKAN they might see the opportunity to examine
the link between ODA events in Cambodia and political events and
propose plotting amount of money in ODA against time and include
events in Cambodia from Microfacts. So in CKAN it is useful to see the
temporal coverage of the ODA data, that the fields contain budget
information (rather than just a vague description) and that it can be
got at in XML (not something difficult like PDF) and that the license
is compatible with Microfacts.

You have fields to do with provenance and a lot of
development-specific information, which is great, but not the sort of
thing that CKAN is best at indexing. I can see someone wanting to find
all datasets which are 'humanitarian aid' but not 'development aid' or
'compliant with DAC standards', which can all be tagged and textually
searched in CKAN, but I expect you would want a custom development
search for these fields.
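
To make the tag case concrete, here is a small Python sketch of
'humanitarian aid but not development aid' as a filter over package
records. The records and tag spellings are invented for illustration,
and this filters a local list rather than calling CKAN's own search.

```python
def with_tags(packages, required, excluded=()):
    """Return names of packages that have every required tag and no excluded tag."""
    required, excluded = set(required), set(excluded)
    return [
        pkg["name"]
        for pkg in packages
        if required <= set(pkg["tags"]) and not excluded & set(pkg["tags"])
    ]


# Made-up records, each with a name and a list of tags.
packages = [
    {"name": "example-relief-flows", "tags": ["humanitarian-aid", "dac-compliant"]},
    {"name": "example-mixed-aid", "tags": ["humanitarian-aid", "development-aid"]},
    {"name": "example-oda", "tags": ["development-aid"]},
]

matches = with_tags(packages, ["humanitarian-aid"], excluded=["development-aid"])
print(matches)
```

Anything finer-grained than this sort of tag logic is where I'd expect
your own development-specific search to take over.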

If my assumptions are reasonable then it suggests to me that you
should have your customised IDD database / website, with each record
having the key points synced into a CKAN record. Does anyone else want
to comment?
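
A minimal sketch of that split, in Python. The IDD field names here
(`title`, `countries`, `coverage_start` and so on) and the example
values are made up for illustration and are not taken from your
spreadsheet. The CKAN side uses the package fields mentioned in this
thread (tags, license as a text name, author, maintainer and the
free-text 'extras'), but the exact field names are an assumption and
should be checked against the API docs.

```python
import json
import re

# Hypothetical IDD fields that are worth copying into CKAN 'extras'.
# Everything else stays in the IDD database.
SYNCED_EXTRAS = ("coverage_start", "coverage_end", "format")


def slugify(title):
    """Turn a title into a lowercase, hyphenated package name."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def idd_to_ckan(record):
    """Build a CKAN-style package dict from one (hypothetical) IDD record."""
    extras = {key: str(record[key]) for key in SYNCED_EXTRAS if key in record}
    # Extras values are free text, so a list of countries is stored as a
    # JSON string of ISO 3166-1 alpha-2 codes (an assumed convention).
    if "countries" in record:
        extras["countries"] = json.dumps(sorted(record["countries"]))
    return {
        "name": slugify(record["title"]),
        "title": record["title"],
        "author": record.get("author", ""),
        "maintainer": record.get("maintainer", ""),
        "license": record.get("license", ""),  # text name, not license_id
        "tags": sorted(record.get("tags", [])),
        "extras": extras,
    }


# Made-up example record; none of these values come from real data.
example = {
    "title": "Example Cambodia ODA Dataset",
    "countries": ["KH"],
    "coverage_start": "2000",
    "coverage_end": "2008",
    "format": "XML",
    "license": "Example Open License",
    "tags": ["oda", "development-finance"],
    "dac_compliant": True,  # development-specific: not synced
}
package = idd_to_ckan(example)
print(json.dumps(package, indent=2))
```

The point of the whitelist is that only the fields that help someone
find or link the dataset reach CKAN; the development-specific ones
never leave the IDD side.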

David


2009/12/9 Ben Harden <b.e.harden.03 at cantabgold.net>:
> Hi David, Rufus,
>
> Thanks for the feedback on the questions Jonathan posted. For reference,
> here is a link to the first draft of the IDD fields.
>
> http://spreadsheets.google.com/ccc?key=0AnHh6dpmBwS7dFVlcFZzWV9yVG8tUURRckVDMVo3Q2c&hl=en
>
> This'll still need some cleaning up and standardization, but it's a start!
> Not sure which is going to be the best option yet (building in CKAN or
> making the database consistent with CKAN) - will probably need some more
> assistance on this point in the near future...
>
> Thanks, all the best,
>
> Ben
>
> David Read wrote:
>>
>> Ben,
>>
>> Pleased to meet you! It sounds excellent to get the IDD onto CKAN.
>> Here are some pointers (see below), but feel free to ask more. All the
>> best,
>>
>> David
>>
>>
>>>
>>> It would be great if this could be done either via CKAN (and then
>>> published on an external website with basic querying functionality),
>>> or at least published in a form that meant copies of the profiles
>>> could go onto CKAN.
>>>
>>
>> Yes, do link up with our API to get data into and out of CKAN. If
>> you've not already, see: http://ckan.net/api/
>>
>>
>>>
>>>  * Currently we have several fields that it would be good to import
>>> into CKAN preserving some structure (rather than just adding to free
>>> text field). I understand CKAN can now support arbitrary metadata. Is
>>> this the same as the key/value pairs?
>>>
>>
>> Yes, see the 'extras' field in a package. The key and value are free
>> text. Structure the value field as you see fit.
>>
>>
>>>
>>> If so can we have more than the
>>> three that come up on web interface?
>>>
>>
>> You can use as many as you like. In the web interface, when you run
>> out of fields, hit 'preview' to get some more.
>>
>>
>>>
>>> It would be great to use this
>>> functionality for our data profiles. For example, we have fields with
>>> dates and associated country names - which are probably fairly generic
>>> fields. Some have many country names associated with them. Perhaps we
>>> could use appropriate ISO values?
>>>
>>
>> We're currently evaluating using Ordnance Survey ontologies for
>> geography in the UK (with the current focus on UK government datasets)
>> but it's not clear yet what to use abroad.
>>
>> Rufus mentioned you might be interested in temporal fields and it
>> certainly looks like we'll have some for the government data.
>>
>>
>>>
>>>  * In some areas changes to our model would be quite simple. E.g. we
>>> currently have a 'contact information' field. We could create a
>>> separate email address field which could correspond to the 'owner
>>> email' field or 'maintainer email' field in CKAN. What is the distinction
>>> between owner/maintainer here?
>>>
>>
>> Our intention is that 'Author' is the original creator of the data. If
>> the Maintainer is different to the Author then supply Maintainer too.
>>
>>
>>>
>>>  * Regarding the license field, we currently have 'OKD compliant' as a
>>> Yes/No field. It would probably be better to use something that corresponds
>>> to the drop-down menu in CKAN. Would a number be best here? If so,
>>> is there a list that we can use to link numbers to items in the menus?
>>>
>>
>> There is an ID for every license (license_id is there but undocumented
>> in the API), but I fear it's best not to make a dependency on that. So
>> I suggest just using the text name of the license.
>>
>>
>>>
>>>  * Generally, I wonder whether it would be worth looking to existing
>>> standards and guidance for this. Especially where fields may be
>>> generic. It would be great to ensure the fields in our profiles comply
>>> with standards, where standards exist. I wonder whether this has been
>>> thought about in relation to government data? Should we be looking to
>>> Dublin Core? Are there other metadata standards we should examine?
>>>
>>
>> A few weeks ago we copied ckan.net packages to an RDF store and you
>> can look at the ontologies used here:
>>
>> http://api.talis.com/stores/ckan/meta?about=http%3A%2F%2Fckan.net%2Fpackage%2Frdf%2F32000-naples-florida-businesses-kml
>> This is still experimental, so things could well change soon.
>>
>
>



