No subject


Fri Nov 6 12:25:30 GMT 2009


Also great opportunity for CKAN to be home to registry of
international development data in medium to long term...

Jonathan

On Wed, Dec 9, 2009 at 1:24 PM, Jonathan Gray <jonathan.gray at okfn.org> wrote:
> David,
>
> Many thanks for your reply. This is exactly what I was thinking. We'll
> probably continue to develop the database externally (on Google Docs
> for now!) but would be great to make sure it can be imported to CKAN.
>
> Also thought it would be a good test of CKAN's support for arbitrary
> metadata. (E.g. development specific stuff.) In longer term there is
> potential that CKAN could be main home for registry of open data on
> international development, and it's interesting to think how much it is
> built to give support to other domain-specific querying functionality.
> E.g. could we have a 'country' plugin built on ISO standards? Also, in
> the longer term will CKAN aim to support information such as what
> fields are available in a given dataset - or are there no plans to
> make it this fine-grained? Basically question is how flexible CKAN is
> going to be - and how much it will be possible to drill down and do
> domain specific querying.
>
> Another question moving forward is do we have 'main' version of our
> database on CKAN and push to another site to publish - or do we have
> main version on another site and pull to CKAN as necessary?
>
> Jonathan
>
> On Wed, Dec 9, 2009 at 12:14 PM, David Read <david.read at okfn.org> wrote:
>> Ben,
>>
>> I'd be interested in what Rufus has to say on this, because this is
>> quite fundamental to where CKAN is going, but here are my thoughts.
>>
>> The idea of the metadata in CKAN is to help users find datasets and to
>> help linking of datasets. I think many of the fields you have here are
>> useful in going into CKAN, and perhaps others are best left associated
>> with the data. Here are some examples, based on your example dataset.
>>
>> I can envision someone finding it because they were looking for some
>> data to do with development finance, or to do with Cambodia, or just
>> recent data, for example.
>>
>> Browsing metadata in CKAN they might see the opportunity to examine
>> the link between ODA events in Cambodia and political events and
>> propose plotting amount of money in ODA against time and include
>> events in Cambodia from Microfacts. So in CKAN it is useful to see the
>> temporal coverage of the ODA data, that the fields contain budget
>> information (rather than just a vague description) and that the data can be
>> got at in XML (not something difficult like PDF) and that the license
>> is compatible with Microfacts.
>>
>> You have fields to do with provenance and a lot of
>> development-specific information, which is great, but not the sort of
>> thing that CKAN is best at indexing. I can see someone wanting to find
>> all datasets which are 'humanitarian aid' but not 'development aid' or
>> 'compliant with DAC standards', which can all be tagged and textually
>> searched in CKAN, but I expect you would want a custom development
>> search for these fields.
>>
>> If my assumptions are reasonable then it suggests to me that you
>> should have your customised IDD database / website, with each record
>> having the key points synced into a CKAN record. Does anyone else want
>> to comment?
>>
>> David
>>
>>
>> 2009/12/9 Ben Harden <b.e.harden.03 at cantabgold.net>:
>>> Hi David, Rufus,
>>>
>>> Thanks for the feedback on the questions Jonathan posted. For reference,
>>> here is a link to the first draft of the IDD fields.
>>>
>>> http://spreadsheets.google.com/ccc?key=0AnHh6dpmBwS7dFVlcFZzWV9yVG8tUURRckVDMVo3Q2c&hl=en
>>>
>>> This'll still need some cleaning up and standardization, but it's a start!
>>> Not sure which is going to be the best option yet (building in CKAN or
>>> making the database consistent with CKAN)- will probably need some more
>>> assistance on this point in the near future...
>>>
>>> Thanks, all the best,
>>>
>>> Ben
>>>
>>> David Read wrote:
>>>>
>>>> Ben,
>>>>
>>>> Pleased to meet you! It sounds excellent to get the IDD onto CKAN.
>>>> Here are some pointers (see below), but feel free to ask more. All the
>>>> best,
>>>>
>>>> David
>>>>
>>>>
>>>>>
>>>>> It would be great if this could be done either via CKAN (and then
>>>>> published on an external website with basic querying functionality),
>>>>> or at least published in a form that meant copies of the profiles
>>>>> could go onto CKAN.
>>>>>
>>>>
>>>> Yes, do link up with our API to get data into and out of CKAN. If
>>>> you've not already, see: http://ckan.net/api/
>>>>
>>>>
>>>>>
>>>>> * Currently we have several fields that it would be good to import
>>>>> into CKAN preserving some structure (rather than just adding to free
>>>>> text field). I understand CKAN can now support arbitrary metadata. Is
>>>>> this the same as the key/value pairs?
>>>>>
>>>>
>>>> Yes, see the 'extras' field in a package. The key and value are free
>>>> text. Structure the value field as you see fit.
>>>>
>>>>
>>>>>
>>>>> If so can we have more than the
>>>>> three that come up on web interface?
>>>>>
>>>>
>>>> You can use as many as you like. In the web interface, when you run
>>>> out of fields, hit 'preview' to get some more.
>>>>
>>>>
>>>>>
>>>>> It would be great to use this
>>>>> functionality for our data profiles. For example, we have fields with
>>>>> dates and associated country names - which are probably fairly generic
>>>>> fields. Some have many country names associated with them. Perhaps we
>>>>> could use appropriate ISO values?
>>>>>
>>>>
>>>> We're currently evaluating using Ordnance Survey ontologies for
>>>> geography in the UK (with the current focus on UK government datasets)
>>>> but it's not clear yet what to use abroad.
>>>>
>>>> Rufus mentioned you might be interested in temporal fields and it
>>>> certainly looks like we'll have some for the government data.
>>>>
>>>>
>>>>>
>>>>> * In some areas changes to our model would be quite simple. E.g. we
>>>>> currently have a 'contact information' field. We could create a
>>>>> separate email address field which could correspond to the 'owner
>>>>> email' field or 'maintainer email' field in CKAN. What is distinction
>>>>> between owner/maintainer here?
>>>>>
>>>>
>>>> Our intention is that 'Author' is the original creator of the data. If
>>>> the Maintainer is different to the Author then supply Maintainer too.
>>>>
>>>>
>>>>>
>>>>> * Regarding license field, we currently have 'OKD compliant' as
>>>>> Yes/No field. Would probably be better to use something to correspond
>>>>> with the drop down menu in CKAN. Would a number be best here? If so,
>>>>> is there a list that we can use to link numbers to items in the menus?
>>>>>
>>>>
>>>> There is an ID for every license (license_id is there but undocumented
>>>> in the API), but I fear it's best not to make a dependency on that. So
>>>> I suggest just using the text name of the license.
>>>>
>>>>
>>>>>
>>>>> * Generally, I wonder whether it would be worth looking to existing
>>>>> standards and guidance for this. Especially where fields may be
>>>>> generic. It would be great to ensure the fields in our profiles comply
>>>>> with standards, where standards exist. I wonder whether this has been
>>>>> thought about in relation to government data? Should we be looking to
>>>>> Dublin Core? Are there other metadata standards we should examine?
>>>>>
>>>>
>>>> A few weeks ago we copied ckan.net packages to an RDF store and you
>>>> can look at the ontologies used here:
>>>>
>>>> http://api.talis.com/stores/ckan/meta?about=http%3A%2F%2Fckan.net%2Fpackage%2Frdf%2F32000-naples-florida-businesses-kml
>>>> This is still experimental, so things could well change soon.
>>>>
>>>
>>>
>>
>
>
>
> --
> Jonathan Gray
>
> Community Coordinator
> The Open Knowledge Foundation
> http://www.okfn.org
>



-- 
Jonathan Gray

Community Coordinator
The Open Knowledge Foundation
http://www.okfn.org
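
The 'extras' approach discussed in the thread (free-text keys and values, with each IDD record's key points synced into a CKAN record) could be sketched roughly as follows. This is only an illustration under stated assumptions: the profile field names (countries, temporal_coverage), the choice of which fields count as core, and the representation of extras as a flat mapping of text keys to text values are all hypothetical and are not taken from the CKAN API documentation or from the IDD spreadsheet. The sketch builds the record only; it does not call the API.

```python
import json

# Hypothetical choice of fields that map directly onto a CKAN package;
# everything else in a profile is treated as an 'extras' key/value pair.
CORE_KEYS = ("name", "title", "license", "tags")


def profile_to_package(profile):
    """Map one IDD profile (a plain dict) onto a CKAN-style package dict."""
    package = {key: profile[key] for key in CORE_KEYS if key in profile}
    extras = {}
    for key, value in profile.items():
        if key in CORE_KEYS:
            continue
        # Extras values are free text, so multi-valued fields such as
        # country codes are flattened into one comma-separated string.
        if isinstance(value, (list, tuple)):
            value = ", ".join(str(item) for item in value)
        extras[key] = str(value)
    package["extras"] = extras
    return package


# Made-up example profile; the values are placeholders, not real IDD data.
profile = {
    "name": "example-oda-cambodia",
    "title": "Example ODA dataset (Cambodia)",
    "license": "Example license name",
    "tags": ["development-aid", "finance"],
    "countries": ["KH", "VN"],
    "temporal_coverage": "2000/2008",
}

package = profile_to_package(profile)
print(json.dumps(package, indent=2, sort_keys=True))
```

Keeping the license as its text name rather than a numeric ID follows David's suggestion above. Using ISO 3166 two-letter codes for countries is just one possible way to structure the value field.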



More information about the okfn-help mailing list