[okfn-discuss] Fw: [Geodata] discoverability and the wiki

Rufus Pollock rufus.pollock at okfn.org
Sat Oct 6 09:17:23 UTC 2007


Jo Walsh wrote:
> dear all,

This is really interesting Jo (and Aaron). At this point I feel honour 
bound to mention CKAN :) given that it was expressly designed for 
holding basic metadata for open knowledge/data projects and packages:

   http://www.ckan.net/

> i enjoyed this walkthrough of making a semi-structured metadata registry
> with semantic mediawiki, this one in the context of a distributed geodata 
> repository. Thanks for writing this, Aaron. I am never sure about the
> amount of cognitive load such a detailed syntax would impose on
> potential contributors. But if one is committed, it is better than

This was precisely one of the reasons for going for a more web-app type 
approach on CKAN (that plus the desire to do 'full' versioning of data 
and the fact we got started before things like SMW were available ...).

One of the things we want to do asap with CKAN is add support for 
plugins that will allow people to add extra metadata in specific subject 
areas (of course one could also write a simple extension to allow 
'arbitrary' metadata to be added but often having some constraints is 
useful -- if someone is entering data related to shakespeare they don't 
necessarily want to be asked about long/lat extents).
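
To make the idea concrete, here is a rough sketch of per-subject
constraints on extra metadata. It is only an illustration: the names
(MetadataPlugin, PLUGINS, add_extra_metadata) and the field lists are
made up for this example and are not CKAN's actual plugin interface.

```python
from dataclasses import dataclass


@dataclass
class MetadataPlugin:
    """Hypothetical plugin: the extra fields one subject area accepts."""
    subject: str
    allowed_fields: set

    def validate(self, extra):
        # Reject any field this subject area does not know about.
        unknown = set(extra) - self.allowed_fields
        if unknown:
            raise ValueError(
                f"{self.subject}: unexpected fields {sorted(unknown)}"
            )
        return dict(extra)


# Assumed registry of subject areas; the field names are illustrative only.
PLUGINS = {
    "geodata": MetadataPlugin(
        "geodata", {"min_long", "max_long", "min_lat", "max_lat"}
    ),
    "literature": MetadataPlugin("literature", {"author", "first_published"}),
}


def add_extra_metadata(package, subject, extra):
    """Attach validated subject-specific metadata to a package dict."""
    plugin = PLUGINS[subject]
    package.setdefault("extras", {}).update(plugin.validate(extra))
    return package
```

With something like this, a Shakespeare package registered under
"literature" would never be asked for (or allowed to carry) long/lat
extents, while a "geodata" package would.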

> just having notes on a wiki. It seems also to be an inverted version
> of the old public domain works wiki, initially generated by a dump
> from a structured source. 

Jo (in particular), I note that you've already got a list of data 
sources and catalogues on:

http://wiki.osgeo.org/index.php/Geodata_Discovery_Working_Group

specifically at:

<http://wiki.osgeo.org/index.php/Geodata_Discovery_Working_Group#Existing_.28Meta.29_Search_Projects_and_Related_Efforts>

Would it be possible to add CKAN to the list of data catalogues there? 
In addition I wonder if you could be persuaded :) to add a few of these 
items to CKAN. We already have some geo-related material:

http://www.ckan.net/tag/search?search_terms=geo
http://www.ckan.net/tag/read/geodata

but it would be good to get more. More comments on Aaron's excellent 
efforts inlined below.

~rufus

PS: since I'm not on the geo at lists.osgeo.org list, would you mind 
forwarding this there? I'd be very interested to get further responses 
in this thread ...

> ----- Forwarded message from Aaron Straup Cope <asc at spum.org> -----
> 
> Date: Fri, 05 Oct 2007 06:29:36 -0700
> From: Aaron Straup Cope <asc at spum.org>
> To: geodata at lists.osgeo.org
> 
> Hellos,
> 
> I recently attended FOSS4G, in Victoria, and stopped in during the open 
> geodata BOF.
> 
> One of the issues people raised was how to organize and find all of the 
> possible data that may be housed on osgeo servers.
> 
> Since there is already a working instance of Mediawiki I wondered aloud 
> whether something like the Semantic MediaWiki (SMW) extensions would be 
> useful.
> 
> 	http://meta.wikimedia.org/wiki/Semantic_MediaWiki
> 
> Let me pause briefly to just say : 1) I don't really like wikis either 
> and 2) I am not going to rain on everyone's parade with pedantic semweb 
> hocus pocus. No, really.
> 
> But.
> 
> The SMW stuff does make it pretty easy to add just that little bit of 
> extra data so that you aren't living and dying by full-text search 
> alone and MW templates, once you suffer the initial setup, make it 
> possible to mostly hide all of the hard stuff.
> 
> Both are still fraught with their own ongoing issues but they save 
> people from having to write something from scratch and it's a reasonable 
> 80/20 solution to the problem of making it easy enough to bother entering 
> data but detailed enough to make it worth getting it back out again.
> 
> Maybe.
> 
> Eventually someone said : It sounds like you're volunteering. At which 
> point it became bad form not to at least put together a proof of concept.

That's the way to go :)

> So here it is, with details (and bugs) below : http://proj.spum.org/

Fast work!

> (Also : I am not wed to any of this and I offer it up only as a 
> suggestion. This is all stuff that I am interested in beyond any needs 
> to index and discover open geodata so I'm not going to take my toys and 
> leave if people decide it doesn't fit their needs.)

I do think there is a distinction here between creating an (inevitably 
limited but perhaps higher quality) registry and discovery. For example 
you have Freshmeat to list open source projects but not necessarily all 
projects are on there. At present I still think we need to have some 
kind of registry because random discovery (using RDF tags in pages or 
some kind of microformats) out in the wild is just too, well 'random' 
and the metadata quality is too low -- that's why we're developing CKAN. 
In the long run this may change (just doing CKAN I'm already amazed at 
the amount of material ...)

> ---
> 

[snip]

> ---
> 
> Here's a "complicated" example :
> 
> # http://www.proj.spum.org/index.php?title=SomeProject
> 
> {{Project|Bob Exampolopolis|Mr. Nubby}}
> 
> == Description ==
> 
> This is a fuzzy project!
> 
> {{Tags|fuzzy|dice|muffins}}
> 
> == Meta ==
> 
> {{meta|dc|coverage|foo}}
> 
> # http://www.proj.spum.org/index.php?title=SomeProject_0.9
> {{ProjectRelease|2007-09-01|cc-by-3.0}}

This is remarkably similar to the basic metadata of CKAN -- Great minds 
think alike ;-)

> ---
> 
> In the example above tags actually get added as "dc subject" properties 
> (as well as categories) with all the work being hidden in the Tags template.
> 
> The Meta template is just a more general way to add domain specific 
> data. Prefixes, like dc, can be registered in SMW such that they are 
> recognized and expanded to proper URLs.
> 
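
For anyone who hasn't met prefix expansion before, the idea can be
sketched in a few lines of Python. This is not SMW's code or
configuration; PREFIXES, expand and tags_to_properties are names invented
for the illustration. The only real-world fact assumed is the Dublin Core
elements namespace URL.

```python
# Registered prefixes mapped to namespace URLs (Dublin Core elements 1.1).
PREFIXES = {"dc": "http://purl.org/dc/elements/1.1/"}


def expand(prefix, name):
    """Expand a prefixed property such as dc/coverage to a full URL."""
    if prefix not in PREFIXES:
        raise KeyError(f"unregistered prefix: {prefix!r}")
    return PREFIXES[prefix] + name


def tags_to_properties(*tags):
    """Mimic what the Tags template is described as doing:
    each tag becomes a (dc subject URL, tag) pair."""
    subject = expand("dc", "subject")
    return [(subject, tag) for tag in tags]
```

So {{meta|dc|coverage|foo}} corresponds roughly to
(expand("dc", "coverage"), "foo"), and {{Tags|fuzzy|dice|muffins}} to
tags_to_properties("fuzzy", "dice", "muffins").
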
> ---
> 
> Out of the box, SMW will let you search by properties. For example :

[snip]

> And, yes, the {{for|call}} stuff (well, actually, all of it) is a little 
> like stabbing yourself in the eyes. That's why you hide it all in templates.
>
> The <ask> stuff works great where it works. And not so much where it 
> doesn't. For example :

[snip]

> Have a poke around. If you're feeling brave follow some of the templates 
> but you may want to cry. If you're interested in playing with a related 
> project there is also :
> 
> 	http://grape.spum.org/
> 
> This one has a More Better (tm) search interface/API but only because I 
> started to abuse the actual SMW source code. They have since updated 
> things and I can't face whatever changes I'll need to make as a result...
> 
> 	http://grape.spum.org/pages/HowToSearch

This was one of the reasons we stuck with the pure webapp approach 
(Python + Pylons) for CKAN -- by this point MW (or even SMW) were really 
being used in a webapp-type way. Given that they weren't really 
designed for this, our worry was that while you could go pretty fast at 
the start you were likely to suddenly hit a serious inflection point 
later on.



