[ckan-dev] Geonode CWI integration

Adrià Mercader adria.mercader at okfn.org
Wed Mar 12 10:46:27 UTC 2014


Hi Reinier,

I'm going to go with Philippe and assume you are talking about CSW services :)

The issues regarding format detection for remote resources came up
right from the start of working with ISO-based CSW harvesting and have
been a pain ever since. For all its zillion fields and lengthy
standard, ISO does not make it easy to infer the format of a resource
from a harvested record.

For CKAN to handle the previews correctly we need to assign the
correct format to each resource, e.g. png to PNG images, geojson to
GeoJSON files, wms to WMS endpoints, etc. Now, let's take the example
you point to. Each of the online resources looks like this:


<gmd:onLine>
<gmd:CI_OnlineResource>
<gmd:linkage>
<gmd:URL>
http://maps.nemaug.org/geoserver/wms?layers=geonode%3Awildlifesanctuaries&width=469&bbox=29.84592360669541%2C-0.14586631603160494%2C33.23719845518447%2C3.8276767112473573&service=WMS&format=application%2Fpdf&srs=EPSG%3A4326&request=GetMap&height=550
</gmd:URL>
</gmd:linkage>
<gmd:protocol>
<gco:CharacterString>WWW:DOWNLOAD-1.0-http--download</gco:CharacterString>
</gmd:protocol>
<gmd:name>
<gco:CharacterString>wildlifesanctuaries.pdf</gco:CharacterString>
</gmd:name>
<gmd:description>
<gco:CharacterString>Wildlife sanctuaries (PDF Format)</gco:CharacterString>
</gmd:description>
</gmd:CI_OnlineResource>
</gmd:onLine>

Name and description contain pdf, and there's also the format
parameter in the WMS url, but these are only there because we are
harvesting from a GeoNode instance.
The most reliable way we found of working out what the online
resources actually point at was guessing from the url and file
extension, looking for common patterns [1], which is a bit limited as
your case shows (resources are flagged as wms or wfs rather than with
the actual output format).
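
As a rough illustration of that kind of pattern-based guessing (this is
not the code in [1]; the pattern table, the extension list and the
function name are all made up for this sketch):

```python
import re
from urllib.parse import urlparse

# Hypothetical pattern table, for illustration only; the real list lives in [1].
URL_PATTERNS = [
    ("wms", re.compile(r"service=wms|/wms(\?|$)")),
    ("wfs", re.compile(r"service=wfs|/wfs(\?|$)")),
]

# Hypothetical file extension table, also for illustration only.
EXTENSIONS = {".png": "png", ".pdf": "pdf", ".geojson": "geojson", ".kml": "kml"}


def guess_format_from_url(url):
    """Return a format string guessed from the URL alone, or None."""
    lowered = url.lower()
    # Service endpoints are checked first, so a WMS GetMap request is
    # flagged as "wms" whatever output it actually returns.
    for fmt, pattern in URL_PATTERNS:
        if pattern.search(lowered):
            return fmt
    # Otherwise fall back to the file extension of the URL path.
    path = urlparse(lowered).path
    for ext, fmt in EXTENSIONS.items():
        if path.endswith(ext):
            return fmt
    return None
```

With this approach the GeoNode link above comes out as wms, even though
the request asks for a PDF, which is the limitation described above.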

I'm not sure of the best way to move forward with this, and to what
extent the excellent work started by Tom and others around catalog
interop [2] aims to address this (I need to catch up on that).

@Tom could the applicationProfile of CI_OnlineResource be used for
defining the expected resource format?

We could add some more magic to the format-guessing logic so that the
name or the format param are taken into account in GeoNode's case.
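
A minimal sketch of what that extra magic could look like, assuming the
output format is given as a MIME type in a `format` (or `outputFormat`)
query parameter and that the resource name may carry a file extension.
The function name and both lookup tables are hypothetical, not part of
ckanext-spatial:

```python
import os
from urllib.parse import urlparse, parse_qs

# Hypothetical MIME type to CKAN format mapping, for illustration only.
MIME_TO_FORMAT = {
    "application/pdf": "pdf",
    "image/png": "png",
    "image/jpeg": "jpeg",
}

# Hypothetical set of extensions we are willing to trust in a resource name.
KNOWN_EXTENSIONS = {"pdf", "png", "jpeg", "kml", "geojson"}


def guess_output_format(url, name=None):
    """Guess the actual output format from the query string, then the name."""
    query = parse_qs(urlparse(url).query)
    # Query parameter names are case-insensitive in OGC services.
    params = {key.lower(): values[0] for key, values in query.items()}
    mime = params.get("format") or params.get("outputformat")
    if mime and mime.lower() in MIME_TO_FORMAT:
        return MIME_TO_FORMAT[mime.lower()]
    # Fall back to the extension of the resource name, if it is a known one.
    if name:
        ext = os.path.splitext(name)[1].lstrip(".").lower()
        if ext in KNOWN_EXTENSIONS:
            return ext
    return None
```

For the online resource above this would give pdf, either from
format=application%2Fpdf in the url or from the name
wildlifesanctuaries.pdf.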


@Reinier note that you can customize the harvested datasets that will
be created in CKAN, tweaking the dict that will be sent to the
create/update functions to manually set formats, remove unwanted ones,
etc. [3]
Right now you need to extend the base CSW harvester, but I'm working
on a couple of extension points that should make this easier.
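
To give an idea of the kind of tweaking meant here, a sketch of a
helper that rewrites the resources of a dataset dict before it is
sent to create/update. It only assumes the usual CKAN layout (a
"resources" list whose items have "name" and "format" keys); the
helper name and its arguments are made up, and how you call it from
your harvester subclass is covered in [3], not shown here:

```python
def fix_resource_formats(package_dict, overrides, drop_formats=()):
    """Return a copy of package_dict with tweaked resources.

    overrides maps a resource name suffix (e.g. ".pdf") to the format
    to set; drop_formats lists formats whose resources are removed.
    Both arguments are hypothetical, for illustration only.
    """
    resources = []
    for resource in package_dict.get("resources", []):
        resource = dict(resource)  # do not modify the caller's dict
        name = (resource.get("name") or "").lower()
        # Manually set the format when the name ends with a known suffix.
        for suffix, fmt in overrides.items():
            if name.endswith(suffix):
                resource["format"] = fmt
                break
        # Remove unwanted resources based on their (possibly new) format.
        if (resource.get("format") or "").lower() in drop_formats:
            continue
        resources.append(resource)
    result = dict(package_dict)
    result["resources"] = resources
    return result
```

For example, fix_resource_formats(package_dict, {".pdf": "pdf"},
drop_formats=("wfs",)) would mark wildlifesanctuaries.pdf as pdf and
drop any resource still flagged as wfs.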

Happy to continue the discussion on whatever list/channel

Adrià

[1] https://github.com/ckan/ckanext-spatial/blob/master/ckanext/spatial/harvesters/base.py#L58
[2] https://github.com/OSGeo/Cat-Interop
[3] http://ckanext-spatial.readthedocs.org/en/latest/harvesters.html#customizing-the-harvesters

On 12 March 2014 09:55, Philippe Duchesne <pduchesne at gmail.com> wrote:
> Hello Reinier,
>
> what do you call CWI ? do you mean CSW ?
>
> --p.
>
>
> On Wed, Mar 12, 2014 at 9:37 AM, Reinier Battenberg
> <reinier.battenberg at mountbatten.net> wrote:
>>
>> Hi,
>>
>> In our setup ( www.data.ug ) we run geonode for our geospatial data, and
>> CKAN for everything + our geospatial data. With a CMS of choice on top of
>> that, this makes a pretty nice Open Data architecture.
>>
>> We are already harvesting 2 geonodes (one is not our own) and solving the
>> issues that come with that.
>>
>> One issue is that the CWI that geonode produces results in pretty useless
>> resources in CKAN. eg. http://catalog.data.ug/dataset/wildlife-sanctuaries
>> Note that none of the previews work.
>>
>> CWI seems to be a flexible standard that can produce descriptions of
>> datasets differently, so we are having a discussion on the geonode
>> mailinglist as to how to change the CWI that geonode produces.
>>
>> It would be great if Adria and others who are interested could join the
>> discussion, so the outcome of the changes to geonode would be an easier
>> and slicker integration between these 2 great tools.
>>
>> The issue is here:
>> https://groups.google.com/forum/#!topic/geonode-users/ueNTQ0DWl9Y
>>
>>
>> --
>> rgds,
>>
>> Reinier Battenberg
>> Director
>> Mountbatten Ltd.
>> www.mountbatten.net
>> tel: +256 758 801749
>> twitter: @batje
>> _______________________________________________
>> ckan-dev mailing list
>> ckan-dev at lists.okfn.org
>> https://lists.okfn.org/mailman/listinfo/ckan-dev
>> Unsubscribe: https://lists.okfn.org/mailman/options/ckan-dev
>
>
>


