[ckan-dev] CSW Harvesting from GeoPortal and GeoNetwork

Tom Kralidis tomkralidis at hotmail.com
Wed Nov 7 13:13:34 UTC 2012



> Date: Wed, 7 Nov 2012 10:24:02 +0000
> Subject: Re: [ckan-dev] CSW Harvesting from GeoPortal and GeoNetwork
> From: david.read at hackneyworkshop.com
> To: ckan-dev at lists.okfn.org
> CC: tomkralidis at hotmail.com
> 
> Ryan,
> 
> I see from the code that "Error gathering the identifiers from the CSW
> server" is a problem with calling OWSLib's "getidentifiers" method,
> which is the first time we use this library to call the CSW server. So
> trying different versions of OWSLib as Konrad mentions may help. The
> full exception gets written to the CKAN log, so you could let us know
> what that says.
> 
> The version of OWSLib that ckanext-spatial tries to use is not the
> original SVN repo, but an OKF branch here:
> https://github.com/okfn/owslib It appears to be based on 0.3 with a
> few tweaks for the servers we have been using. I see there are various
> tweaks in the code already to cope with some US gov servers, which
> suggests that the CSW world is not in complete agreement about the
> meaning of the specs...
> 

OWSLib is now on GitHub (http://github.com/geopython/OWSLib). I would recommend OWSLib 0.5.1, which is what pip currently installs by default. Having said this, we are looking to cut 0.5.2 in the coming weeks.
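
To confirm which OWSLib release an environment actually picks up, something like the following could be run inside the virtualenv. This is only a minimal sketch: it assumes Python 3.8 or later (for importlib.metadata), which is newer than the Python 2.6 setup described in this thread, and the helper name installed_version is made up for illustration.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None.

    Hypothetical helper for illustration; not part of OWSLib or CKAN.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


if __name__ == "__main__":
    print("OWSLib:", installed_version("OWSLib") or "not installed")
```

On older interpreters, the `pip freeze | grep -i owslib` command quoted further down gives the same information.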

> I'm interested to hear that 0.4 and 0.5 are out now, so would be good
> to try these. I suggest you try playing with OWSLib and your CSW
> servers on the command line, in a similar way to the README does for
> WMS servers:
> https://github.com/okfn/owslib/blob/master/README.txt
> 

Testing each of those CSWs with OWSLib 0.5.1 directly in a virtualenv resulted in no errors; however, I am only making rudimentary requests.

> The CSW part of OWSLib was done by Tom Kralidis, who I've copied on
> this, in case he can shed some light on the CSW versioning issue. I
> think that it calls the CSW server in version 2.0.2, but this should
> be compatible with 2.0.0 and 2.0.1. There appears to be a way to
> change the version if necessary. Guidance from Tom on this would be
> most helpful for all of us.
> 

CSW 2.0.2 (despite the .z bumps) is very different from 2.0.1 and 2.0.0.  OWSLib's CSW support is for 2.0.2 only.
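
Since the client side only speaks 2.0.2, it may help to check what version a server advertises before pointing the harvester at it. The following is a rough sketch using only the standard library (Python 3), not OWSLib: it builds a GetCapabilities URL and reads the version attribute from the root element of the returned Capabilities document. The function names and the example.org URL are hypothetical, and fetching the document is left out so the snippet stays self-contained.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# The only CSW version OWSLib's CSW client supports, per this thread.
SUPPORTED_CSW_VERSION = "2.0.2"


def getcapabilities_url(base_url):
    """Build a KVP GetCapabilities request URL for a CSW endpoint."""
    params = {"service": "CSW", "request": "GetCapabilities"}
    separator = "&" if "?" in base_url else "?"
    return base_url + separator + urlencode(params)


def advertised_version(capabilities_xml):
    """Return the 'version' attribute of the Capabilities root element."""
    root = ET.fromstring(capabilities_xml)
    return root.get("version")


def is_supported(capabilities_xml):
    """True if the server advertises the version the client supports."""
    return advertised_version(capabilities_xml) == SUPPORTED_CSW_VERSION


if __name__ == "__main__":
    # Hypothetical endpoint and a hand-written, trimmed-down response.
    print(getcapabilities_url("http://example.org/csw"))
    sample = (
        '<csw:Capabilities xmlns:csw="http://www.opengis.net/cat/csw/2.0.2" '
        'version="2.0.2"/>'
    )
    print(advertised_version(sample), is_supported(sample))
```

A server that reports 2.0.1 or 2.0.0 here would be expected to fail with the current client, which seems consistent with the "Error contacting the CSW server: '2.0.2'" message reported below.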

I'm not familiar with the CKAN code on top of OWSLib. Can anyone point me to the code, or provide a trace showing the context of the issue? I'm willing to hunt down the cause of the issue here.


> David
> 
> On 7 November 2012 02:04, Konrad Reiche <konrad.reiche at gmail.com> wrote:
> > Hi Ryan,
> >
> > I had problems with the Error gathering the identifiers from the CSW server
> > ['NoneType' object has no attribute 'find']  as well. Here is what worked
> > for me:
> >
> > Check what version you are using with
> >
> > pip freeze | grep -i owslib
> >
> > For me OWSLib 0.4.0 works, so I suggest you uninstall your current OWSLib
> > installation with
> >
> > pip uninstall owslib
> >
> > and install the 0.4.0 version from the GitHub repository:
> >
> > pip install -e git+https://github.com/geopython/OWSLib.git@0.4.0#egg=OWSLib
> >
> > When I tried the latest version, 0.5-dev, the error stayed the same. I am
> > using CKAN 1.8 and Harvest + Spatial [latest] as well.
> >
> > Best,
> > Konrad
> >
> > On 07.11.2012 2:40, Ryan Hodges wrote:
> >
> > Hi all,
> >
> >
> >
> > I am trying to harvest spatial metadata using:
> >
> > Ckan [1.8]
> >
> > Ckanext-harvest [master]
> >
> > Ckanext-spatial [harvest-generic-iso]
> >
> >
> >
> > Python version 2.6
> >
> >
> >
> > Using the plugin:
> >
> > gemini_csw_harvester
> >
> >
> >
> > And in my config file:
> >
> > ckan.spatial.validator.profiles = iso19139
> >
> >
> >
> > With every site I try to harvest from, there seems to be another issue
> > preventing me from succeeding. I currently am not familiar enough with the
> > limitations of CSW harvesting to determine which of these are a shortcoming
> > of the source, which are a current limitation of CKAN harvesting, and which
> > are my own fault:
> >
> >
> >
> > When I harvest from a GeoNetwork site:
> >
> > ---------------------------------------------------------------
> >
> > url: http://apps.who.int/geonetwork/srv/csw
> >
> > status: Gathering errors
> >
> > -          Error contacting the CSW server: '2.0.2'
> >
> > Harvested: 0
> >
> > WHY: Server is at version 2.0.1. Is harvesting from this version not
> > available?
> >
> > ---------------------------------------------------------------
> >
> > url: http://www.fao.org/geonetwork/srv/en/csw
> >
> > status: Object errors
> >
> > -          GUID 6fed4955-c0f4-49e6-aaf2-9475504dc6bc
> >
> > -          Validating against "ISO19139 XSD Schema" profile failed:
> >
> > -          Dataset schema (gmx.xsd) Validation Error: (u"Element
> > '{http://www.isotc211.org/2005/gmd}MD_SatelliteSpatialRepresentation': This
> > element is not expected. Expected is one of (
> > {http://www.isotc211.org/2005/gmd}AbstractMD_SpatialRepresentation,
> > {http://www.isotc211.org/2005/gmd}MD_GridSpatialRepresentation,
> > {http://www.isotc211.org/2005/gmd}MD_VectorSpatialRepresentation,
> > {http://www.isotc211.org/2005/gmd}MD_Georeferenceable,
> > {http://www.isotc211.org/2005/gmd}MD_Georectified )., line 74",)
> >
> > Harvested: 10
> >
> > Why: As the error says, it doesn’t recognize
> > MD_SatelliteSpatialRepresentation. Another error included a bad date-time.
> >
> > NOTE: This didn’t work at first: the directory for ISO19139 validation did
> > not exist in the python 2.6 egg:
> >
> > ‘site-packages/ckanext_spatial-0.2-py2.6.egg/ckanext/spatial/validation/xml’
> > <- Does not exist
> >
> > I had to copy it in manually.
> >
> > ---------------------------------------------------------------
> >
> >
> >
> > When I harvest from a geoportal site:
> >
> > ---------------------------------------------------------------
> >
> > url: http://gptogc.esri.com/geoportal/csw
> >
> > status: Gathering errors
> >
> > -          Error gathering the identifiers from the CSW server ['NoneType'
> > object has no attribute 'find']
> >
> > Harvested: 0
> >
> > Why: ???
> >
> > ---------------------------------------------------------------
> >
> > url: http://geo.data.gov/geoportal/csw
> >
> > status: Gathering errors
> >
> > -          Error gathering the identifiers from the CSW server ['\nAn
> > exception occurred with no applicable code\n']
> >
> > Harvested: 0
> >
> > Why:???
> >
> > ---------------------------------------------------------------
> >
> >
> >
> > If anyone knows why these might be failing, what I could do to fix them, or
> > what might be fixed soon in the harvester to alleviate this, please respond.
> >
> >
> >
> > Thanks,
> >
> > Ryan Hodges | Applications Developer | Ecotrust
> >
> > 721 NW 9th Avenue, Suite 200 • Portland, OR 97209
> >
> > T (503) 467.0800 | F (503) 222.1517 | www.ecotrust.org
> >
> >
> >
> >
> >
> > _______________________________________________
> > ckan-dev mailing list
> > ckan-dev at lists.okfn.org
> > http://lists.okfn.org/mailman/listinfo/ckan-dev
> > Unsubscribe: http://lists.okfn.org/mailman/options/ckan-dev
> >
> >
> >