[ckan-dev] CSW Harvesting from GeoPortal and GeoNetwork

Tom Kralidis tomkralidis at hotmail.com
Thu Nov 8 02:49:58 UTC 2012



> Date: Wed, 7 Nov 2012 15:46:39 +0000
> From: adria.mercader at okfn.org
> To: ckan-dev at lists.okfn.org
> Subject: Re: [ckan-dev] CSW Harvesting from GeoPortal and GeoNetwork
> 
> Hi,
> 
> Thanks all for all the feedback and pointers.
> 
> Before answering specific issues, let me stress again what I mentioned
> in earlier threads: support for harvesting generic CSW sources is a
> feature currently under development, and it is bound to need some
> significant work before it works reliably. It hasn't been thoroughly
> tested, the harvester code can be complex, and the errors returned are
> sometimes obscure; we are working to improve both of these things
> (literally right now). As more servers are tested, each with its own
> issues, more problems will no doubt surface, so it will take some time
> until this is stable enough for production.
> 
> Ryan, David, Tom: see comments below.
> 
> >> > Am 07.11.2012 2:40, schrieb Ryan Hodges:
> >> > And in my config file:
> >> >
> >> > ckan.spatial.validator.profiles = iso19139
> 
> >> > url: http://apps.who.int/geonetwork/srv/csw
> >> >
> >> > status: Gathering errors
> >> >
> >> > - Error contacting the CSW server: '2.0.2'
> >> >
> >> > Harvested: 0
> >> >
> >> > WHY: Server is at version 2.0.1. Is harvesting from this version not
> >> > available?
> CKAN relies on OWSLib for querying the CSW servers, and if, as Tom
> mentioned, it only supports CSW 2.0.2, that's what CKAN will support.
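
A harvester could check the advertised version up front and report a clearer error than "'2.0.2'". What follows is only a minimal sketch, not CKAN or OWSLib code: it assumes the root element of the Capabilities document carries a `version` attribute. The function names and sample documents are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Per this thread, the only CSW version OWSLib's client handles.
SUPPORTED_CSW_VERSION = "2.0.2"


def advertised_csw_version(capabilities_xml):
    """Return the `version` attribute of the Capabilities root element, or None."""
    root = ET.fromstring(capabilities_xml)
    return root.get("version")


def is_harvestable(capabilities_xml):
    """True only when the server advertises the supported CSW version."""
    return advertised_csw_version(capabilities_xml) == SUPPORTED_CSW_VERSION


# Hand-written sample documents, not real server responses.
SAMPLE_202 = (
    '<csw:Capabilities xmlns:csw="http://www.opengis.net/cat/csw/2.0.2" '
    'version="2.0.2"/>'
)
SAMPLE_201 = (
    '<csw:Capabilities xmlns:csw="http://www.opengis.net/cat/csw" '
    'version="2.0.1"/>'
)

print(is_harvestable(SAMPLE_202))
print(is_harvestable(SAMPLE_201))
```

With a check like this, a 2.0.1 server could be rejected with an explicit "unsupported CSW version" message before any records are requested.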
> 
> 
> >> > url: http://www.fao.org/geonetwork/srv/en/csw
> >> >
> >> > status: Object errors
> >> >
> >> > - GUID 6fed4955-c0f4-49e6-aaf2-9475504dc6bc
> >> >
> >> > - Validating against "ISO19139 XSD Schema" profile failed:
> >> >
> >> > - Dataset schema (gmx.xsd) Validation Error: (u"Element
> >> > '{http://www.isotc211.org/2005/gmd}MD_SatelliteSpatialRepresentation':
> >> > This element is not expected. Expected is one of (
> >> > {http://www.isotc211.org/2005/gmd}AbstractMD_SpatialRepresentation,
> >> > {http://www.isotc211.org/2005/gmd}MD_GridSpatialRepresentation,
> >> > {http://www.isotc211.org/2005/gmd}MD_VectorSpatialRepresentation,
> >> > {http://www.isotc211.org/2005/gmd}MD_Georeferenceable,
> >> > {http://www.isotc211.org/2005/gmd}MD_Georectified )., line 74",)
> >> >
> >> > Harvested: 10
> >> >
> >> > Why: As the error says, it doesn’t recognize
> >> > MD_SatelliteSpatialRepresentation. Another error included a bad
> >> > date-time.
> This is just a validation error for this document, which does not
> adhere to the ISO 19139 XSD schema. The current harvesting
> implementation will not prevent the package from being created, but
> you can try other ISO validation profiles to see if the errors go
> away. There are profiles for the NGDC (iso19139ngdc) and EDEN
> (iso19139eden) schemas available.
> 
> 
> >> > NOTE: This didn’t work at first: the directory for ISO19139 validation
> >> > did not exist in the python 2.6 egg:
> >> >
> >> >
> >> > ‘site-packages/ckanext_spatial-0.2-py2.6.egg/ckanext/spatial/validation/xml’
> >> > <- Does not exist
> >> >
> >> > I had to copy it in manually.
> How did you install ckanext-spatial? Was it from sources?
> 
> 
> >> > url: http://gptogc.esri.com/geoportal/csw
> >> >
> >> > status: Gathering errors
> >> >
> >> > - Error gathering the identifiers from the CSW server ['NoneType'
> >> > object has no attribute 'find']
> I'm not sure if this is exactly the same issue, but I found a similar
> one that needed fixing on OWSLib [1].
> Feel free to try my fork [2] with the changes to see if that solves the
> issue.
> 
> 

Thanks for the pull request, this has now been integrated in OWSLib master.
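
For anyone puzzled by the wording of that error: a message of the form "'NoneType' object has no attribute 'find'" generally means an XML lookup returned None and the code then called `.find` on the result. The sketch below reproduces that pattern with the standard library only. The response document and the `first_identifier` helper are made up for illustration, and this is not necessarily the exact code path that the pull request changed.

```python
import xml.etree.ElementTree as ET

# Hand-written response with no SearchResults element; not from a real server.
RESPONSE = "<GetRecordsResponse><SearchStatus/></GetRecordsResponse>"

root = ET.fromstring(RESPONSE)
results = root.find("SearchResults")  # None, because the element is absent


def first_identifier(results_element):
    """Return the first identifier's text, or None when the element is missing."""
    if results_element is None:
        return None
    node = results_element.find("identifier")
    return node.text if node is not None else None


try:
    results.find("identifier")  # unguarded access on None
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)

# The guarded version returns None instead of raising.
print(first_identifier(results))
```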

> 
> 
> >> > url: http://geo.data.gov/geoportal/csw
> >> >
> >> > status: Gathering errors
> >> >
> >> > - Error gathering the identifiers from the CSW server ['\nAn
> >> > exception occurred with no applicable code\n']
> For some reason, the remote server returns an exception while
> gathering the identifiers. We'd need to investigate this further.
> 
> 
> On 7 November 2012 10:24, David Read <david.read at hackneyworkshop.com> wrote:
> > Ryan,
> >
> > I see from the code that "Error gathering the identifiers from the CSW
> > server" is a problem with calling OWSLib's "getidentifiers" method,
> > which is the first time we use this library to call the CSW server.
> Just to be clear, getidentifiers is a CKAN method (csw_client.py)
> which in turn calls OWSLib's csw.getrecords.
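
To give a rough idea of what gathering identifiers involves, here is a minimal sketch of paging through GetRecords results. It is not the code in csw_client.py. `fetch_page` is a hypothetical callable standing in for a wrapper around OWSLib's csw.getrecords (which, as far as I know, accepts `startposition` and `maxrecords`), and an in-memory catalogue replaces the remote server so that the example runs offline.

```python
def iter_identifiers(fetch_page, page_size=10):
    """Yield record identifiers page by page.

    `fetch_page(start, count)` is a hypothetical callable standing in for a
    wrapper around a GetRecords request; it must return a tuple
    (identifiers_on_this_page, total_matches).
    """
    start = 1  # CSW start positions are 1-based
    while True:
        identifiers, total = fetch_page(start, page_size)
        for identifier in identifiers:
            yield identifier
        start += len(identifiers)
        # Stop when the server returned nothing or every match has been seen.
        if not identifiers or start > total:
            break


# In-memory stand-in for a catalogue, so the sketch runs without a network.
FAKE_CATALOGUE = ["id-%d" % n for n in range(1, 26)]


def fake_fetch_page(start, count):
    """Return one slice of the fake catalogue plus the total match count."""
    return FAKE_CATALOGUE[start - 1:start - 1 + count], len(FAKE_CATALOGUE)


harvested = list(iter_identifiers(fake_fetch_page, page_size=10))
print(len(harvested))
```

An exception raised inside `fetch_page` on the first request would surface at this stage, which matches where the "Error gathering the identifiers" messages are reported.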
> 
> 
> > The version of OWSLib that ckanext-spatial tries to use is not the
> > original SVN repo, but an OKF branch here:
> > https://github.com/okfn/owslib
> FYI, all the latest work done on harvesting already targets the latest
> version of OWSLib (i.e. not the okfn one).
> 
> 
> 
> On 7 November 2012 13:13, Tom Kralidis <tomkralidis at hotmail.com> wrote:
> 
> >
> > Testing each of those CSWs with OWSLib directly resulted in no errors when
> > testing with 0.5.1 in a virtualenv; however I am making rudimentary
> > requests.
> >
> See previous comments on where OWSLib may be involved.
> 
> 
> >
> > CSW 2.0.2 (despite the .z bumps) is very different from 2.0.1 and 2.0.0.
> > OWSLib's CSW support is for 2.0.2 only.
> That's really useful to know.
> 
> 
> > I'm not familiar with the CKAN code on top of OWSLib.  Can anyone point to
> > the code, or a trace of what the context of the issue is?  I'm willing to
> > hunt down the cause of the issue here.
> Thanks, your thoughts on [1] would be great.
> 
> 
> Hope this helps a bit
> 
> Adrià
> 
> 
> 
> [1] https://github.com/geopython/OWSLib/pull/40
> [2] https://github.com/amercader/OWSLib
> 