[ckan-dev] csw harvesting problems

Hildegard Gerlach hildegard.gerlach at jrc.ec.europa.eu
Wed Jul 3 08:06:59 UTC 2013


Hello,

I am using version 2.0.
I am trying to harvest from a CSW catalogue and am getting the following error:

2013-07-02 18:14:54,886 INFO [ckanext.spatial.harvesters.csw.CSW.gather] Got identifier f8bc13ca-63b2-11e2-ac88-001999c24c1e from the CSW
2013-07-02 18:14:54,886 INFO [ckanext.spatial.harvesters.csw.CSW.gather] Got identifier 14fda267-6da4-4024-bc6f-8bad1c0bf249 from the CSW
2013-07-02 18:14:54,887 INFO [ckanext.spatial.lib.csw_client] Making CSW request: getrecords {'outputschema': 'http://www.isotc211.org/2005/gmd', 'startposition': 60, 'typenames': 'csw:Record', 'maxrecords': 10, 'keywords': [], 'esn': 'brief', 'qtype': None}
2013-07-02 18:15:04,905 ERROR [ckanext.spatial.harvesters.csw.CSW.gather] Exception: Traceback (most recent call last):
  File "/usr/local/mtester.ies.jrc.it/pyenv/src/ckanext-spatial/ckanext/spatial/harvesters/csw.py", line 90, in gather_stage
    for identifier in self.csw.getidentifiers(page=10):
  File "/usr/local/mtester.ies.jrc.it/pyenv/src/ckanext-spatial/ckanext/spatial/lib/csw_client.py", line 110, in getidentifiers
    csw.getrecords(**kwa)
  File "/usr/local/mtester.ies.jrc.it/pyenv/lib/python2.6/site-packages/owslib/csw.py", line 233, in getrecords
    self._invoke()
  File "/usr/local/mtester.ies.jrc.it/pyenv/lib/python2.6/site-packages/owslib/csw.py", line 473, in _invoke
    self.response = util.http_post(self.url, self.request, self.lang, self.timeout)
  File "/usr/local/mtester.ies.jrc.it/pyenv/lib/python2.6/site-packages/owslib/util.py", line 185, in http_post
    up = urllib2.urlopen(r,timeout=timeout);
  File "/usr/lib64/python2.6/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib64/python2.6/urllib2.py", line 391, in open
    response = self._open(req, data)
  File "/usr/lib64/python2.6/urllib2.py", line 409, in _open
    '_open', req)
  File "/usr/lib64/python2.6/urllib2.py", line 369, in _call_chain
    result = func(*args)
  File "/usr/lib64/python2.6/urllib2.py", line 1190, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/lib64/python2.6/urllib2.py", line 1165, in do_open
    raise URLError(err)
URLError: <urlopen error timed out>
2013-07-02 18:15:04,943 ERROR [ckanext.harvest.harvesters.base] Error gathering the identifiers from the CSW server [<urlopen error timed out>]
2013-07-02 18:15:04,948 ERROR [ckanext.harvest.queue] Gather stage failed
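
For reference, the log itself carries two hints: the failing request has 'startposition': 60, so earlier pages appear to have been answered, and the error follows the request by roughly ten seconds. The minimal sketch below only measures that gap from the two timestamps in the log. Reading it as a client-side timeout of about 10 seconds is an assumption; the log does not state the configured timeout value. The sketch is written for Python 3, not the Python 2.6 shown in the traceback.

```python
from datetime import datetime

# Log timestamp layout: date, time, then milliseconds after a comma.
FMT = "%Y-%m-%d %H:%M:%S,%f"

# Both values are copied from the log: the getrecords request and the
# first ERROR line that follows it.
request_at = datetime.strptime("2013-07-02 18:14:54,887", FMT)
error_at = datetime.strptime("2013-07-02 18:15:04,905", FMT)

# Seconds between sending the request and the exception being logged.
elapsed = (error_at - request_at).total_seconds()
print(round(elapsed, 3))
```

If the gap stays this close to a round number across several runs, a slow response to that particular page may be more likely than a malformed server URL.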

I ran the commands to start the harvesting as described here:
https://github.com/okfn/ckanext-harvest/tree/release-v2.0#running-the-harvest-jobs

I am not sure about the exact syntax for specifying the CSW server URL.

Does the URL have to end with
/csw-drdsi
or
/csw-drdsi?SERVICE=CSW&VERSION=2.0.2&REQUEST=GetCapabilities

or anything else?

Both of the above give the same error message.
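
For reference, the two candidate forms name the same endpoint path and differ only in the query string. The sketch below shows one way to reduce either form to the bare endpoint. It assumes the harvester's client builds its own request parameters, which the "Making CSW request: getrecords" log line suggests but does not prove. The helper name csw_base_url and the host example.org are hypothetical; the real server name does not appear in this message. The code targets Python 3.

```python
from urllib.parse import urlsplit, urlunsplit


def csw_base_url(url):
    """Return url with its query string and fragment removed.

    Hypothetical helper for illustration; not part of ckanext-spatial.
    """
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


# example.org is a placeholder host, not the real catalogue server.
bare = "http://example.org/csw-drdsi"
with_query = bare + "?SERVICE=CSW&VERSION=2.0.2&REQUEST=GetCapabilities"

# Both spellings reduce to the same endpoint.
print(csw_base_url(bare))
print(csw_base_url(with_query))
```

Since both spellings lead to the same error here, the URL form on its own may not be the cause.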

Any help appreciated.

Thanks

Hilde




