[ckan-dev] csw harvesting
Philippe Duchesne
pduchesne at gmail.com
Tue Apr 8 12:30:59 UTC 2014
Can you share the URL of your CSW, or its capabilities document?
--p.
On Tue, Apr 8, 2014 at 2:11 PM, Hildegard Gerlach <
hildegard.gerlach at jrc.ec.europa.eu> wrote:
> Dear all,
>
> I have a problem harvesting from a CSW server. I get the following error
> message:
>
> 2014-04-08 11:43:27,063 ERROR [ckanext.spatial.harvesters.csw.CSW.gather]
> Exception: Traceback (most recent call last):
> File "/usr/local/ckan/pyenv/src/ckanext-spatial/ckanext/spatial/harvesters/csw.py",
> line 95, in gather_stage
> for identifier in self.csw.getidentifiers(page=10,
> outputschema=self.output_schema(), cql=cql):
> File "/usr/local/ckan/pyenv/src/ckanext-spatial/ckanext/spatial/lib/csw_client.py",
> line 120, in getidentifiers
> csw.getrecords2(**kwa)
> File "/usr/local/ckan/pyenv/lib/python2.6/site-packages/owslib/csw.py",
> line 343, in getrecords2
> self._invoke()
> File "/usr/local/ckan/pyenv/lib/python2.6/site-packages/owslib/csw.py",
> line 611, in _invoke
> raise RuntimeError, 'Document is XML, but not CSW-ish'
> RuntimeError: Document is XML, but not CSW-ish
> 2014-04-08 11:43:27,081 ERROR [ckanext.harvest.harvesters.base] Error
> gathering the identifiers from the CSW server [Document is XML, but not
> CSW-ish]
> 2014-04-08 11:43:27,095 ERROR [ckanext.harvest.queue] Gather stage failed
>
>
> I think the problem is in the GetCapabilities response of the CSW server, which has
> <ows:Operation name="Harvest">
>
> while the other CSW servers have this part commented out:
> <!--
> <ows:Operation name="Harvest">
>   <ows:DCP>
>     <ows:HTTP>
>       <ows:Get xlink:href="http://$HOST:$PORT$SERVLET/srv/en/csw" />
>       <ows:Post xlink:href="http://$HOST:$PORT$SERVLET/srv/en/csw" />
>     </ows:HTTP>
>   </ows:DCP>
> </ows:Operation>
> -->
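
To compare what each server actually advertises, something like the following could list the operation names in a capabilities document. This is only a sketch: advertised_operations is a hypothetical helper, the OWS namespace URI is assumed to be the OWS 1.0 one, and it relies on ElementTree dropping XML comments by default, so a commented-out Harvest block would not show up in the result.

```python
import xml.etree.ElementTree as ET

# Assumption: the capabilities document uses the OWS 1.0 namespace.
OWS_NS = "http://www.opengis.net/ows"


def advertised_operations(capabilities_xml):
    """Return the name attribute of every ows:Operation element.

    capabilities_xml is the raw capabilities document as bytes or str.
    Operations inside XML comments are not parsed as elements, so they
    are not reported.
    """
    root = ET.fromstring(capabilities_xml)
    return [op.get("name") for op in root.iter("{%s}Operation" % OWS_NS)]
```

Running it on the capabilities of a server that harvests fine and on the one that fails would show whether Harvest is really the only difference.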
>
> Looking into owslib/csw.py I can see the following:
>
> # parse result see if it's XML
> self._exml = etree.parse(StringIO.StringIO(self.response))
>
> # it's XML. Attempt to decipher whether the XML response is CSW-ish """
> valid_xpaths = [
>     util.nspath_eval('ows:ExceptionReport', namespaces),
>     util.nspath_eval('csw:Capabilities', namespaces),
>     util.nspath_eval('csw:DescribeRecordResponse', namespaces),
>     util.nspath_eval('csw:GetDomainResponse', namespaces),
>     util.nspath_eval('csw:GetRecordsResponse', namespaces),
>     util.nspath_eval('csw:GetRecordByIdResponse', namespaces),
>     util.nspath_eval('csw:HarvestResponse', namespaces),
>     util.nspath_eval('csw:TransactionResponse', namespaces)
> ]
>
> if self._exml.getroot().tag not in valid_xpaths:
>     raise RuntimeError, 'Document is XML, but not CSW-ish'
>
>
> but harvest is included, so I don't understand why it creates problems.
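
For what it's worth, the check quoted above looks only at the root element of the response being parsed. csw:HarvestResponse in that list is the root of a reply to a Harvest request, so it is separate from the <ows:Operation name="Harvest"> entry in the capabilities. A minimal sketch of the same test, using only the standard library, might help show which root element the server really sends back. The namespace URIs are an assumption (CSW 2.0.2 and OWS 1.0), and inspect_root is a hypothetical helper name, not part of OWSLib.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs (CSW 2.0.2 and OWS 1.0).
CSW_NS = "http://www.opengis.net/cat/csw/2.0.2"
OWS_NS = "http://www.opengis.net/ows"

# Root elements accepted by the quoted check, in ElementTree's
# "{namespace-uri}localname" form.
VALID_ROOTS = set(
    ["{%s}ExceptionReport" % OWS_NS]
    + [
        "{%s}%s" % (CSW_NS, name)
        for name in (
            "Capabilities",
            "DescribeRecordResponse",
            "GetDomainResponse",
            "GetRecordsResponse",
            "GetRecordByIdResponse",
            "HarvestResponse",
            "TransactionResponse",
        )
    ]
)


def inspect_root(xml_body):
    """Return (root_tag, is_cswish) for a raw XML response body."""
    root = ET.fromstring(xml_body)
    return root.tag, root.tag in VALID_ROOTS


if __name__ == "__main__":
    # Hypothetical response bodies, for illustration only.
    samples = [
        b'<csw:GetRecordsResponse xmlns:csw="%s"/>' % CSW_NS.encode(),
        b'<ows:ExceptionReport xmlns:ows="http://www.opengis.net/ows/1.1"/>',
    ]
    for body in samples:
        print(inspect_root(body))
```

Feeding it the body the server returns for the failing GetRecords request should tell whether the root element is something unexpected, or a known element in a different namespace version, either of which would fail the membership test.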
>
> I am using OWSLib 0.8.6.
>
> Any help appreciated.
>
> Thanks
>
> Hilde