[ckan-discuss] Stuck with CSW Harvesting
Angelos Tzotsos
gcpp.kalxas at gmail.com
Tue Sep 25 18:28:53 BST 2012
Hi all,
I am planning to start working on CKAN-pycsw integration in the near
future, hoping to put an end to the kind of problems mentioned here.
We released pycsw 1.4.0 three weeks ago, and it is now available from PyPI
as a library (it now works with WSGI).
I need to tackle the metadata model mapping in order to make it work.
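To make the mapping problem concrete, here is a minimal sketch of reading a few ISO 19139 elements into a flat dict. Bruce's traceback below suggests that generic ISO 19139 records may lack elements that the Gemini2-based reader requires (such as metadata-language), so the sketch returns None for a missing element instead of raising. The FIELD_PATHS table, the iso_to_dict helper, and the sample record are illustrative assumptions, not code from pycsw or the CKAN extensions, and only gco:CharacterString values are handled.

```python
import xml.etree.ElementTree as ET

# Namespaces used by ISO 19139 metadata documents.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Hypothetical mapping: target field name -> path below gmd:MD_Metadata.
FIELD_PATHS = {
    "id": "gmd:fileIdentifier/gco:CharacterString",
    "title": ("gmd:identificationInfo/gmd:MD_DataIdentification/"
              "gmd:citation/gmd:CI_Citation/gmd:title/gco:CharacterString"),
    "metadata-language": "gmd:language/gco:CharacterString",
}


def iso_to_dict(xml_text):
    """Return the mapped fields; elements missing from the record map to None."""
    root = ET.fromstring(xml_text)
    result = {}
    for field, path in FIELD_PATHS.items():
        node = root.find(path, NS)
        has_text = node is not None and node.text
        result[field] = node.text.strip() if has_text else None
    return result


# Made-up sample record that deliberately has no gmd:language element.
SAMPLE = """<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:fileIdentifier>
    <gco:CharacterString>example-id</gco:CharacterString>
  </gmd:fileIdentifier>
  <gmd:identificationInfo>
    <gmd:MD_DataIdentification>
      <gmd:citation>
        <gmd:CI_Citation>
          <gmd:title>
            <gco:CharacterString>Example dataset</gco:CharacterString>
          </gmd:title>
        </gmd:CI_Citation>
      </gmd:citation>
    </gmd:MD_DataIdentification>
  </gmd:identificationInfo>
</gmd:MD_Metadata>"""

if __name__ == "__main__":
    print(iso_to_dict(SAMPLE))
```

A real mapping would also need to decide which fields are mandatory on the CKAN side and how to treat values encoded as code lists rather than plain strings.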
Cheers,
Angelos
PS. https://github.com/geopython/pycsw/issues/73
On 09/25/2012 12:29 PM, Adrià Mercader wrote:
> Hi Bruce,
>
> I'm glad to hear that you are exploring using CKAN alongside CSW, this
> is something we want to improve and any feedback on this is greatly
> appreciated.
>
> As David already mentioned, the CSW related extensions were written in
> the context of the UK Location Project, so the schemas and field model
> are based on the Gemini2 profile (in turn based on the INSPIRE
> regulations). We are working on making these schemas more generic to
> support any ISO 19139 based document. Right now the changes in that
> sense are in specific branches that you will need to check out on a
> couple of extensions (sorry about the slightly different names):
>
> * ckanext-inspire: git checkout harvest-generic-iso
> * ckanext-csw: git checkout generic-iso-support
>
> This should get rid of the metadata-language field error. Let us know
> if there are further errors after this (Make sure to restart the
> gather and fetch consumers after checking out the branches)
>
> BTW we also plan on consolidating all this functionality in a single extension.
>
> Hope this helps,
>
> Adrià
>
>
>
>
> On 24 September 2012 20:57, Bruce Crevensten <becrevensten at alaska.edu> wrote:
>> Hi,
>>
>> I'm exploring using CKAN as a companion to GeoNetwork for presenting
>> geospatial climate data, and I'm having some difficulty getting CKAN
>> to harvest from GeoNetwork's CSW service. Since this thread contained
>> a note that was relevant to my situation (specifying the ISO19139
>> validator), I'm adding to this thread instead of starting a new one,
>> though my issue may be distinct from the original inquiry.
>>
>> I've installed the ckanext-harvest, ckanext-csw, and ckanext-inspire
>> extensions. I'm running CKAN 1.8 on a CentOS6 virtual machine, using
>> a source installation. GeoNetwork 2.6.4 is running on a different
>> CentOS6 machine. I've not explored the base CKAN install thoroughly,
>> but it appears to be stable.
>>
>> My configuration file (development.ini) has these settings:
>>
>> ckan.plugins = stats harvest ckan_harvester inspire_api gemini_harvester gemini_doc_harvester gemini_waf_harvester
>> ckan.inspire.validator.profiles = iso19139
>>
>> My harvester job is set up to be type 'csw', and the URL endpoint is
>> this: http://athena.snap.uaf.edu:8080/geonetwork/srv/en/csw?request=GetRecordById&service=CSW&version=2.0.2&elementSetName=full&id=4edfbeef-f830-4ce7-b6b1-557592ea8dce
>>
>> (Side note: I'm a bit unclear if I'm using the correct URL endpoint.
>> That URL specifies a single data record, but the harvester appears to
>> correctly discover all of our data sets. ?)
>>
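On the side note about the endpoint: the URL above is a base CSW endpoint followed by key-value parameters for one GetRecordById request. How the harvester treats those extra parameters is not confirmed in this thread, so the following is only a sketch of separating the two parts with the standard library. split_csw_url is a hypothetical helper, not part of any CKAN extension.

```python
from urllib.parse import parse_qs, urlsplit, urlunsplit


def split_csw_url(url):
    """Split a CSW request URL into (base endpoint, dict of query parameters)."""
    parts = urlsplit(url)
    # Rebuild the URL without the query string and fragment.
    base = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    # parse_qs returns lists of values; keep the first value of each key.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return base, params


URL = ("http://athena.snap.uaf.edu:8080/geonetwork/srv/en/csw"
       "?request=GetRecordById&service=CSW&version=2.0.2"
       "&elementSetName=full&id=4edfbeef-f830-4ce7-b6b1-557592ea8dce")

if __name__ == "__main__":
    base, params = split_csw_url(URL)
    print(base)
    print(params)
```

Trying the bare endpoint as the harvest source would show whether the request parameters matter at all.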
>> The error messages I'm getting seem to indicate that the fetching is
>> working OK, but the gemini profile is being used to validate the
>> results, causing validation errors and a failed harvest.
>>
>> Here's a log excerpt:
>>
>> 2012-09-24 12:35:53,076 INFO [ckanext.harvest.queue] Received harvest object id: 2701000e-a931-4b57-9fa9-5209ef8be1e5
>> 2012-09-24 12:35:53,236 INFO [ckanext.csw.services] Making CSW request: getrecordbyid [u'e3c2e8ea-0896-4011-b11b-f2f941fec941'] {'esn': 'full', 'outputschema': 'http://www.isotc211.org/2005/gmd'}
>> 2012-09-24 12:35:53,485 DEBUG [ckanext.inspire.harvesters] XML content saved (len 24601)
>> 2012-09-24 12:35:53,492 ERROR [ckanext.inspire.harvesters] Traceback (most recent call last):
>>   File "/root/ckan/src/ckanext-inspire/ckanext/inspire/harvesters.py", line 141, in import_stage
>>     self.import_gemini_object(harvest_object.content)
>>   File "/root/ckan/src/ckanext-inspire/ckanext/inspire/harvesters.py", line 165, in import_gemini_object
>>     package = self.write_package_from_gemini_string(unicode_gemini_string)
>>   File "/root/ckan/src/ckanext-inspire/ckanext/inspire/harvesters.py", line 174, in write_package_from_gemini_string
>>     gemini_values = gemini_document.read_values()
>>   File "/root/ckan/src/ckanext-inspire/ckanext/inspire/model/__init__.py", line 19, in read_values
>>     values[element.name] = element.read_value(tree)
>>   File "/root/ckan/src/ckanext-inspire/ckanext/inspire/model/__init__.py", line 51, in read_value
>>     return self.fix_multiplicity(values)
>>   File "/root/ckan/src/ckanext-inspire/ckanext/inspire/model/__init__.py", line 102, in fix_multiplicity
>>     "Value not found for element '%s'" % self.name)
>> Exception: Value not found for element 'metadata-language'
>> 2012-09-24 12:35:53,494 ERROR [ckanext.inspire.harvesters] Error importing Gemini document: Value not found for element 'metadata-language'
>>
>> Is my configuration to import ISO19139 records from GeoNetwork via CSW
>> correct, or is there another issue here?
>>
>> Thanks,
>> - Bruce
>>
>> On Fri, Sep 21, 2012 at 2:24 AM, David Read
>> <david.read at hackneyworkshop.com> wrote:
>>> Maurizio,
>>>
>>> ckanext-harvest is just the harvesting framework and is useless on its
>>> own. The actual harvester for CSW is contained in ckanext-inspire, so
>>> you need to install that too.
>>>
>>> David
>>>
>>> On 13 September 2012 17:36, Maurizio Napolitano <napo at fbk.eu> wrote:
>>>> On 30/07/2012 12:04, Adrià Mercader wrote:
>>>>> Hi Simone,
>>>>>
>>>>> Glad to hear that you are using CKAN for geo-related stuff. We would
>>>>> love to hear any feedback that you may have.
>>>>>
>>>>> In relation to your problem, it looks like you have not loaded the CSW
>>>>> harvester extension(s) on your ini file. Can you double check that you
>>>>> have this added to your ini file?
>>>>>
>>>>> ckan.plugins = gemini_harvester <your other plugins...>
>>>>>
>>>>> Also make sure to add this line to your ini file to avoid validating
>>>>> the metadata records against the gemini profile (which is UK
>>>>> specific):
>>>>>
>>>>> ckan.inspire.validator.profiles = iso19139
>>>>>
>>>>> If you have already defined the harvester in your ini file, let me
>>>>> know, as we will need to investigate a little further (try also
>>>>> restarting the consumers).
>>>>
>>>>
>>>> Hi Adrià,
>>>> I used this configuration, and if I go to
>>>> http://myckaninstallation/harvest
>>>> I can add a CSW server, but the answer is always
>>>>
>>>> Last Harvest Errors: 1
>>>> Gathering errors
>>>>
>>>> No harvester could be found for source type csw
>>>>
>>>> I tested it with some CSW services like:
>>>> - http://www.pcn.minambiente.it/geoportal/csw
>>>> - http://datigis.comune.fi.it/geonetwork/srv/it/csw
>>>>
>>>> ... and in both cases I get this answer.
>>>>
>>>> Where is my error?
>>>>
>>>> Thanks
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> ckan-discuss mailing list
>>>> ckan-discuss at lists.okfn.org
>>>> http://lists.okfn.org/mailman/listinfo/ckan-discuss
>>
>>
>>
>> --
>>
>> Bruce Crevensten, Web Programmer
>> Scenarios Network for Alaska & Arctic Planning
>> 3352 College Road, 2nd Floor Denali Building
>> Fairbanks, AK 99709
>> Phone: 907-474-7134
>> Fax: 907-474-7151
>> www.snap.uaf.edu
>> becrevensten at alaska.edu
>>
--
Angelos Tzotsos
Remote Sensing Laboratory
National Technical University of Athens
http://users.ntua.gr/tzotsos