[ckan-discuss] Stuck with CSW Harvesting

Adrià Mercader amercadero at gmail.com
Tue Sep 25 10:29:07 BST 2012


Hi Bruce,

I'm glad to hear that you are exploring using CKAN alongside CSW. This
is something we want to improve, and any feedback is greatly
appreciated.

As David already mentioned, the CSW-related extensions were written in
the context of the UK Location Project, so the schemas and field model
are based on the Gemini2 profile (in turn based on the INSPIRE
regulations). We are working on making these schemas more generic to
support any ISO 19139 based document. Right now those changes live in
specific branches that you will need to check out in a couple of
extensions (sorry about the slightly different names):

* ckanext-inspire: git checkout harvest-generic-iso
* ckanext-csw: git checkout generic-iso-support

This should get rid of the metadata-language field error. Let us know
if there are further errors after this (make sure to restart the
gather and fetch consumers after checking out the branches).
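
If you want to see which of your records trigger the error before
switching branches, something like the following may help. It is only a
rough sketch and not part of any of the extensions: it assumes that the
'metadata-language' field corresponds to the gmd:language element of the
ISO 19139 record, and the function name metadata_language is made up for
this example. It takes the XML of one record (for instance the body of a
GetRecordById response) and returns the language it finds, or None.

```python
import xml.etree.ElementTree as ET

# Namespaces used by ISO 19139 metadata records.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}


def metadata_language(xml_text):
    """Return the metadata language of an ISO 19139 record, or None.

    Hypothetical helper (not part of ckanext-inspire or ckanext-csw).
    Assumes the language lives in gmd:language directly under
    gmd:MD_Metadata, as a gco:CharacterString or a gmd:LanguageCode.
    """
    root = ET.fromstring(xml_text)

    # The record may be the root itself or wrapped in a CSW response.
    if root.tag == "{%s}MD_Metadata" % NS["gmd"]:
        record = root
    else:
        record = root.find(".//gmd:MD_Metadata", NS)
    if record is None:
        return None

    lang = record.find("gmd:language", NS)
    if lang is None:
        return None

    # Free-text form: <gmd:language><gco:CharacterString>eng</...>
    text = lang.findtext("gco:CharacterString", default=None, namespaces=NS)
    if text and text.strip():
        return text.strip()

    # Code list form: <gmd:LanguageCode codeListValue="eng">...</...>
    code = lang.find("gmd:LanguageCode", NS)
    if code is not None:
        value = code.get("codeListValue") or (code.text or "")
        return value.strip() or None

    return None
```

A record for which this returns None is a likely candidate for the
"Value not found for element 'metadata-language'" failure, although the
extension's own lookup may be stricter than this sketch.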

BTW, we also plan to consolidate all this functionality into a single extension.

Hope this helps,

Adrià




On 24 September 2012 20:57, Bruce Crevensten <becrevensten at alaska.edu> wrote:
> Hi,
>
> I'm exploring using CKAN as a companion to GeoNetwork for presenting
> geospatial climate data, and I'm having some difficulty getting CKAN
> to harvest from GeoNetwork's CSW service.  Since this thread contained
> a note that was relevant to my situation (specifying the ISO19139
> validator), I'm adding to this thread instead of starting a new one,
> though my issue may be distinct from the original inquiry.
>
> I've installed the ckanext-harvest, ckanext-csw, and ckanext-inspire
> extensions.  I'm running CKAN 1.8 on a CentOS6 virtual machine, using
> a source installation.  GeoNetwork 2.6.4 is running on a different
> CentOS6 machine.  I've not explored the base CKAN install thoroughly,
> but it appears to be stable.
>
> My configuration file (development.ini) has these settings:
>
> ckan.plugins = stats harvest ckan_harvester inspire_api
> gemini_harvester gemini_doc_harvester gemini_waf_harvester
> ckan.inspire.validator.profiles = iso19139
>
> My harvester job is set up to be type 'csw', and the URL endpoint is
> this: http://athena.snap.uaf.edu:8080/geonetwork/srv/en/csw?request=GetRecordById&service=CSW&version=2.0.2&elementSetName=full&id=4edfbeef-f830-4ce7-b6b1-557592ea8dce
>
> (Side note: I'm a bit unclear if I'm using the correct URL endpoint.
> That URL specifies a single data record, but the harvester appears to
> correctly discover all of our data sets.  ?)
>
> The error messages I'm getting seem to indicate that the fetching is
> working OK, but the gemini profile is being used to validate the
> results, causing validation errors and a failed harvest.
>
> Here's a log excerpt:
>
> 2012-09-24 12:35:53,076 INFO  [ckanext.harvest.queue] Received harvest
> object id: 2701000e-a931-4b57-9fa9-5209ef8be1e5
> 2012-09-24 12:35:53,236 INFO  [ckanext.csw.services] Making CSW
> request: getrecordbyid [u'e3c2e8ea-0896-4011-b11b-f2f941fec941']
> {'esn': 'full', 'outputschema': 'http://www.isotc211.org/2005/gmd'}
> 2012-09-24 12:35:53,485 DEBUG [ckanext.inspire.harvesters] XML content
> saved (len 24601)
> 2012-09-24 12:35:53,492 ERROR [ckanext.inspire.harvesters] Traceback
> (most recent call last):
>   File "/root/ckan/src/ckanext-inspire/ckanext/inspire/harvesters.py",
> line 141, in import_stage
>     self.import_gemini_object(harvest_object.content)
>   File "/root/ckan/src/ckanext-inspire/ckanext/inspire/harvesters.py",
> line 165, in import_gemini_object
>     package = self.write_package_from_gemini_string(unicode_gemini_string)
>   File "/root/ckan/src/ckanext-inspire/ckanext/inspire/harvesters.py",
> line 174, in write_package_from_gemini_string
>     gemini_values = gemini_document.read_values()
>   File "/root/ckan/src/ckanext-inspire/ckanext/inspire/model/__init__.py",
> line 19, in read_values
>     values[element.name] = element.read_value(tree)
>   File "/root/ckan/src/ckanext-inspire/ckanext/inspire/model/__init__.py",
> line 51, in read_value
>     return self.fix_multiplicity(values)
>   File "/root/ckan/src/ckanext-inspire/ckanext/inspire/model/__init__.py",
> line 102, in fix_multiplicity
>     "Value not found for element '%s'" % self.name)
> Exception: Value not found for element 'metadata-language'
> 2012-09-24 12:35:53,494 ERROR [ckanext.inspire.harvesters] Error
> importing Gemini document: Value not found for element
> 'metadata-language'
>
> Is my configuration to import ISO19139 records from GeoNetwork via CSW
> correct, or is there another issue here?
>
> Thanks,
> - Bruce
>
> On Fri, Sep 21, 2012 at 2:24 AM, David Read
> <david.read at hackneyworkshop.com> wrote:
>>
>> Maurizio,
>>
>> ckanext-harvest is just the harvesting framework and is useless on its
>> own. The actual harvester for CSW is contained in ckanext-inspire, so
>> you need to install that too.
>>
>> David
>>
>> On 13 September 2012 17:36, Maurizio Napolitano <napo at fbk.eu> wrote:
>> > On 30/07/2012 12:04, Adrià Mercader wrote:
>> >>
>> >> Hi Simone,
>> >>
>> >> Glad to hear that you are using CKAN for geo-related stuff. We would
>> >> love to hear any feedback that you may have.
>> >>
>> >> In relation to your problem, it looks like you have not loaded the CSW
>> >> harvester extension(s) in your ini file. Can you double check that you
>> >> have this added to your ini file?
>> >>
>> >> ckan.plugins = gemini_harvester <your other plugins...>
>> >>
>> >> Also make sure to add this line to your ini file to avoid validating
>> >> the metadata records against the gemini profile (which is UK
>> >> specific):
>> >>
>> >> ckan.inspire.validator.profiles = iso19139
>> >>
>> >> If you have already defined the harvester in your ini file, let me
>> >> know, as we will need to investigate a little further (try also
>> >> restarting the consumers).
>> >
>> >
>> >
>> > Hi Adria',
>> > i used this configuration, and, if i go to
>> > http://myckaninstallation/harvest
>> > i can add a csw server but ... the answer is always
>> >
>> > Last Harvest Errors: 1
>> > Gathering errors
>> >
>> >     No harvester could be found for source type csw
>> >
>> > I tested it with some csw services like:
>> > - http://www.pcn.minambiente.it/geoportal/csw
>> > - http://datigis.comune.fi.it/geonetwork/srv/it/csw
>> >
>> > ... and in both cases i obtain this answer
>> >
>> > Where is my error?
>> >
>> > Thanks
>> >
>> >
>> >
>> >
>> > _______________________________________________
>> > ckan-discuss mailing list
>> > ckan-discuss at lists.okfn.org
>> > http://lists.okfn.org/mailman/listinfo/ckan-discuss
>>
>
>
>
>
> --
>
> Bruce Crevensten, Web Programmer
> Scenarios Network for Alaska & Arctic Planning
> 3352 College Road, 2nd Floor Denali Building
> Fairbanks, AK 99709
> Phone: 907-474-7134
> Fax: 907-474-7151
> www.snap.uaf.edu
> becrevensten at alaska.edu
>


