[ckan-dev] crash while harvesting

Hildegard Gerlach hildegard.gerlach at jrc.ec.europa.eu
Fri Jan 10 11:37:10 UTC 2014


Dear all,

I get the following error when harvesting from a GeoNetwork instance 
with these settings:

ckan.spatial.validator.profiles = iso19139eden
ckanext.spatial.harvest.continue_on_validation_errors = true

Traceback (most recent call last):
  File "/usr/local/ckan/pyenv/bin/paster", line 9, in <module>
    load_entry_point('PasteScript==1.7.5', 'console_scripts', 'paster')()
  File "/usr/local/ckan/pyenv/lib/python2.6/site-packages/paste/script/command.py", line 104, in run
    invoke(command, command_name, options, args[1:])
  File "/usr/local/ckan/pyenv/lib/python2.6/site-packages/paste/script/command.py", line 143, in invoke
    exit_code = runner.run(args)
  File "/usr/local/ckan/pyenv/lib/python2.6/site-packages/paste/script/command.py", line 238, in run
    result = self.command()
  File "/usr/local/ckan/pyenv/src/ckanext-harvest/ckanext/harvest/commands/harvester.py", line 127, in command
    fetch_callback(consumer, method, header, body)
  File "/usr/local/ckan/pyenv/src/ckanext-harvest/ckanext/harvest/queue.py", line 294, in fetch_callback
    fetch_and_import_stages(harvester, obj)
  File "/usr/local/ckan/pyenv/src/ckanext-harvest/ckanext/harvest/queue.py", line 311, in fetch_and_import_stages
    success_import = harvester.import_stage(obj)
  File "/usr/local/ckan/pyenv/src/ckanext-spatial/ckanext/spatial/harvesters/base.py", line 441, in import_stage
    is_valid, profile, errors = self._validate_document(harvest_object.content, harvest_object)
  File "/usr/local/ckan/pyenv/src/ckanext-spatial/ckanext/spatial/harvesters/base.py", line 726, in _validate_document
    valid, profile, errors = validator.is_valid(xml)
  File "/usr/local/ckan/pyenv/src/ckanext-spatial/ckanext/spatial/validation/validation.py", line 335, in is_valid
    is_valid, error_message_list = validator.is_valid(xml)
  File "/usr/local/ckan/pyenv/src/ckanext-spatial/ckanext/spatial/validation/validation.py", line 87, in is_valid
    metadata_type = cls.get_record_type(xml)
  File "/usr/local/ckan/pyenv/src/ckanext-spatial/ckanext/spatial/validation/validation.py", line 123, in get_record_type
    return iso_parser.read_value('resource-type')[0]
IndexError: list index out of range

The fetch_consumer stage crashes and the harvesting job never finishes; 
it stays in the running state (46 datasets found).

If I instead set ckan.spatial.validator.profiles = iso19139ngdc, 
the harvesting job finishes and finds 57 datasets.

This seems like a bug to me.
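
To illustrate what I think happens in the last frame of the traceback: 
the message says the index is applied to a list, so 
read_value('resource-type') apparently returned an empty list for one of 
the records. The sketch below is not the ckanext-spatial code; 
first_or_none is a hypothetical helper and the empty list is only a 
stand-in for the parser result.

```python
def first_or_none(values):
    """Return the first item of a list, or None if the list is empty."""
    return values[0] if values else None


# Stand-in (assumption) for what iso_parser.read_value('resource-type')
# seems to have returned for the failing record: an empty list.
resource_types = []

# Unguarded indexing, as in the last traceback frame, raises IndexError.
error = None
try:
    record_type = resource_types[0]
except IndexError as exc:
    error = str(exc)

# Guarded access returns None, which the caller could report as a
# validation error instead of letting the consumer crash.
safe_record_type = first_or_none(resource_types)
```

With a guard of this kind, a record without a resource type could 
presumably be handled like any other validation failure when 
continue_on_validation_errors is true.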


Another question: is there a way to harvest only the datasets that match 
certain criteria?
I cannot see any such option for the configuration field that defines 
the harvester.

Thanks a lot

Hilde




