[ckan-dev] crash while harvesting
Hildegard Gerlach
hildegard.gerlach at jrc.ec.europa.eu
Fri Jan 10 11:37:10 UTC 2014
Dear all,
I get the following error when harvesting from a GeoNetwork instance
with these settings:
ckan.spatial.validator.profiles = iso19139eden
ckanext.spatial.harvest.continue_on_validation_errors = true
Traceback (most recent call last):
  File "/usr/local/ckan/pyenv/bin/paster", line 9, in <module>
    load_entry_point('PasteScript==1.7.5', 'console_scripts', 'paster')()
  File "/usr/local/ckan/pyenv/lib/python2.6/site-packages/paste/script/command.py", line 104, in run
    invoke(command, command_name, options, args[1:])
  File "/usr/local/ckan/pyenv/lib/python2.6/site-packages/paste/script/command.py", line 143, in invoke
    exit_code = runner.run(args)
  File "/usr/local/ckan/pyenv/lib/python2.6/site-packages/paste/script/command.py", line 238, in run
    result = self.command()
  File "/usr/local/ckan/pyenv/src/ckanext-harvest/ckanext/harvest/commands/harvester.py", line 127, in command
    fetch_callback(consumer, method, header, body)
  File "/usr/local/ckan/pyenv/src/ckanext-harvest/ckanext/harvest/queue.py", line 294, in fetch_callback
    fetch_and_import_stages(harvester, obj)
  File "/usr/local/ckan/pyenv/src/ckanext-harvest/ckanext/harvest/queue.py", line 311, in fetch_and_import_stages
    success_import = harvester.import_stage(obj)
  File "/usr/local/ckan/pyenv/src/ckanext-spatial/ckanext/spatial/harvesters/base.py", line 441, in import_stage
    is_valid, profile, errors = self._validate_document(harvest_object.content, harvest_object)
  File "/usr/local/ckan/pyenv/src/ckanext-spatial/ckanext/spatial/harvesters/base.py", line 726, in _validate_document
    valid, profile, errors = validator.is_valid(xml)
  File "/usr/local/ckan/pyenv/src/ckanext-spatial/ckanext/spatial/validation/validation.py", line 335, in is_valid
    is_valid, error_message_list = validator.is_valid(xml)
  File "/usr/local/ckan/pyenv/src/ckanext-spatial/ckanext/spatial/validation/validation.py", line 87, in is_valid
    metadata_type = cls.get_record_type(xml)
  File "/usr/local/ckan/pyenv/src/ckanext-spatial/ckanext/spatial/validation/validation.py", line 123, in get_record_type
    return iso_parser.read_value('resource-type')[0]
IndexError: list index out of range
The fetch_consumer stage crashes and the harvesting job never finishes;
it stays in the running state (46 datasets found).
If I set ckan.spatial.validator.profiles = iso19139ngdc instead,
the harvesting job finishes and finds 57 datasets.
This looks like a bug to me.
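For what it is worth, the last frame of the traceback can be reproduced
in isolation. This is only a minimal sketch, assuming that read_value
returns an empty list when a record has no resource-type element.
FakeIsoParser and both get_record_type_* functions are hypothetical
stand-ins I wrote for illustration; they are not ckanext-spatial code.

```python
class FakeIsoParser(object):
    """Hypothetical stand-in for the parser used in get_record_type."""

    def __init__(self, values):
        self._values = values

    def read_value(self, name):
        # Assumption: a missing element yields an empty list.
        return self._values.get(name, [])


def get_record_type_unguarded(parser):
    # Same shape as the failing line: indexing an empty list raises IndexError.
    return parser.read_value('resource-type')[0]


def get_record_type_guarded(parser):
    # Possible guard: return None instead of raising when nothing was found.
    values = parser.read_value('resource-type')
    return values[0] if values else None
```

With a guard like this, a record without a resource type could be
reported as a validation error rather than stopping the fetch consumer,
but whether that is the right fix is for the maintainers to decide.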
Another question: is there a way to harvest only the datasets that match
certain criteria?
I cannot see anything like that among the options for the configuration
field that defines the harvester.
Thanks a lot
Hilde