[ckan-discuss] Stuck with CSW Harvesting

Bruce Crevensten becrevensten at alaska.edu
Mon Sep 24 20:57:52 BST 2012


Hi,

I'm exploring using CKAN as a companion to GeoNetwork for presenting
geospatial climate data, and I'm having some difficulty getting CKAN
to harvest from GeoNetwork's CSW service.  Since this thread contained
a note that was relevant to my situation (specifying the ISO19139
validator), I'm adding to this thread instead of starting a new one,
though my issue may be distinct from the original inquiry.

I've installed the ckanext-harvest, ckanext-csw, and ckanext-inspire
extensions.  I'm running CKAN 1.8 on a CentOS6 virtual machine, using
a source installation.  GeoNetwork 2.6.4 is running on a different
CentOS6 machine.  I've not explored the base CKAN install thoroughly,
but it appears to be stable.

My configuration file (development.ini) has these settings:

ckan.plugins = stats harvest ckan_harvester inspire_api gemini_harvester gemini_doc_harvester gemini_waf_harvester
ckan.inspire.validator.profiles = iso19139

My harvester job is set up to be type 'csw', and the URL endpoint is
this: http://athena.snap.uaf.edu:8080/geonetwork/srv/en/csw?request=GetRecordById&service=CSW&version=2.0.2&elementSetName=full&id=4edfbeef-f830-4ce7-b6b1-557592ea8dce

(Side note: I'm not sure I'm using the correct URL endpoint.  That URL
requests a single data record, yet the harvester appears to correctly
discover all of our data sets.)
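
A minimal sketch (Python 3, standard library only) of how that URL
splits into a bare CSW endpoint plus the parameters of one
GetRecordById request.  It assumes, without having confirmed it in the
harvester source, that the harvester only needs the bare endpoint and
builds its own requests; the log excerpt further down is consistent
with that, since it shows a getrecordbyid call for a different record
id.  The helper name split_csw_url is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qs


def split_csw_url(url):
    """Return (base_endpoint, query_params) for a CSW request URL."""
    parts = urlsplit(url)
    # Rebuild the URL without its query string or fragment.
    base = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    # parse_qs returns lists of values; keep the first value for each key.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return base, params


url = (
    "http://athena.snap.uaf.edu:8080/geonetwork/srv/en/csw"
    "?request=GetRecordById&service=CSW&version=2.0.2"
    "&elementSetName=full&id=4edfbeef-f830-4ce7-b6b1-557592ea8dce"
)
base, params = split_csw_url(url)
print(base)
print(params["request"], params["id"])
```

If the assumption holds, the value printed for base would be the one
to try as the harvest source URL.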

The error messages I'm getting seem to indicate that the fetching is
working OK, but the gemini profile is being used to validate the
results, causing validation errors and a failed harvest.

Here's a log excerpt:

2012-09-24 12:35:53,076 INFO  [ckanext.harvest.queue] Received harvest object id: 2701000e-a931-4b57-9fa9-5209ef8be1e5
2012-09-24 12:35:53,236 INFO  [ckanext.csw.services] Making CSW request: getrecordbyid [u'e3c2e8ea-0896-4011-b11b-f2f941fec941'] {'esn': 'full', 'outputschema': 'http://www.isotc211.org/2005/gmd'}
2012-09-24 12:35:53,485 DEBUG [ckanext.inspire.harvesters] XML content saved (len 24601)
2012-09-24 12:35:53,492 ERROR [ckanext.inspire.harvesters] Traceback (most recent call last):
  File "/root/ckan/src/ckanext-inspire/ckanext/inspire/harvesters.py", line 141, in import_stage
    self.import_gemini_object(harvest_object.content)
  File "/root/ckan/src/ckanext-inspire/ckanext/inspire/harvesters.py", line 165, in import_gemini_object
    package = self.write_package_from_gemini_string(unicode_gemini_string)
  File "/root/ckan/src/ckanext-inspire/ckanext/inspire/harvesters.py", line 174, in write_package_from_gemini_string
    gemini_values = gemini_document.read_values()
  File "/root/ckan/src/ckanext-inspire/ckanext/inspire/model/__init__.py", line 19, in read_values
    values[element.name] = element.read_value(tree)
  File "/root/ckan/src/ckanext-inspire/ckanext/inspire/model/__init__.py", line 51, in read_value
    return self.fix_multiplicity(values)
  File "/root/ckan/src/ckanext-inspire/ckanext/inspire/model/__init__.py", line 102, in fix_multiplicity
    "Value not found for element '%s'" % self.name)
Exception: Value not found for element 'metadata-language'
2012-09-24 12:35:53,494 ERROR [ckanext.inspire.harvesters] Error importing Gemini document: Value not found for element 'metadata-language'

Is my configuration to import ISO19139 records from GeoNetwork via CSW
correct, or is there another issue here?
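
One way to narrow this down might be to look at how a fetched record
encodes its metadata language.  The sketch below (Python 3, standard
library only) reports whether gmd:language is given as a
gmd:LanguageCode, as a plain gco:CharacterString, or not at all.  It
rests on an unconfirmed assumption: that the 'metadata-language' lookup
in ckanext-inspire expects the gmd:LanguageCode form.  The function
name and the sample record are hypothetical, made up for illustration.

```python
import xml.etree.ElementTree as ET

NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}


def metadata_language_form(xml_text):
    """Describe how an ISO 19139 record encodes gmd:language."""
    root = ET.fromstring(xml_text)
    # The record may be the root element or wrapped in a CSW response.
    if root.tag == "{%s}MD_Metadata" % NS["gmd"]:
        record = root
    else:
        record = root.find(".//gmd:MD_Metadata", NS)
    if record is None:
        return "no gmd:MD_Metadata found"
    if record.find("gmd:language/gmd:LanguageCode", NS) is not None:
        return "LanguageCode"
    if record.find("gmd:language/gco:CharacterString", NS) is not None:
        return "CharacterString"
    return "missing"


# Hypothetical minimal record, not taken from the GeoNetwork server.
sample = """<gmd:MD_Metadata
    xmlns:gmd="http://www.isotc211.org/2005/gmd"
    xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:language>
    <gco:CharacterString>eng</gco:CharacterString>
  </gmd:language>
</gmd:MD_Metadata>"""

print(metadata_language_form(sample))
```

Running it against the XML returned by the GetRecordById URL above
would show which form the GeoNetwork records use.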

Thanks,
- Bruce

On Fri, Sep 21, 2012 at 2:24 AM, David Read
<david.read at hackneyworkshop.com> wrote:
>
> Maurizio,
>
> ckanext-harvest is just the harvesting framework and is useless on its
> own. The actual harvester for CSW is contained in ckanext-inspire, so
> you need to install that too.
>
> David
>
> On 13 September 2012 17:36, Maurizio Napolitano <napo at fbk.eu> wrote:
> > On 30/07/2012 12:04, Adrià Mercader wrote:
> >>
> >> Hi Simone,
> >>
> >> Glad to hear that you are using CKAN for geo-related stuff. We would
> >> love to hear any feedback that you may have.
> >>
> >> In relation to your problem, it looks like you have not loaded the CSW
> >> harvester extension(s) in your ini file. Can you double check that you
> >> have this added to your ini file?
> >>
> >> ckan.plugins = gemini_harvester <your other plugins...>
> >>
> >> Also make sure to add this line to your ini file to avoid validating
> >> the metadata records against the gemini profile (which is UK
> >> specific):
> >>
> >> ckan.inspire.validator.profiles = iso19139
> >>
> >> If you have already defined the harvester in your ini file, let me
> >> know, as we will need to investigate a little further (also try
> >> restarting the consumers).
> >
> >
> >
> > Hi Adrià,
> > I used this configuration and, if I go to
> > http://myckaninstallation/harvest
> > I can add a CSW server, but the result is always
> >
> > Last Harvest Errors: 1
> > Gathering errors
> >
> >     No harvester could be found for source type csw
> >
> > I tested it with some csw services like:
> > - http://www.pcn.minambiente.it/geoportal/csw
> > - http://datigis.comune.fi.it/geonetwork/srv/it/csw
> >
> > ... and in both cases I get this result.
> >
> > Where is my error?
> >
> > Thanks
> >
> >
> >
> >
> > _______________________________________________
> > ckan-discuss mailing list
> > ckan-discuss at lists.okfn.org
> > http://lists.okfn.org/mailman/listinfo/ckan-discuss




--

Bruce Crevensten, Web Programmer
Scenarios Network for Alaska & Arctic Planning
3352 College Road, 2nd Floor Denali Building
Fairbanks, AK 99709
Phone: 907-474-7134
Fax: 907-474-7151
www.snap.uaf.edu
becrevensten at alaska.edu


