[ckan-dev] CKAN harvester harvests datasets with type "harvest"

Stefan Oderbolz stefan.oderbolz at liip.ch
Wed Aug 14 11:10:31 UTC 2013


Hi there,

On my CKAN instance I was unable to reach the /harvest page to manage the
harvest sources. I ran across a strange error in the logfile:

[error] [client XX.XX.XX.XX] Error - <class
'jinja2.exceptions.UndefinedError'>: 'dict object' has no attribute 'status'

When I investigated the templates of the ckanext-harvest extension, I found
some occurrences of "status" and was eventually able to find the ones
causing this exception. I added checks to the templates to make them robust
against a missing status; here is my pull request:
https://github.com/okfn/ckanext-harvest/pull/60
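
A rough Python analogue of the kind of check I mean (the actual change is
in the Jinja2 templates and is not reproduced here; the function name and
the "last_job_status" key are made up for illustration):

```python
def harvest_status_summary(source):
    """Return a display string for a harvest source dict.

    Tolerates sources that have no 'status' entry at all, which is the
    case that raised the UndefinedError in the templates.
    """
    # Fall back to an empty dict when 'status' is missing or None.
    status = source.get("status") or {}
    # 'last_job_status' is a hypothetical key, used only for this sketch.
    return status.get("last_job_status", "unknown")


if __name__ == "__main__":
    print(harvest_status_summary({"title": "remote source"}))
    print(harvest_status_summary({"status": {"last_job_status": "Finished"}}))
```

The point is only that a source without a status should fall back to a
default instead of raising an error.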

When I was finally able to see the harvest page again, I saw what likely
caused the error: I run the harvester against another CKAN instance. This
other instance itself has some custom harvesters of mine to load data.
Now it seems that the CKAN-CKAN harvester also harvests the harvest
sources (instead of ignoring them, as I expected). These "new" harvest
sources took down the harvest page because they didn't have a status,
since they actually only exist on my second instance.

I think this is a bug, or am I missing a flag to configure whether or not
to harvest the "harvest" datasets?

If someone can point me in the right direction, I am willing to provide a
pull request to fix it. My guess would be that the API should not return
harvest sources, or that we should introduce a new configuration option for
this.
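
To make the second idea concrete, here is a minimal sketch of such an
option in Python. It assumes each remote dataset arrives as a dict with a
"type" key and that harvest sources carry the type "harvest"; the function
name and the include_harvest_sources flag are hypothetical and not part of
ckanext-harvest:

```python
HARVEST_SOURCE_TYPE = "harvest"


def filter_remote_datasets(package_dicts, include_harvest_sources=False):
    """Drop datasets of type 'harvest' unless explicitly requested.

    package_dicts: iterable of dataset dicts fetched from the remote CKAN.
    include_harvest_sources: hypothetical config flag; False by default so
        that remote harvest sources are ignored.
    """
    if include_harvest_sources:
        return list(package_dicts)
    return [
        pkg for pkg in package_dicts
        if pkg.get("type") != HARVEST_SOURCE_TYPE
    ]


if __name__ == "__main__":
    remote = [
        {"name": "population-2012", "type": "dataset"},
        {"name": "my-custom-source", "type": "harvest"},
    ]
    for pkg in filter_remote_datasets(remote):
        print(pkg["name"])
```

With the flag left at its default, only regular datasets would be imported;
whether the filtering belongs in the harvester or in the API is the open
question.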

Best Regards
Stefan

-- 
Liip AG  //  Feldstrasse 133 //  CH-8004 Zurich
Tel +41 43 500 39 80 // GnuPG 0x7B588C67 // www.liip.ch