[ckan-dev] CKAN harvester harvests datasets with type "harvest"

Stefan Oderbolz stefan.oderbolz at liip.ch
Thu Aug 15 15:07:59 UTC 2013


Hi Adrià,

thanks a lot for your pull request and for merging mine :)
I see that there is a lot of work to be done, and I'll try to help improve things.

We currently rely heavily on the ckanext-harvester and there are a lot of
missing features that we need to implement (like support for
term_translations or harvesting of organizations).
I'm happy to provide my fixes/new features and I'm willing to help with the
refactoring.

Regards Stefan


On Thu, Aug 15, 2013 at 3:59 PM, Adrià Mercader <adria.mercader at okfn.org> wrote:

> Hi Stefan,
>
> You are right, this should be handled somehow. I'm not keen on
> modifying the API to not return them, as this has wider implications
> (and they are datasets after all, so you still want them on the API).
> I think the easiest fix for the time being is to ignore harvest
> sources in the CKAN harvester; I added a small patch that does
> this [1].
> The whole CKAN harvester is a bit dated and would need a bit of a refactor.
>
> Thanks for your other PR, I also merged it.
>
> Adrià
>
>
> [1] https://github.com/okfn/ckanext-harvest/commit/01ca5c0d
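
A minimal sketch of the "ignore harvest sources" idea described above. This is not the code from the commit referenced in [1]; the function name, the sample dicts and the assumption that each dataset dict carries its dataset type under a "type" key are hypothetical and only illustrate the filtering step:

```python
def drop_harvest_sources(package_dicts, source_type="harvest"):
    """Return only the datasets whose 'type' differs from source_type.

    Datasets without a 'type' key are kept, since they cannot be
    harvest sources under the assumption made here.
    """
    return [pkg for pkg in package_dicts if pkg.get("type") != source_type]


# Hypothetical dataset dicts as a remote CKAN instance might return them.
remote_packages = [
    {"name": "population-2012", "type": "dataset"},
    {"name": "remote-harvest-source", "type": "harvest"},
    {"name": "legacy-dataset"},  # no 'type' key at all
]

kept = drop_harvest_sources(remote_packages)
print([pkg["name"] for pkg in kept])
```

Filtering on the harvesting side leaves the API untouched, so the harvest sources are still available to other API consumers.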
>
> On 14 August 2013 12:10, Stefan Oderbolz <stefan.oderbolz at liip.ch> wrote:
> > Hi there,
> >
> > On my CKAN instance I was unable to reach the /harvest page to manage the
> > harvest sources. I ran across a strange error in the logfile:
> >
> > [error] [client XX.XX.XX.XX] Error - <class
> > 'jinja2.exceptions.UndefinedError'>: 'dict object' has no attribute 'status'
> >
> > When I investigated the templates of the ckanext-harvest extension, I found
> > some occurrences of "status" and was finally able to find the ones causing
> > this exception. I added checks to the templates to make them robust against
> > this; here is my pull request:
> > https://github.com/okfn/ckanext-harvest/pull/60
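
A small sketch of how this kind of error can arise in Jinja2 and how a template check avoids it. It does not reproduce the templates or the changes in the pull request; the template strings, the `job_count` field and the sample dicts are hypothetical. It assumes Jinja2's default undefined behaviour, where reading a missing key is tolerated but accessing an attribute on the missing value raises UndefinedError:

```python
from jinja2 import Environment
from jinja2.exceptions import UndefinedError

env = Environment()

# Reads an attribute of source.status without checking that it exists.
unguarded = env.from_string("{{ source.status.job_count }}")

# Checks for source.status first and falls back to a placeholder.
guarded = env.from_string(
    "{% if source.status %}{{ source.status.job_count }}"
    "{% else %}n/a{% endif %}"
)

local_source = {"title": "local", "status": {"job_count": 3}}
remote_source = {"title": "remote"}  # harvested source, no 'status' key


def render_or_error(template, source):
    """Render the template, returning the error text if rendering fails."""
    try:
        return template.render(source=source)
    except UndefinedError as exc:
        return "UndefinedError: %s" % exc


print(render_or_error(unguarded, remote_source))
print(render_or_error(guarded, remote_source))
print(render_or_error(guarded, local_source))
```

The guard keeps the page rendering for sources that lack a status, at the cost of hiding the missing data behind a placeholder.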
> >
> > When I was finally able to see the harvest page again, I saw what likely
> > caused the error: I run the harvester against another CKAN instance. This
> > other instance itself has some custom harvesters of mine to load data.
> > It now seems that the CKAN-CKAN harvester also harvests the harvest
> > sources (instead of ignoring them as I expected). These "new" harvest
> > sources took down the harvest page because they had no status: they
> > actually exist only on my second instance.
> >
> > I think this is a bug, or am I missing a flag to configure whether or not to
> > harvest the "harvest" datasets?
> >
> > If someone can point me in the right direction, I am willing to provide a pull
> > request to fix it. My guess would be that the API should not return harvest
> > sources or that we should introduce a new configuration option for this.
> >
> > Best Regards
> > Stefan
> >
> > --
> > Liip AG  //  Feldstrasse 133 //  CH-8004 Zurich
> > Tel +41 43 500 39 80 // GnuPG 0x7B588C67 // www.liip.ch
> >
> > _______________________________________________
> > ckan-dev mailing list
> > ckan-dev at lists.okfn.org
> > http://lists.okfn.org/mailman/listinfo/ckan-dev
> > Unsubscribe: http://lists.okfn.org/mailman/options/ckan-dev
> >



-- 
Liip AG  //  Feldstrasse 133 //  CH-8004 Zurich
Tel +41 43 500 39 80 // GnuPG 0x7B588C67 // www.liip.ch