[ckan-dev] Need to call ckanext-harvest "run" twice?

Adrià Mercader adria.mercader at okfn.org
Tue Sep 10 09:51:50 UTC 2013


Hi Stefan,

Yes, you are correct: to properly finish a harvest job you need to run
the "run" command twice.
This is due to the harvesting process being completely asynchronous,
as it generally involves big volumes of data.
The first time you run the "run" command, the job is just sent to the
gather queue and the command exits. Harvest objects are then created
for each remote document and sent to another queue. Each object has a
state property, and when its import stage is finished this is set to
"complete" (or "error").
The only way to know if a job has actually finished is to check, at
some point, whether all the objects for that job have one of these
states. The "run" command is a good place to do that, as you will most
likely set it up as a cron job that runs regularly, so finished jobs
get checked for every X minutes.
When developing, this is slightly inconvenient because you have to
remember to run the "run" command again to mark the job as finished and
be able to create another one (and get the last job details indexed).
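
To illustrate the logic described above, here is a minimal sketch in
Python. It is a toy model and not the actual ckanext-harvest code: the
class names, the run() function, the job status values and the initial
"new" object state are all hypothetical. Only the "complete" and "error"
object states come from the description above. The sketch also assumes
that a job with no objects yet is not considered finished.

```python
# Toy model only: these names are hypothetical and are not the
# ckanext-harvest API.
from dataclasses import dataclass, field
from typing import List

# Object states that mean the import stage is over (from the text above).
FINAL_STATES = {"complete", "error"}


@dataclass
class HarvestObject:
    state: str = "new"  # hypothetical initial state


@dataclass
class HarvestJob:
    status: str = "new"  # hypothetical job status values
    objects: List[HarvestObject] = field(default_factory=list)


def all_objects_done(job: HarvestJob) -> bool:
    """True when the job has objects and every one reached a final state."""
    # Assumption: a job without any objects is treated as not finished.
    return bool(job.objects) and all(
        obj.state in FINAL_STATES for obj in job.objects
    )


def run(job: HarvestJob) -> None:
    """Model of a single invocation of the "run" command."""
    if job.status == "new":
        # First call: hand the job to the gather queue and return at once.
        job.status = "running"
    elif job.status == "running" and all_objects_done(job):
        # Later call: every object is complete or errored, so close the job.
        job.status = "finished"


job = HarvestJob()
run(job)  # first "run": the job is queued and the command returns
# ... the queue consumers process the objects in the background ...
job.objects = [HarvestObject("complete"), HarvestObject("error")]
run(job)  # second "run": all objects are done, the job is closed
print(job.status)  # prints "finished"
```

Note that in this model nothing closes the job between the two calls,
which is why a periodic cron job is a natural fit for the check.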

There are some sparse docs here, which we intend to improve soon:

https://github.com/okfn/ckanext-harvest#running-the-harvest-jobs

Hope this helps,

Adrià


On 9 September 2013 12:44, Stefan Oderbolz <stefan.oderbolz at liip.ch> wrote:
> Hi there,
>
> just wanted to know if this is a common practice or if we do something
> wrong:
> In all harvesters (custom ones based on ckanext-harvest as well as
> ckanext-harvest itself) we need to call the paster run command twice in
> order to mark the harvester as finished. At least only after running the
> "run" command for the second time, the job is marked as finished in the web
> interface.
>
> For me, this means I have to run this command always twice. Is that correct?
> If not, what do I have to do to mark a command as "finished"?
>
> Best Regards
> Stefan
>
> --
> Liip AG  //  Feldstrasse 133 //  CH-8004 Zurich
> Tel +41 43 500 39 80 // GnuPG 0x7B588C67 // www.liip.ch
>



