[openbiblio-dev] First cut of AsyncUpload branch

Fri Feb 10 10:54:24 UTC 2012

On 10 February 2012 09:59, Etienne Posthumus <etienne.posthumus at okfn.org> wrote:
> On 10 February 2012 10:05, Rufus Pollock <rufus.pollock at okfn.org> wrote:
>> All sounds great. Just to say, in case you are not aware of it, there
>> is a standard python library for doing async tasks called celery
>
> Yup, I have used celery on other projects.
> It is great, but I felt it was overkill for what we need right now.

Understood -- and it's a tension we've seen on other projects (though
eventually you always seem to end up using -- or reinventing -- celery).

>> Celery provides access to its task queue, there's also stuff like:
>> <https://github.com/ask/celerymon>
>
> Yes, but that is fairly generic. As soon as you want application
> specific things, like: "How far is my download?" or "What went wrong
> with the parse?" you have to customise it anyway.

Agreed.

>>> - Deciding how to make the ingest pipeline a long-running process.
>>> (simple while True: loop?, some form of messaging? polling?)
>>
>> Celery runs a daemon that takes care of this.
>
> Yep, but you also have to install/maintain a message queue. So more
> components/complexity.
> (I prefer Redis in combination with Celery, even though it is not a
> queue per se, but it shines in ease of use, performance and
> versatility)

Understood, though if you already have an RDBMS, Celery can use that as
its broker, so there is no separate message queue to install. We are
already running this kind of setup for other projects.
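
To make the "simple while True: loop" plus "just use the RDBMS" option
concrete, here is a minimal sketch of a database-backed task table with
a polling worker. It is an illustration only, not code from the
AsyncUpload branch: the table layout and the names init_db, enqueue,
claim_next and run_worker are all hypothetical, and it uses the stdlib
sqlite3 module as a stand-in for whatever database is already deployed.
The status and detail columns are where application-specific answers
such as "What went wrong with the parse?" could be recorded.

```python
import sqlite3
import time


def init_db(conn):
    # Hypothetical schema: one row per ingest task.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tasks ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " payload TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'pending',"
        " detail TEXT)"
    )
    conn.commit()


def enqueue(conn, payload):
    # Add a task and return its id so the caller can ask about it later.
    cur = conn.execute("INSERT INTO tasks (payload) VALUES (?)", (payload,))
    conn.commit()
    return cur.lastrowid


def claim_next(conn):
    # Pick the oldest pending task and mark it as running.
    # Assumes a single worker; several workers would need an atomic claim.
    row = conn.execute(
        "SELECT id, payload FROM tasks"
        " WHERE status = 'pending' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    conn.execute("UPDATE tasks SET status = 'running' WHERE id = ?", (row[0],))
    conn.commit()
    return row


def run_worker(conn, handler, poll_interval=1.0, max_idle_polls=None):
    # Long-running polling loop. max_idle_polls exists only so that tests
    # and demos can stop; a real daemon would leave it as None.
    idle = 0
    while True:
        task = claim_next(conn)
        if task is None:
            idle += 1
            if max_idle_polls is not None and idle >= max_idle_polls:
                return
            time.sleep(poll_interval)
            continue
        idle = 0
        task_id, payload = task
        try:
            detail, status = handler(payload), "done"
        except Exception as exc:
            # Keep the error text so the application can report it.
            detail, status = str(exc), "failed"
        conn.execute(
            "UPDATE tasks SET status = ?, detail = ? WHERE id = ?",
            (status, detail, task_id),
        )
        conn.commit()
```

The trade-off is the one discussed above: nothing extra to install, but
retries, concurrency and monitoring all have to be written by hand.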

Rufus