[ckan-dev] Middleware-components

Adrià Mercader adria.mercader at okfn.org
Mon Oct 6 09:03:37 UTC 2014


On 6 October 2014 07:43, Henrik Aagaard Jørgensen <BU1G at tmf.kk.dk> wrote:
> It seems the DataPusher is only designed for already uploaded files that exist in CKAN?


IIRC the current implementation of the DataPusher works against
existing resources; it doesn't matter whether they point to an uploaded
file or an external URL.
This is the first half of what DataPusher does: retrieving the data from
the remote file. It is the half you don't care about, because you'd need
to replace it with your own logic to access different data sources.

What can be useful is the second half, which processes this data
with messytables and pushes it to the DataStore. You could even skip the
messytables bit if you know that the data you are importing is well
structured.
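
As a rough illustration of that second half without messytables, here is a
minimal sketch (not the DataPusher code itself) that takes rows you have
already fetched from your own data source and sends them to the DataStore
through the datastore_create action. The function names, the choice to
store every column as text, and the use of "force" are assumptions made
for the example; adjust them to your CKAN version and data.

```python
import json
import urllib.request


def build_datastore_payload(resource_id, rows):
    """Build a datastore_create payload from a list of dicts.

    Assumes the rows are already well structured (same keys in every row),
    so no type guessing is done: every column is declared as text.
    """
    if not rows:
        raise ValueError("no rows to push")
    fields = [{"id": name, "type": "text"} for name in rows[0]]
    # Convert values to strings so they match the declared text type.
    records = [
        {key: None if value is None else str(value) for key, value in row.items()}
        for row in rows
    ]
    return {
        "resource_id": resource_id,
        "fields": fields,
        "records": records,
        # Assumption: needed when the resource is otherwise read-only
        # in the DataStore; drop it if your setup does not require it.
        "force": True,
    }


def push_to_datastore(ckan_url, api_key, payload):
    """POST the payload to the CKAN action API and return the decoded reply."""
    request = urllib.request.Request(
        ckan_url.rstrip("/") + "/api/3/action/datastore_create",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": api_key},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

You would call build_datastore_payload() with the id of an existing
resource and the rows from your source, then pass the result to
push_to_datastore() together with your site URL and an API key that is
allowed to edit that resource.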

From this line onwards is a good point to start:

https://github.com/ckan/datapusher/blob/master/datapusher/jobs.py#L267

As you mention, this could be really useful functionality for users;
it would be great to see it move forward.

Hope this helps,

Adrià


