[ckan-dev] import.io

Matthew Fullerton matt.fullerton at gmail.com
Thu Oct 13 09:22:28 UTC 2016


Hi Oliver,
The basic extension to be aware of is
https://github.com/ckan/ckanext-harvest. With it you can write scripts that
grab data (e.g. a scraper), push it into CKAN, and have that run
automatically on a regular basis. But you don't want to write scripts, you
want to use import.io.

I don't know of any connection between those two. Import.io is cool but a
very closed system. I recently submitted a small grant proposal to build an
extension to connect https://morph.io with CKAN. morph.io is free and open
source and gives you the ground work for the scraping as well as taking
care of the scheduling. All that is necessary is either an extension to
morph.io ("publish to CKAN") or a hook in CKAN (morph supports triggering a
URL every time a scrape completes) that pulls the latest data from morph.io.
The full text of the proposal is in German but that's the gist of it. It's
something I'd be very interested in working on even if it doesn't get
funded, as I am currently handling this with Amazon Web Services Lambda
functions, which all feels too manual and scattered.
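
A rough sketch of the "pull the latest data from morph.io" half of that
hook, assuming morph.io's data API in the form I understand it to be
documented (https://api.morph.io/<owner>/<scraper>/data.json with "key" and
"query" parameters, and "data" as the default table). The scraper name, API
key and function names are placeholders for illustration, not part of the
proposal:

```python
import json
import urllib.parse
import urllib.request

MORPH_API = "https://api.morph.io"  # assumed base URL of the morph.io data API


def morph_data_url(scraper, api_key, query="select * from data"):
    """Build the URL that returns a scraper's rows as JSON.

    `scraper` is "<owner>/<scraper_name>"; `query` is SQL run against the
    scraper's SQLite database ("data" is assumed to be the table name).
    """
    params = urllib.parse.urlencode({"key": api_key, "query": query})
    return f"{MORPH_API}/{scraper}/data.json?{params}"


def fetch_rows(scraper, api_key, opener=urllib.request.urlopen):
    """Download the latest rows; `opener` can be swapped out for testing."""
    with opener(morph_data_url(scraper, api_key)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

The hook in CKAN would call something like fetch_rows("someone/some_scraper",
"MY_KEY") each time morph.io triggers its URL, then write the rows into a
resource.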

Back to import.io, maybe they also let you call some code or a URL after a
scrape? Writing new data into CKAN programmatically (either new datasets or
rows of data in an existing resource) is quite easy.
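
To give an idea of what "programmatically" looks like, here is a minimal
sketch against CKAN's Action API (package_create for a new dataset,
datastore_upsert for rows in an existing DataStore resource). The site URL,
API key, IDs and helper names are placeholders, and it assumes the
DataStore extension is enabled on the target CKAN:

```python
import json
import urllib.request


def ckan_action_request(ckan_url, action, payload, api_key):
    """Build (but do not send) a POST request for a CKAN action."""
    return urllib.request.Request(
        url=f"{ckan_url.rstrip('/')}/api/3/action/{action}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": api_key},
        method="POST",
    )


def new_dataset(ckan_url, api_key, name, title, owner_org):
    """Request that creates a new, empty dataset."""
    payload = {"name": name, "title": title, "owner_org": owner_org}
    return ckan_action_request(ckan_url, "package_create", payload, api_key)


def append_rows(ckan_url, api_key, resource_id, rows):
    """Request that inserts rows into an existing DataStore resource."""
    payload = {
        "resource_id": resource_id,
        "records": rows,      # list of dicts, one per row
        "method": "insert",   # "upsert" needs a unique key on the table
    }
    return ckan_action_request(ckan_url, "datastore_upsert", payload, api_key)
```

Sending is then just urllib.request.urlopen(append_rows(...)), so anything
that can run code or hit a URL after a scrape could drive it.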

Best,
Matt



On 12 October 2016 at 14:46, Oliver Standeven <os214 at kitc-solutions.co.uk>
wrote:

> Hello all,
>
> I am a University student who has worked with CKAN at my previous place of
> employment (on my placement) and I have suggested CKAN as an option for one
> of the projects that I am now working on in my final year.
>
> The client wants to use import.io to scrape some information from
> websites and list it all in one place. I wondered if anybody has
> experience with import.io, and whether there are any CKAN extensions
> that could get me started on a proof of concept, or whether I would
> have to build that myself. I think CKAN would be great for
> categorizing the data and making it openly available.
>
> Thanks in advance,
>
> Oliver
>