[ckan-dev] harvesting ... Error 404 ... returns incorrect source URL as part of error

Adrià Mercader adria.mercader at okfn.org
Fri Nov 8 17:02:43 UTC 2013


Hi Colum,

The CKAN harvester is designed to harvest entire CKAN instances, not
individual datasets. When creating the harvest source you need to
define the CKAN root URL (e.g. http://data.gov.uk). The harvester
appends the API path to whatever source URL it is given, which is why
'/api/2/rest/package' shows up a second time in your error message.
At the moment it is not possible to restrict the remote datasets to be
harvested, so if you only need to import one of them I suggest using
the API to get it and create it on your local instance.
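
As a rough illustration of the API route, here is a minimal Python 3
sketch. It assumes both instances expose the action API
(package_show / package_create under /api/3/action/) and that the
local API key is sent in the Authorization header. LOCAL_ROOT,
LOCAL_API_KEY, the helper names and the list of fields to drop are all
placeholders of mine, not part of the harvester, so adjust them for
your setup.

```python
import json
import urllib.request

# Hypothetical values: replace with your own instance and API key.
REMOTE_ROOT = "http://data.gov.uk"
LOCAL_ROOT = "http://localhost:5000"
LOCAL_API_KEY = "your-api-key"

# Assumption: fields that refer to objects on the remote instance and
# should not be sent when creating the dataset locally.
REMOTE_ONLY_FIELDS = ("id", "revision_id", "owner_org", "organization")
REMOTE_ONLY_RESOURCE_FIELDS = ("id", "revision_id", "package_id")


def action_url(root, action):
    """Build an action API URL from a CKAN root URL."""
    return root.rstrip("/") + "/api/3/action/" + action


def call_action(root, action, data, api_key=None):
    """POST a JSON dict to a CKAN action and return its 'result'."""
    request = urllib.request.Request(
        action_url(root, action),
        data=json.dumps(data).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    if api_key:
        request.add_header("Authorization", api_key)
    with urllib.request.urlopen(request) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["result"]


def prepare_for_create(dataset):
    """Copy a dataset dict, dropping fields tied to the remote instance."""
    local = {k: v for k, v in dataset.items() if k not in REMOTE_ONLY_FIELDS}
    local["resources"] = [
        {k: v for k, v in res.items() if k not in REMOTE_ONLY_RESOURCE_FIELDS}
        for res in dataset.get("resources", [])
    ]
    return local


def copy_dataset(name):
    """Fetch one dataset from the remote instance and create it locally."""
    remote = call_action(REMOTE_ROOT, "package_show", {"id": name})
    return call_action(
        LOCAL_ROOT, "package_create", prepare_for_create(remote), LOCAL_API_KEY
    )
```

Calling copy_dataset("collections-database") would then fetch that one
dataset and create it locally. Depending on the CKAN versions involved
you may need to drop or remap more fields (groups, for example) before
package_create accepts the dict.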

Adrià

On 4 November 2013 23:19, COLUM MCCOOLE <colum.mccoole at btinternet.com> wrote:
> I'm experimenting with the harvester and trying to harvest across this test
> data-set
> data.gov.uk/api/2/rest/package/collections-database
>
> I use the command line paster to create a job ... and then the sequence of
> gather_consumer, fetch_consumer and run.
>
> It fails on the gathering part ... and as part of the error message it
> returns a URL with '/api/2/rest/package' appended a second time to the end
> of the correct URL. I'm trying to figure out why that is happening and if
> that is ultimately the source of my error.
>
> Thanks,
> Colum
>
> 2013-11-04 22:10:09,274 DEBUG [ckanext.harvest.queue] Received harvest job
> id: 1ccbb0dc-3642-4870-becd-d80bb3815b6b
> 2013-11-04 22:10:09,305 DEBUG [ckanext.harvest.harvesters.ckanharvester] In
> CKANHarvester gather_stage
> (http://data.gov.uk/api/2/rest/package/collections-database)
> 2013-11-04 22:10:10,016 ERROR [ckanext.harvest.harvesters.base] Unable to
> get content for URL:
> http://data.gov.uk/api/2/rest/package/collections-database/api/2/rest/package:
> HTTP Error 404: Not Found
> 2013-11-04 22:10:10,022 ERROR [ckanext.harvest.queue] Gather stage failed
