[ckan-dev] Datastore and linked-to resources that no longer exist

Sean Hammond sean.hammond at okfn.org
Mon Mar 25 11:16:37 UTC 2013


Hey all,

Take a look at this dataset on publicdata.eu:

http://publicdata.eu/dataset/ministerial-data-cabinet-office

If you click on any of the resources you'll get an error:

Could not load preview: DataProxy returned an error (Data transformation
failed. HTTPError: HTTP Error 404: Not Found)

and if you try to download any of the resource files from the source
site you'll find they no longer exist, e.g.:

http://www.cabinetoffice.gov.uk/sites/default/files/resources/pm-meetings.csv

This relates to the new work being done on the new datastorer service
(data pusher is its current name, I think) and the new datastorer
paster command/cron job:

I'm not sure how we intend to deal with this problem in CKAN: a resource
file is linked to, and then the source file on the remote site moves or
disappears. Once the datastorer service and script are sorted out and
deployed, a resource file like this would be pulled into the datastore
and so could be previewed from the datastore. But what should the
datastorer do when it finds that the original source file is gone?
Should it leave the data in the datastore, so that the preview and data
API keep working? Or should it delete the data in the datastore, and
have the resource page display a clear error message saying that the
source file is no longer there?
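
To make the two options concrete, here is a rough sketch in Python of
what the check could look like. It is not the actual datastorer code:
the function names, the policy constants and the resource keys
(source_missing, has_datastore_data) are all made up for illustration,
and it only treats 404 and 410 as "gone" so that a temporary outage
doesn't trigger anything.

```python
import urllib.error
import urllib.request

# Hypothetical policy names for the two options discussed above.
KEEP = "keep"      # leave the data in the datastore, flag the resource
DELETE = "delete"  # drop the datastore data, resource page shows an error


def source_exists(url, opener=urllib.request.urlopen):
    """Return False if the remote server answers 404 or 410, else True."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        opener(request, timeout=10).close()
        return True
    except urllib.error.HTTPError as err:
        if err.code in (404, 410):
            return False
        raise  # other errors may be temporary, so don't decide here


def handle_resource(resource, policy, exists=source_exists):
    """Apply the chosen policy to one resource dict; return the action."""
    if policy not in (KEEP, DELETE):
        raise ValueError("unknown policy: %r" % (policy,))
    if exists(resource["url"]):
        return "refresh"  # source still there: re-import as usual
    resource["source_missing"] = True  # hypothetical flag for the UI
    if policy == KEEP:
        return "kept_stale"  # preview and data API keep working
    resource["has_datastore_data"] = False  # stand-in for a real delete
    return "deleted"
```

With KEEP the resource page could still show a "source file no longer
available" notice based on the flag, so the two options aren't
completely exclusive.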



