[ckan-dev] Datastore and linked-to resources that no longer exist

Darwin Peltan darwin.peltan at okfn.org
Fri Apr 26 17:33:14 UTC 2013


I'd be +1 for removing the dataproxy and prompting the user to download rather
than the current behaviour where the user sees a failed attempt to
preview. Is there any reason to keep the dataproxy?
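
To make the proposal concrete, here is a minimal sketch of the resource page
logic with the dataproxy removed. It is only an illustration: `preview_action`
and the `datastore_has` lookup are hypothetical names, not existing CKAN
functions, and the resource is assumed to be a dict with `id` and `url` keys.

```python
def preview_action(resource, datastore_has):
    """Decide what the resource page shows once the dataproxy is gone.

    `datastore_has` is a hypothetical callable: resource id -> bool.
    """
    if datastore_has(resource["id"]):
        # Data was loaded successfully, so preview it from the datastore.
        return {"action": "preview", "source": "datastore"}
    # Not in the datastore: skip the preview and point at the original file.
    return {"action": "download", "url": resource["url"]}


# Stand-in for the datastore: the set of resource ids that were loaded.
loaded = {"res-1"}
print(preview_action({"id": "res-1", "url": "http://example.org/a.csv"},
                     loaded.__contains__))
print(preview_action({"id": "res-2", "url": "http://example.org/b.csv"},
                     loaded.__contains__))
```

With this shape the user never sees a failed preview; the worst case is a
download link to a file that may itself be gone.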

Darwin Peltan
Project Manager

The Open Knowledge Foundation
http://www.okfn.org

Skype: darwinp
Twitter: @darwin


On 26 April 2013 08:47, Sean Hammond <sean.hammond at okfn.org> wrote:

> Good thoughts Nigel. Actually, I wonder if it would be worth adding this
> to this wiki page?
>
>
> https://github.com/okfn/ckan/wiki/Spec:-DataStore-and-FileStore-Consolidation
>
> On Thu, Apr 18, 2013 at 07:58:50PM +0530, Nigel Babu wrote:
> > Ah, this is also connected to the dataproxy discussions.
> >
> > Currently, there's no way for CKAN to know that an attempt to load
> > something into the datastore failed. Almost all the failures I've
> > noticed in production are because of the file, i.e. not a CKAN/datastore
> > issue. When a user attempts to preview the resource, there will be
> > nothing in the datastore for this resource, so CKAN attempts to use the
> > dataproxy and fails.
> >
> > This gives us two options:
> > 1) Remove dataproxy. If a file isn't in the datastore, there's something
> > wrong with it and it can't be loaded. Offer a download link instead.
> > 2) Have a way to mark that an attempt to load a file into the
> > datastore failed. If the file isn't in the datastore and this
> > failure is marked, the dataproxy should not attempt a preview; only a
> > download link should be offered.
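
Option 2 can be sketched as a three-way decision. This is an illustration
only: `choose_preview` and the `load_failed` flag are hypothetical, and where
such a failure marker would be stored is not settled in this thread.

```python
def choose_preview(in_datastore, load_failed):
    """Pick where the preview comes from under option 2.

    `load_failed` is a hypothetical marker recording that loading the
    file into the datastore was attempted and failed.
    """
    if in_datastore:
        return "datastore"
    if load_failed:
        # The file is known to be bad, so do not try the dataproxy.
        return "download"
    # No load attempt recorded yet: the dataproxy may still work.
    return "dataproxy"


for in_ds, failed in [(True, False), (False, True), (False, False)]:
    print(in_ds, failed, "->", choose_preview(in_ds, failed))
```

The difference from option 1 is the last branch: the dataproxy is kept, but
only for resources with no recorded load failure.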
> >
> > The datastore_upload[1] script will ignore the resource if the download
> > of the resource fails. That means if there was an existing entry in the
> > datastore, it will continue to remain in the datastore. If a file was
> > updated, the datastore will be updated on the next run. The delete
> > behaviour may not be entirely appropriate and should probably be
> > discussed further.
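
The skip-on-failure behaviour described above could look roughly like this.
It is a simplified sketch, not the actual code in commands.py: `sync_resource`
and `fetch` are hypothetical names, and the datastore is modelled as a dict.

```python
def sync_resource(resource_id, fetch, datastore):
    """Refresh one resource, keeping the old data if the download fails.

    `fetch` is a hypothetical callable that returns the parsed rows for a
    resource id, or raises IOError when the source file is unavailable.
    """
    try:
        rows = fetch(resource_id)
    except IOError:
        # Download failed: ignore the resource, existing entry stays put.
        return "skipped"
    # Simplification: overwrite on every successful run, changed or not.
    datastore[resource_id] = rows
    return "updated"


def missing(resource_id):
    raise IOError("HTTP Error 404: Not Found")


store = {"res-1": [{"meeting": "old"}]}
print(sync_resource("res-1", missing, store), store["res-1"])
print(sync_resource("res-1", lambda rid: [{"meeting": "new"}], store),
      store["res-1"])
```

Note that nothing here ever deletes an entry, which is the part that probably
needs further discussion.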
> >
> > [1] https://github.com/okfn/ckanext-datastorer/blob/master/ckanext/datastorer/commands.py#L193
> >
> > Nigel.
> >
> >
> >
> > On 18 April 2013 19:07, Sean Hammond <sean.hammond at okfn.org> wrote:
> >
> > > > Take a look at this dataset on publicdata.eu:
> > > >
> > > > http://publicdata.eu/dataset/ministerial-data-cabinet-office
> > > >
> > > > If you click on any of the resources you'll get an error:
> > > >
> > > > Could not load preview: DataProxy returned an error (Data transformation
> > > > failed. HTTPError: HTTP Error 404: Not Found)
> > > >
> > > > and if you try to download any of the resource files from the source
> > > > site you'll find they no longer exist, eg:
> > > >
> > > > http://www.cabinetoffice.gov.uk/sites/default/files/resources/pm-meetings.csv
> > > >
> > > > Related to the new work that's being done around the new datastorer
> > > > service (data pusher is its current name I think) and the new
> > > > datastorer paster command/cron job:
> > > >
> > > > I'm not sure how we intend to deal with this problem in CKAN -- when a
> > > > resource file is linked to, and then the source file on the remote
> > > > site moves or disappears. Once we have the datastorer service and
> > > > script stuff sorted out, then it can be deployed and a resource file
> > > > like this would have been pulled into the datastore so could be
> > > > previewed from the datastore. But what should the datastorer do, when
> > > > it finds that the original source file is gone? Should it leave the
> > > > data in the datastore, so that preview and data API keep working? Or
> > > > should it delete the data in the datastore, and have the resource page
> > > > display some clear error message that says the source file is no
> > > > longer there?
> > >
> > > Ping. This seems related to the datapusher discussion we had this
> > > morning.
> > >
> > > _______________________________________________
> > > ckan-dev mailing list
> > > ckan-dev at lists.okfn.org
> > > http://lists.okfn.org/mailman/listinfo/ckan-dev
> > > Unsubscribe: http://lists.okfn.org/mailman/options/ckan-dev
> > >