[ckan-dev] Datastore and linked-to resources that no longer exist

Nigel Babu nigel.babu at okfn.org
Sun Apr 28 21:18:45 UTC 2013


Hi Ross,

We're talking about removing the dataproxy extension from core CKAN, if
that wasn't clear. We'll still have it as a non-core extension.  The
dataproxy instance will not be affected.


On 26 April 2013 23:59, Ross Jones <ross at servercode.co.uk> wrote:

> Hi Darwin,
>
> I understand you're talking about removing the _use_ of the
> jsonpdataproxy, which sounds sensible, but removing jsonpdataproxy itself
> might cause issues (although no longer for data.gov.uk).  Of course you
> could always fix the dataproxy as it has value as a stand-alone thing
> (unless there's a huge objection or cost to using appengine) ;).
>
> Prior to 2.0 the application.js shipping with 1.x was using
> jsonpdataproxy.appspot.com, so removing the app from appspot is likely to
> break other CKAN installations.  Of course it's a free service, but it
> might be best if there was an alternative for users who aren't yet ready to
> upgrade to 2.0 before turning off jsonpdataproxy and potentially breaking
> 1.x applications.
>
> Ross
>
>
> On 26 Apr 2013, at 18:33, Darwin Peltan wrote:
>
> I'd be +1 for removing the dataproxy and prompting the user to download
> rather than the current behaviour where the user sees a failed attempt to
> preview. Is there any reason to keep the dataproxy?
>
> Darwin Peltan
> Project Manager
>
> The Open Knowledge Foundation
> http://www.okfn.org
>
> Skype: darwinp
> Twitter: @darwin
>
>
> On 26 April 2013 08:47, Sean Hammond <sean.hammond at okfn.org> wrote:
>
>> Good thoughts Nigel. Actually, I wonder if it would be worth adding this
>> to this wiki page?
>>
>>
>> https://github.com/okfn/ckan/wiki/Spec:-DataStore-and-FileStore-Consolidation
>>
>> On Thu, Apr 18, 2013 at 07:58:50PM +0530, Nigel Babu wrote:
>> > Ah, this is also connected to the dataproxy discussions.
>> >
>> > Currently, there's no way for CKAN to know that an attempt to load
>> > something into the datastore was made and failed. Almost all the
>> > failures I've noticed in production are because of the file, i.e. not a
>> > CKAN/datastore issue.  When a user attempts to preview the resource,
>> > there will be nothing in the datastore for this resource, so CKAN
>> > attempts to use the dataproxy and fails.
>> >
>> > This gives us two options:
>> > 1) Remove dataproxy. If a file isn't in the datastore, there's something
>> > wrong with it and it can't be loaded. Offer a download link instead.
>> > 2) Have a way to mark that an attempt to load a file into the datastore
>> > failed. If the file isn't in the datastore and this failure is marked,
>> > the dataproxy preview should not be attempted and only a download link
>> > should be offered.
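To make the two options concrete, here is a rough sketch of the preview decision. It is not CKAN code: the function, the enum and the `load_failed` / `dataproxy_enabled` flags are hypothetical names used only for illustration. Option 1 corresponds to `dataproxy_enabled=False`; option 2 corresponds to keeping the dataproxy but honouring a recorded load failure.

```python
from enum import Enum


class PreviewAction(Enum):
    DATASTORE_PREVIEW = "datastore_preview"
    DATAPROXY_PREVIEW = "dataproxy_preview"
    DOWNLOAD_LINK = "download_link"


def choose_preview_action(in_datastore: bool,
                          load_failed: bool,
                          dataproxy_enabled: bool) -> PreviewAction:
    """Decide what the resource page should offer (hypothetical helper)."""
    if in_datastore:
        # Data was loaded successfully at some point: preview from the datastore.
        return PreviewAction.DATASTORE_PREVIEW
    if not dataproxy_enabled:
        # Option 1: no dataproxy, so anything missing from the datastore
        # only gets a download link.
        return PreviewAction.DOWNLOAD_LINK
    if load_failed:
        # Option 2: a recorded load failure means the file itself is the
        # problem, so do not send it to the dataproxy either.
        return PreviewAction.DOWNLOAD_LINK
    # Not in the datastore, no recorded failure: fall back to the dataproxy.
    return PreviewAction.DATAPROXY_PREVIEW
```

With option 2 the only extra state CKAN needs is the failure marker; everything else is already known when the resource page is rendered.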
>> >
>> > The datastore_upload[1] script will ignore the resource if the download
>> > of the resource fails. That means if there was an existing entry in the
>> > datastore, it will continue to remain in the datastore. If a file was
>> > updated, the datastore will be updated on the next run.  The delete
>> > behaviour may not be entirely appropriate and should probably be
>> > discussed further.
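The behaviour described above can be sketched roughly as follows. This is a simplified illustration, not the actual ckanext-datastorer code: `sync_resources`, the `fetch` callable (assumed to return `None` when the download fails) and the dict standing in for the datastore are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Optional

Rows = List[dict]


def sync_resources(resource_ids: Iterable[str],
                   fetch: Callable[[str], Optional[Rows]],
                   datastore: Dict[str, Rows]) -> List[str]:
    """Refresh the datastore in place and return the ids that were skipped."""
    skipped = []
    for resource_id in resource_ids:
        rows = fetch(resource_id)  # assumption: None means the download failed
        if rows is None:
            # Download failed: ignore the resource, so any existing entry
            # stays in the datastore untouched.
            skipped.append(resource_id)
            continue
        # Download succeeded: the stored data is replaced on this run.
        datastore[resource_id] = rows
    return skipped
```

Returning the skipped ids is one possible place to hook in a failure marker or a delete policy, whichever way that discussion goes.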
>> >
>> > [1] https://github.com/okfn/ckanext-datastorer/blob/master/ckanext/datastorer/commands.py#L193
>> >
>> > Nigel.
>> >
>> >
>> >
>> > On 18 April 2013 19:07, Sean Hammond <sean.hammond at okfn.org> wrote:
>> >
>> > > > Take a look at this dataset on publicdata.eu:
>> > > >
>> > > > http://publicdata.eu/dataset/ministerial-data-cabinet-office
>> > > >
>> > > > If you click on any of the resources you'll get an error:
>> > > >
>> > > > Could not load preview: DataProxy returned an error (Data
>> > > > transformation failed. HTTPError: HTTP Error 404: Not Found)
>> > > >
>> > > > and if you try to download any of the resource files from the source
>> > > > site you'll find they no longer exist, e.g.:
>> > > >
>> > > > http://www.cabinetoffice.gov.uk/sites/default/files/resources/pm-meetings.csv
>> > > >
>> > > > Related to the new work that's being done around the new datastorer
>> > > > service (data pusher is its current name I think) and the new
>> > > > datastorer paster command/cron job:
>> > > >
>> > > > I'm not sure how we intend to deal with this problem in CKAN -- when
>> > > > a resource file is linked to, and then the source file on the remote
>> > > > site moves or disappears. Once we have the datastorer service and
>> > > > script stuff sorted out, then it can be deployed and a resource file
>> > > > like this would have been pulled into the datastore so could be
>> > > > previewed from the datastore. But what should the datastorer do, when
>> > > > it finds that the original source file is gone? Should it leave the
>> > > > data in the datastore, so that preview and data API keep working? Or
>> > > > should it delete the data in the datastore, and have the resource
>> > > > page display some clear error message that says the source file is no
>> > > > longer there?
>> > >
>> > > Ping. This seems related to the datapusher discussion we had this
>> > > morning.
>> > >
>> > > _______________________________________________
>> > > ckan-dev mailing list
>> > > ckan-dev at lists.okfn.org
>> > > http://lists.okfn.org/mailman/listinfo/ckan-dev
>> > > Unsubscribe: http://lists.okfn.org/mailman/options/ckan-dev
>> > >