[ckan-dev] linking data in private S3 buckets

Ian Ward ian at excess.org
Wed Dec 17 13:45:12 UTC 2014


On Wed, Dec 17, 2014 at 5:50 AM, Anton Lundin <anton at dohi.se> wrote:
> On 24 February, 2014 - Nigel Babu wrote:
>
>> We don't have a timetable yet. It's still in the planning stage. We will
>> definitely ask for comments on the list when we have a solid plan of how
>> it's going to be implemented. You'll also be able to follow the bug when we
>> start work.
>
> Started to take a look at this again. Is there any plans on how to
> re-introduce other file storages than local filesystem again?
...
> Another use case that this move to local-only-storage breaks is if you
> would like to have a scaling farm of webservers. Then you would need to
> involve a network filesystem to keep the filestorage consistent across
> all webservers.
> It also moves the heavy lifting, actually sending the files, to the
> webserver, away from the storage solution optimized for this exact task.

I agree. Passing files through the web server isn't ideal.

Unfortunately, when users upload files to a private dataset they
expect those files to be kept private. I don't know how to guarantee
that when the files are stored on S3 or another remote service.

When the old code was removed I remember it was suggested we could
add a plugin interface that would allow moving the file to a remote
service as a queued background task, then when that is complete update
the link in the resource. That approach should still work, and allow
things like the datapusher to continue to work as well. Uploading to a
remote service would have to be disabled for non-public datasets,
though.
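
To make that concrete, here is a rough sketch of the queued approach.
Everything in it is hypothetical: Resource, upload_to_remote and the
job queue are stand-ins I made up for illustration, not CKAN's model
or any existing plugin interface, and the "upload" only invents a URL
so the sketch can run.

```python
import queue
from dataclasses import dataclass


@dataclass
class Resource:
    # Hypothetical stand-in for a CKAN resource; not the real model.
    id: str
    url: str
    local_path: str
    private: bool


def upload_to_remote(local_path):
    # Placeholder for a real storage client (S3 or similar).
    # It only builds a made-up URL so the sketch is runnable.
    return "https://remote.example/" + local_path.lstrip("/")


jobs = queue.Queue()


def enqueue_if_public(resource):
    # Files in private datasets stay on local storage.
    if resource.private:
        return False
    jobs.put(resource)
    return True


def run_pending_jobs():
    # A background worker would do this; it runs inline here for simplicity.
    moved = []
    while not jobs.empty():
        resource = jobs.get()
        # The link is updated only after the upload call has returned.
        resource.url = upload_to_remote(resource.local_path)
        moved.append(resource.id)
    return moved
```

Until the job runs, the resource keeps pointing at the local copy, so
anything that reads the resource URL would still find the file.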

Even better would be to allow uploads directly to that remote service.
That would be a trickier interface to build (recognising incomplete
uploads, etc), and it's not clear how to support private datasets, but
it should perform much better and be much simpler on the web server
side.
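
One possible way to recognise incomplete uploads, sketched under heavy
assumptions: the web server records the size the client says it will
send, and only switches the resource link once the remote service
reports an object of that size. PendingUpload and confirm_upload are
hypothetical names, and how the reported size is obtained from the
remote service is left out entirely.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PendingUpload:
    # Hypothetical record kept by the web server for a direct upload.
    resource_id: str
    remote_url: str
    expected_size: int
    complete: bool = False


def confirm_upload(pending: PendingUpload, reported_size: Optional[int]) -> bool:
    # reported_size is what the remote service says it holds,
    # or None if it has no object at that location yet.
    pending.complete = reported_size == pending.expected_size
    return pending.complete
```

This does nothing for the private dataset problem; it only covers the
bookkeeping for uploads that never finish.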

Are you interested in contributing development in one of these directions?

Ian


