[ckan-dev] linking data in private S3 buckets

Nigel Babu nigel.babu at okfn.org
Thu Feb 20 04:14:48 UTC 2014


I'd also like to add that the old filestore had concurrency problems when
storing locally. A few people reported this, and it could potentially have
lost some data. This change removed that possibility.

The storage API was implemented as a library that gave you a choice of 3
backends, which is unlike what CKAN usually does. We would usually provide
an interface, implement one backend against it, and provide hooks for other
backends. We're thinking of going this route in the future. When we have a
concrete idea for the implementation, I'll start a thread on the mailing
list for discussion.
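
To make the "one interface, one built-in backend, hooks for others" idea a
bit more concrete, here is a minimal sketch. It is only an illustration:
every name in it (StorageBackend, LocalStorage, register_backend,
get_backend) is hypothetical and is not CKAN's actual API or a committed
design.

```python
import os
from abc import ABC, abstractmethod

# NOTE: all names below are hypothetical, for illustration only.
# None of this is CKAN's real API.


class StorageBackend(ABC):
    """Interface that every storage backend would implement."""

    @abstractmethod
    def save(self, key: str, data: bytes) -> None:
        """Store data under key, replacing any existing content."""

    @abstractmethod
    def load(self, key: str) -> bytes:
        """Return the content stored under key."""

    @abstractmethod
    def delete(self, key: str) -> None:
        """Remove the content stored under key."""


class LocalStorage(StorageBackend):
    """The single built-in backend: plain files under one directory."""

    def __init__(self, root: str) -> None:
        self.root = root
        os.makedirs(root, exist_ok=True)

    def _path(self, key: str) -> str:
        return os.path.join(self.root, key)

    def save(self, key: str, data: bytes) -> None:
        with open(self._path(key), "wb") as f:
            f.write(data)

    def load(self, key: str) -> bytes:
        with open(self._path(key), "rb") as f:
            return f.read()

    def delete(self, key: str) -> None:
        os.remove(self._path(key))


# The "hook": extensions add their own backends to this registry by name.
_BACKENDS = {"local": LocalStorage}


def register_backend(name: str, cls: type) -> None:
    """Register an extra backend class (e.g. from an extension)."""
    if not issubclass(cls, StorageBackend):
        raise TypeError("backend must subclass StorageBackend")
    _BACKENDS[name] = cls


def get_backend(name: str, **options) -> StorageBackend:
    """Instantiate the backend registered under name."""
    return _BACKENDS[name](**options)
```

With something like this, core code would only ever call
get_backend("local", root=...), and an S3 extension could call
register_backend("s3", ...) with its own class without core having to
maintain it.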

I'd like to apologize for the not-great communication about the filestore
changes and I'll make sure we try to be better in the future.

Nigel Babu

Developer  |  @nigelbabu <https://twitter.com/nigelbabu>

The Open Knowledge Foundation <http://okfn.org/>

Empowering through Open Knowledge

http://okfn.org/  |  @okfn <http://twitter.com/OKFN>  |  OKF on
Facebook <https://www.facebook.com/OKFNetwork>  |
Blog <http://blog.okfn.org/>  |  Newsletter <http://okfn.org/about/newsletter>

CKAN | http://ckan.org/ | @CKANproject
<http://twitter.com/CKANproject> | the world’s leading open-source data
portal platform


On 20 February 2014 00:06, David Raznick <david.raznick at okfn.org> wrote:

> On 19 February 2014 14:37, Fawcett, David (MNIT)
> <David.Fawcett at state.mn.us> wrote:
> > Nigel,
> >
> >
> >
> > This seems like a big functionality change for a 'minor' release.  I
> > don't want to question this specific case, but I think that it is
> > useful for the developers to think about their strategy for adding and
> > changing functionality and how that affects maintainability for the
> > user community.
> >
>
> We are normally very good at not deprecating functionality and really
> did not take this decision lightly.   However, we thought we
> absolutely needed to make this change.  We spend a lot of time and
> effort making sure we keep backwards compatibility in all cases.
>
> For this particular case the following points should be noted about
> this upgrade.
>
> *  The old APIs are kept, so accessing previously uploaded files and
> even uploading using the old API should not be broken.  If they are
> broken then this is a regression and we would definitely try to fix
> it.
> *  The only real change that was made was to make the user interface
> use the new upload mechanism.  It should be possible to resurrect that
> old UI if absolutely needed but we generally only maintain a single
> UI.
> *  We added a migration path; see the docs here:
>
> http://docs.ckan.org/en/latest/maintaining/filestore.html#migration-from-2-1-to-2-2
> This is also mentioned in the release notes.
> *  We have been trying to find other ways to use external services
> like S3, for example https://github.com/ckan/ckanext-s3archive
> (which I have not had the time to officially announce yet), which I
> think is currently the best solution to this problem.
> *  If using local storage you can keep using your old setting and it
> works seamlessly.
>
> The main reasons for this change were that, before 2.2, uploaded data:
>
> * Was never private.
> * Had no mechanism for being deleted or replaced.
>
> These were top-priority features for many people and are expected
> behaviour for most.  The developers working on this could not
> find a good general solution for this with our old implementation
> (which also supported many other backends that we could not support
> with these features) so we felt we had to come up with a new way,
> trying to make migrating as easy as possible.
>
> >
> > One cool thing about CKAN is that it is a healthy project.  Quite a few
> > committers, frequent commits and new features.  The flip side is that
> > some of these new features and fixes are not backwards-compatible
> > between minor versions.
>
> I do not really see us making another major release for many years, or
> potentially ever, so all changes have to be in these releases.  We do
> patch at least 2 versions behind the latest and would hope to get any
> feature regressions fixed by that point.
>
> > This makes it difficult for people who are implementing big
> > enterprise systems based on CKAN, particularly the organizations that we
> > really want to see sharing their data, governments. Somewhere in there
> > is a balance.
> >
> >
> >
> > CKAN is gaining significant market share in the open data space, and
> > that is great.  If people decide that it is too hard to maintain an
> > instance because minor version upgrades break their install, I expect
> > to see fewer people choosing CKAN for their new projects.
> >
> >
> >
> > Don't get me wrong.  I like CKAN and we are happy that it is there for
> > us to use in our project.  I just want to suggest that more thought and
> > communication is put into when major and breaking changes are released.
> > (The publishing of the weekly meeting notes is a great start.)
>
> I absolutely agree that our communication about this issue could have
> been better (so I am hoping to make up for this a bit now).  We did not
> actually consider it a breaking change, especially for local storage.
>
> Also, we should have done a bit more of a survey about who was using
> the S3 backend before we made this change, as we had not heard of
> anyone actually using it before this point.  So apologies there to
> all those who are affected.
>
> Thanks
>
> David
>
> >
> > Thanks,
> >
> > David.
> >
> > From: ckan-dev [mailto:ckan-dev-bounces at lists.okfn.org] On Behalf Of
> > Nigel Babu
> > Sent: Wednesday, February 19, 2014 4:20 AM
> > To: CKAN Development Discussions
> > Subject: Re: [ckan-dev] linking data in private S3 buckets
> >
> >
> >
> > Hello Ivan,
> >
> > In CKAN 2.2 and above, we removed the support for external filestores.
> > Only local filestores are supported. The old implementation was causing
> > more trouble than it was worth. We will, in the future, build an
> > interface for extensions to support multiple external filestores.
> >
> >
> > Nigel Babu
> >
> >
> > On 12 February 2014 21:06, Ivan <vanzaj at gmail.com> wrote:
> >
> > Hello,
> >
> > Sorry if I'm missing something obvious. I can't find any info in the
> > docs, wikis, GitHub issues, or elsewhere.
> > Is there a way to create a private dataset linked to a file stored in a
> > private S3 bucket?
> >
> > I have ofs.aws_access_key_id and ofs.aws_secret_access_key in my
> > <deploy>.ini, but it doesn't seem to be enough (I know it's not an auth
> > issue as s3cmd with the same keys from the same host works fine). This
> > is on CKAN 2.3a.
> >
> > thanks,
> > Ivan
> >
> >
> > _______________________________________________
> > ckan-dev mailing list
> > ckan-dev at lists.okfn.org
> > https://lists.okfn.org/mailman/listinfo/ckan-dev
> > Unsubscribe: https://lists.okfn.org/mailman/options/ckan-dev