[ckan-dev] linking data in private S3 buckets

David Raznick david.raznick at okfn.org
Wed Feb 19 18:36:40 UTC 2014


On 19 February 2014 14:37, Fawcett, David (MNIT)
<David.Fawcett at state.mn.us> wrote:
> Nigel,
>
>
>
> This seems like a big functionality change for a  'minor' release.  I don't
> want to question this specific case, but I think that it is useful for the
> developers to think about their strategy for adding and changing
> functionality and how that affects maintainability for the user community.
>

We are normally very good at not deprecating functionality and really
did not take this decision lightly.   However, we thought we
absolutely needed to make this change.  We spend a lot of time and
effort making sure we keep backwards compatibility in all cases.

For this particular case the following points should be noted about
this upgrade.

*  The old APIs are kept, so accessing previously uploaded files and
even uploading using the old API should not be broken.  If they are
broken then this is a regression and we would definitely try to fix
it.
*  The only real change that was made was to make the user interface
use the new upload mechanism.  It should be possible to resurrect that
old UI if absolutely needed but we generally only maintain a single
UI.
*  We added a migration path; see the docs here:
http://docs.ckan.org/en/latest/maintaining/filestore.html#migration-from-2-1-to-2-2
This is also mentioned in the release notes.
*  We have been trying to find other ways to use external services
like S3, for example https://github.com/ckan/ckanext-s3archive
(which I have not had the time to officially announce yet), which I
think is currently the best solution to this problem.
*  If using local storage you can keep using your old setting and it
works seamlessly.

The main reasons for this change were that before 2.2:

* Uploaded data was never private.
* There was no mechanism to delete or replace existing uploaded data.

These were top-priority features for many people and are expected
behaviour for most.  The developers working on this could not
find a good general solution for this with our old implementation
(which also supported many other backends that we could not support
with these features) so we felt we had to come up with a new way,
trying to make migrating as easy as possible.

>
> One cool thing about CKAN is that it is a healthy project.  Quite a few
> committers, frequent commits and new features.  The flip side is that some
> of these new features and fixes are not backwards-compatible between minor
> versions.

I do not really see us making another major release for many years, or
potentially ever, so all changes have to be in these releases.  We do
patch at least 2 versions behind the latest and would hope to get any
feature regressions fixed by that point.

> This makes it difficult for people who are implementing big
> enterprise systems based on CKAN, particularly the organizations that we
> really want to see sharing their data, governments. Somewhere in there is a
> balance.
>
>
>
> CKAN is gaining significant market share in the open data space, and that is
> great.  If people decide that it is too hard to maintain an instance because
> minor version upgrades break their install, I expect to see fewer people
> choosing CKAN for their new projects.
>
>
>
> Don't get me wrong.  I like CKAN and we are happy that it is there for us to
> use in our project.  I just want to suggest that more thought and
> communication is put into when major and breaking changes are released.
> (The publishing of the weekly meeting notes is a great start.)

I absolutely agree that our communication about this issue could have
been better (so I am hoping to make up for this a bit now).  We did not
actually consider this a breaking change, especially for local storage.

Also we should have done a bit more of a survey about who was using
the S3 backend before we made this change, as we had not heard of
anyone actually using it before this point.  So apologies to
all those who are affected.

Thanks

David



>
>
>
> Thanks,
>
>
>
> David.
>
>
>
>
>
>
>
> From: ckan-dev [mailto:ckan-dev-bounces at lists.okfn.org] On Behalf Of Nigel
> Babu
> Sent: Wednesday, February 19, 2014 4:20 AM
> To: CKAN Development Discussions
> Subject: Re: [ckan-dev] linking data in private S3 buckets
>
>
>
> Hello Ivan,
>
> On ckan 2.2 and above, we removed the support for external filestores. Only
> local filestores are supported. The old implementation was causing more
> trouble than it's worth. We will, in the future, build an interface for
> extensions to support multiple external filestores.
>
>
> Nigel Babu
>
> Developer  |  @nigelbabu
>
> The Open Knowledge Foundation
>
> Empowering through Open Knowledge
>
> http://okfn.org/  |  @okfn  |  OKF on Facebook  |  Blog  |  Newsletter
>
>
>
> CKAN | http://ckan.org/ | @CKANproject | the world's leading open-source
> data portal platform
>
>
>
> On 12 February 2014 21:06, Ivan <vanzaj at gmail.com> wrote:
>
> Hello,
>
> Sorry if I'm missing something obvious. I can't find any info in the docs,
> wikis, github issues, or elsewhere.
> Is there a way to create a private dataset linked to a file stored in a
> private S3 bucket?
>
> I have ofs.aws_access_key_id, and ofs.aws_secret_access_key in my
> <deploy>.ini, but it doesn't seem to be enough (i know it's not an auth
> issue as s3cmd with the same keys from the same host works fine). This is on
> ckan 2.3a.
>
> thanks,
> Ivan
>
>
> _______________________________________________
> ckan-dev mailing list
> ckan-dev at lists.okfn.org
> https://lists.okfn.org/mailman/listinfo/ckan-dev
> Unsubscribe: https://lists.okfn.org/mailman/options/ckan-dev
>
>
>
>


