[ckan-dev] Future, flask, breaking things, funding.

Steven De Costa steven.decosta at linkdigital.com.au
Mon Sep 14 21:11:10 UTC 2015


I'm 'all in' on this discussion :) I'll set up a Doodle and we can pick a
time to do a video call...

My 2c on some points.

1. Perhaps the redevelopment could be bottom-up. Start with resources and widen
its abilities. CRUD can then be rebuilt over the top.
2. Carefully consider the longest term possible and how the app may mature
in the future.
3. Consider interoperability between n+1 platforms via linked open data,
again with realtime in mind
4. Consider packages further. Could we add new package types that are built
on 3.0 thinking and have them coexist with current packages? If so then
existing extensions could be modified less dramatically to apply only to v2
packages.
5. Think about migration scenarios. Could a v2 CKAN remain as a dumb web
app harvesting from a 3.0? If so, we could prioritise workflows around
custodians and ETL before end users.
6. Yes I'm sure others in the steering group would support the work. Just
remember they are also just volunteers :)
7. Yes I'm sure funding could come from the Association, just so long as
funding first goes into the association. So, we'd all have a part to play
in signing up paying members - happy to take any leads from people on that
point :)

Hoots!

On Tuesday, September 15, 2015, Denis Zgonjanin <deniszgonjanin at gmail.com>
wrote:

> Yes, we should think of use cases. Realtime data is just one. I'm not just
> talking about things we might want to do. Here are the current things in
> CKAN that would benefit from better asynchronous support:
>
> - Datastore & Datapusher. We could integrate datapusher into CKAN, so
> people don't need to set up an additional web service just to use stock
> CKAN.
> - Harvesting. Set up a periodic callback that calls harvest sources every
> hour. Super easy when compared to having to set up Redis/ZeroMQ, and
> another 3(!) long-running processes running in the background.
> - Webhooks. They must be pushed off to a Celery queue because of Pylons.
> With async they could be fired off easily.
> - Analytics & analytics reports; Sending automated emails and other
> automated tasks.
> - Anything where right now we have to set up cron jobs.
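
The "periodic callback" idea in the harvesting bullet above could look
roughly like the following. This is only a minimal sketch using Python 3's
asyncio, not anything that exists in CKAN: `run_periodically`,
`harvest_all_sources` and the source names are hypothetical, and the
interval is shortened from one hour so the demo finishes at once.

```python
import asyncio

harvested = []  # records which (hypothetical) sources were visited


async def harvest_all_sources():
    # Stand-in for "call every harvest source"; a real job would do I/O here.
    for source in ("source-a", "source-b"):
        harvested.append(source)


async def run_periodically(interval_seconds, callback, max_runs=None):
    """Await `callback` every `interval_seconds`; stop after `max_runs` if set."""
    runs = 0
    while max_runs is None or runs < max_runs:
        await callback()
        runs += 1
        if max_runs is None or runs < max_runs:
            await asyncio.sleep(interval_seconds)
    return runs


def demo():
    # Three quick runs instead of one per hour.
    return asyncio.run(run_periodically(0.01, harvest_all_sources, max_runs=3))
```

In a long-running server the same coroutine would be scheduled on the
application's event loop with `max_runs=None`, which is what removes the
need for a separate queue and worker processes in this sketch.
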
>
> And probably most importantly - CKAN is going to need a face lift
> eventually if it's to remain relevant. It can't be stuck in CRUD land
> forever. There is plenty of time for this, no rush. But building cool
> shiny new things with fancy front-end javascript would be hard right now.
> It will be hard on any web framework built on the idea that your whole
> application context is transferred to the user on every HTTP request, and
> that nothing else except that is going on in the backend.
>
>
> On Mon, Sep 14, 2015 at 9:34 AM, Stéphane Guidoin <
> stephane.guidoin at gmail.com> wrote:
>
>> *Now that government is (slowly) catching on, more stream, API, and even
>> real-time data is being published. CKAN doesn't do a great job here. The
>> biggest obstacle to creating nice extensions to CKAN for non-file data is
>> that Pylons is still firmly stuck within the HTTP request-response
>> lifecycle. *
>>
>> I wonder what the role of CKAN should be when it comes to APIs, streams
>> and other things. These tend to be fairly resource intensive and, most
>> of the time, they are developed and hosted on their own, not on the open
>> data portal. So what should be the role of CKAN on this? How much do we
>> want to be able to integrate CKAN with APIs and streams, and what should
>> it give?
>>
>> From my point of view, moving to Flask or another framework is mostly a
>> question of technical debt (
>> https://18f.gsa.gov/2015/08/07/technical-debt-1/) and making sure CKAN
>> remains flexible (and built-in async would indeed help).
>>
>> When it comes to supporting realtime data, even if it's mainly to
>> enable extension development, some thinking about use cases is needed in
>> order to avoid jumping into something that would be very time intensive in
>> terms of dev.
>>
>> Stéphane
>>
>>
>>
>> On 2015-09-14 08:57, Denis Zgonjanin wrote:
>>
>> Right now CKAN is great for static sources of data, which is really all
>> that existed from government sources when CKAN was first written.
>>
>> Now that government is (slowly) catching on, more stream, API, and even
>> real-time data is being published. CKAN doesn't do a great job here. The
>> biggest obstacle to creating nice extensions to CKAN for non-file data is
>> that Pylons is still firmly stuck within the HTTP request-response
>> lifecycle.
>>
>> This worked well for CRUD apps, but now is really showing its
>> limitations. It's hard to do anything in CKAN that doesn't take place
>> within the context of a user's HTTP request. If you want to do some extra
>> data processing on the side, you have to use Celery queues or worse, cron.
>> Worse yet, some people do try to put extra processing inside the
>> request-response lifecycle, causing problems.
>>
>> Even core CKAN is guilty of this. For example, CKAN will call datapusher
>> to send upload jobs and retrieve job results, and those requests to
>> datapusher happen while the user is waiting for the request to return. This
>> is kind of terrible. Not even because somebody did it this way, but because
>> CKAN doesn't give you a sane alternative to do it properly.
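
One way to picture the alternative to calling datapusher while the user
waits: hand the slow call to a background worker and answer immediately.
The following is a rough sketch with the standard library only. The
function names, the "pending"/"complete" statuses and the simulated delay
are all assumptions for illustration, not CKAN or datapusher APIs.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# One shared pool for slow side jobs (hypothetical, not part of CKAN).
_background = ThreadPoolExecutor(max_workers=2)


def submit_to_datapusher(resource_id):
    """Stand-in for the slow HTTP round trip to the datapusher service."""
    time.sleep(0.5)  # pretend network and processing latency
    return {"resource_id": resource_id, "status": "complete"}


def handle_upload_request(resource_id):
    """Reply to the user at once; the slow call continues in the background."""
    future = _background.submit(submit_to_datapusher, resource_id)
    response = {"resource_id": resource_id, "status": "pending"}
    return response, future
```

The request handler returns the "pending" response without blocking, and
the future can be checked later (for example by a status endpoint) to see
whether the job has finished.
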
>>
>> Porting CKAN to Flask is no small feat, so let's make sure we do it
>> right. Now that we're not using CKAN to just host static files anymore, we
>> need to have better, built-in async support in CKAN. Perhaps this means
>> moving to Python 3 where we'll have asyncio (and hopefully a future version
>> of Flask will work well with it). Other frameworks, like Tornado, are also
>> quite lightweight and support this out of the box for Python 2.x.
>>
>> - Denis
>>
>>
>> On Mon, Sep 14, 2015 at 3:56 AM, Angelos Tzotsos <gcpp.kalxas at gmail.com>
>> wrote:
>>
>>> On 09/14/2015 10:24 AM, Ross Jones wrote:
>>>
>>>> Hi,
>>>>
>>>> I’ve recently been playing about with implementing parts of CKAN in
>>>> Flask side-by-side with the current Pylons implementation. I’m doing it
>>>> like this so that it isn’t immediately obvious that there’s a migration
>>>> happening towards using Flask (aka nothing breaks).  I don’t think this
>>>> branch should ever be merged; it’s more exploratory, but it has raised some
>>>> questions that I think it would be good to discuss.
>>>>
>>>> WARNING:anecdata
>>>> It’s pretty clear that the vast majority of people asked would like to
>>>> move to Flask as a replacement for some layers of the system (leaving
>>>> things like logic and plugins alone).
>>>> ENDWARNING
>>>>
>>>> We’ve discussed at the tech-team meetings, but I think a longer, more
>>>> accessible conversation would be beneficial.
>>>>
>>>> 1. What version of CKAN should be targeted? Common sense suggests 3.0,
>>>> but that being the case, exactly how far can we go in breaking some
>>>> backward compatibility?  This isn’t really a technical question - would be
>>>> good to hear what the community would accept …
>>>>
>>>> 2. Does it *really* need to be side-by-side?  Running Flask and Pylons
>>>> side-by-side means staying on Python 2 for another few years (because
>>>> Pylons).  A reasonably deep incision and removal of non-logic/non-plugin
>>>> code would make a move to Py3 easier, but with some level of breakage in
>>>> external plugins. Staying on 2 would mean a move to 3 at a later date and
>>>> more pain.
>>>>
>>>> 3. Would the CKAN Association like to fund someone to do some of this
>>>> work? This is just one of several ideas mentioned on
>>>> https://github.com/ckan/ideas-and-roadmap/issues/152 that really needs
>>>> to be done if CKAN is going to thrive instead of just survive.
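
For readers wondering what "side-by-side" in point 2 might mean in
practice: both Flask and Pylons applications are WSGI callables, so a thin
dispatcher can route already-migrated URLs to the new app and everything
else to the old one. Below is a minimal, standard-library-only sketch. The
two stand-in apps, the `SideBySide` class and the choice of `/api/3/` as
the first migrated prefix are assumptions for illustration, not how the
exploratory branch actually does it.

```python
def flask_like_app(environ, start_response):
    # Stand-in for the new Flask application (a WSGI callable).
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"served by new app"]


def pylons_like_app(environ, start_response):
    # Stand-in for the existing Pylons application (also a WSGI callable).
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"served by legacy app"]


class SideBySide:
    """Send migrated URL prefixes to the new app, everything else to the old."""

    def __init__(self, new_app, legacy_app, migrated_prefixes):
        self.new_app = new_app
        self.legacy_app = legacy_app
        self.migrated_prefixes = tuple(migrated_prefixes)

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")
        if path.startswith(self.migrated_prefixes):
            return self.new_app(environ, start_response)
        return self.legacy_app(environ, start_response)


application = SideBySide(flask_like_app, pylons_like_app, ["/api/3/"])


def call(path):
    """Invoke the combined WSGI app directly, without a server."""
    body = application({"PATH_INFO": path}, lambda status, headers: None)
    return b"".join(body)
```

As more routes move over, prefixes are added to the list until the legacy
app receives no traffic. The cost noted in point 2 still applies: while
the legacy app is in the process, the whole stack stays on Python 2.
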
>>>>
>>>> Any feedback welcome…
>>>>
>>>> Cheers
>>>>
>>>> Ross.
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> ckan-dev mailing list
>>>> ckan-dev at lists.okfn.org
>>>> https://lists.okfn.org/mailman/listinfo/ckan-dev
>>>> Unsubscribe: https://lists.okfn.org/mailman/options/ckan-dev
>>>>
>>>
>>> Hi Ross,
>>>
>>> I believe that a Flask port (or rewrite) is an excellent idea for CKAN
>>> 3.0 in order to support Python 3.x.
>>> The alternative would be to port Pylons to Python 3.x, which perhaps is
>>> a more difficult task...
>>>
>>> Given that Python 2.x will EOL relatively soon, CKAN should move forward.
>>>
>>> Just my 2 cents.
>>>
>>> Best,
>>> Angelos
>>>
>>> --
>>> Angelos Tzotsos, PhD
>>> OSGeo Charter Member
>>> http://users.ntua.gr/tzotsos
>>>
>>>
>>
>>
>>
>

-- 
*STEVEN DE COSTA* | *EXECUTIVE DIRECTOR*
www.linkdigital.com.au