[ckan-dev] Unsubscribe

David Portnoy portnoy.david+ckan at gmail.com
Fri Mar 20 17:48:02 UTC 2015


On Fri, Mar 20, 2015 at 8:26 AM, <ckan-dev-request at lists.okfn.org> wrote:

> Send ckan-dev mailing list submissions to
>         ckan-dev at lists.okfn.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
>         https://lists.okfn.org/mailman/listinfo/ckan-dev
> or, via email, send a message with subject or body 'help' to
>         ckan-dev-request at lists.okfn.org
>
> You can reach the person managing the list at
>         ckan-dev-owner at lists.okfn.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of ckan-dev digest..."
>
> Today's Topics:
>
>    1. CKAN/OpenSpending integration (Tryggvi Björgvinsson)
>    2. Re: datastore - large queries (Alice Heaton)
>    3. Re: CKAN/OpenSpending integration (Stéphane Guidoin)
>    4. Re: Previewing resources with Google Docs (Ross Jones)
>
>
> ---------- Forwarded message ----------
> From: "Tryggvi Björgvinsson" <tryggvi.bjorgvinsson at okfn.org>
> To: CKAN Development Discussions <ckan-dev at lists.okfn.org>
> Cc:
> Date: Fri, 20 Mar 2015 12:36:17 +0000
> Subject: [ckan-dev] CKAN/OpenSpending integration
> Hi all,
>
> I just posted a small blog post on the CKAN blog about two extensions
> that were made possible with the recent 2.3 release:
>
> http://ckan.org/2015/03/20/presenting-public-finance-just-got-easier/
>
> In the new release we now have the possibility to extend resources
> (files) in the same way we have been able to extend packages (datasets).
> This opens up the possibility to process uploaded files and do something
> with them.
>
> One of the extensions adds four fields to the resource page, from which
> you can create something called a budget data package (a
> standardised format for budget data publications)[1]. This makes
> integration between CKAN and other budget systems very easy.
>
> The second extension takes the standardised budget data package (if
> published) and posts that to OpenSpending via its budget data package API.
>
> So it's now really easy to go from a budget file to an OpenSpending
> visualisation, and the only interface needed for managing it is CKAN
> (you still need to set up the OpenSpending visualisation site but that's
> very easy with the OpenSpendingJS visualisation library).
>
> We've already tested this with the Mexico 2015 budget which I then
> visualised with a demo here:
> http://tryggvib.github.io/mexican-budget-data-package
>
> It's basically a really simple html site, code here:
>
> https://github.com/tryggvib/mexican-budget-data-package/blob/gh-pages/index.html
>
> Just thought I'd let you all know :)
>
> /Tryggvi
>
> [1] http://fiscal.dataprotocols.org/
>
>
>
>
> ---------- Forwarded message ----------
> From: Alice Heaton <a.heaton at nhm.ac.uk>
> To: ckan-dev at lists.okfn.org
> Cc:
> Date: Fri, 20 Mar 2015 12:52:42 +0000
> Subject: Re: [ckan-dev] datastore - large queries
>  Hi,
>
> On 09/03/15 11:30, Alex Gartner wrote:
>
> Hi,
>
> thank you for the response.
> I think that in the short term we'll follow your advice about limiting
> requests from nginx and merging the PR. If we decide to go beyond that for
> a longer-term solution, we'll let you know.
>
>
> Something else I just noticed going through our logs: you might want to
> block external access to /datastore/dump/<resource id> for large resources
> as this attempts to generate a CSV of the whole resource. On our 2.7M
> resource this would time out, but use a lot of memory in the process.
> Repeated access even below our request rate threshold would have been
> painful.
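
A minimal sketch of blocking external access to the dump URL, written as a WSGI middleware in Python. The class name, the allowed network list and the idea of doing this inside the application rather than in the web server are all assumptions for illustration; nothing here is part of CKAN. Behind a reverse proxy such as nginx, REMOTE_ADDR is usually the proxy's own address, so in that setup the rule would have to live in the proxy instead.

```python
import ipaddress


class BlockDumpMiddleware:
    """Hypothetical WSGI wrapper: refuse /datastore/dump/ to outside clients."""

    def __init__(self, app, allowed_networks=("127.0.0.0/8",),
                 prefix="/datastore/dump/"):
        self.app = app
        self.prefix = prefix
        # Networks that may still fetch dumps (an assumed, illustrative default).
        self.allowed = [ipaddress.ip_network(n) for n in allowed_networks]

    def _is_internal(self, environ):
        try:
            addr = ipaddress.ip_address(environ.get("REMOTE_ADDR", ""))
        except ValueError:
            # Missing or malformed address: treat the client as external.
            return False
        return any(addr in net for net in self.allowed)

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")
        if path.startswith(self.prefix) and not self._is_internal(environ):
            body = b"Datastore dump is not available externally.\n"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Everything else goes to the wrapped application untouched.
        return self.app(environ, start_response)
```

This only stops the request before it reaches the application; it does nothing about the memory cost of a dump requested from an allowed address.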
>
> Best Wishes,
> Alice
>
>  Thanks again,
> Alex
>
>
> On Fri, Mar 6, 2015 at 12:41 PM, Alice Heaton <a.heaton at nhm.ac.uk> wrote:
>
>>  Hello,
>>
>> There is no setting in CKAN (to my knowledge) to help with this. Things
>> we have done here:
>>
>> - Use nginx to limit the request rate (total and from a single IP) to the
>> datastore API;
>> - Ensure our servers can deal with as many requests as we allow (so worst
>> case the site is blocked, but the servers won't go down);
>> - Clear the response after each request. At the moment this doesn't
>> happen, so a worker keeps the last response in memory until it is used again.
>> With 20 workers returning very large responses, this can kill your memory
>> very quickly. You'll need to merge this PR to do this:
>> https://github.com/ckan/ckan/pull/2262
>>
>> I have been thinking of imposing a hard limit on the number of rows returned
>> per request. This could be implemented as a middleware which simply returns
>> a 400 error if the requested limit is higher than a configured number. This is
>> not a priority for us, but we might come round to doing it at some point.
>> If you implement something like this we will definitely help with testing!
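
The middleware idea above could be sketched roughly as follows, in Python as a plain WSGI wrapper. This is only an illustration under stated assumptions: the class name, the default cap of 10000 rows and the shape of the error payload are made up here, and it only inspects an explicit "limit" parameter on the datastore_search endpoint (query string or JSON body). It does not parse the SQL sent to datastore_search_sql, and requests without a "limit" are passed through unchanged.

```python
import io
import json
from urllib.parse import parse_qs


class DatastoreLimitMiddleware:
    """Hypothetical WSGI wrapper: 400 if 'limit' exceeds a configured cap."""

    def __init__(self, app, max_rows=10000,
                 path="/api/action/datastore_search"):
        self.app = app
        self.max_rows = max_rows  # assumed cap, pick your own
        self.path = path

    def _requested_limit(self, environ):
        # GET style: ?resource_id=...&limit=250000
        query = parse_qs(environ.get("QUERY_STRING", ""))
        if "limit" in query:
            return query["limit"][0]
        # POST style: JSON body. Read it, then put it back for the app.
        try:
            length = int(environ.get("CONTENT_LENGTH") or 0)
        except ValueError:
            length = 0
        if length:
            body = environ["wsgi.input"].read(length)
            environ["wsgi.input"] = io.BytesIO(body)
            try:
                data = json.loads(body.decode("utf-8"))
            except ValueError:
                return None  # not JSON, leave it to the application
            if isinstance(data, dict):
                return data.get("limit")
        return None

    def __call__(self, environ, start_response):
        if environ.get("PATH_INFO", "").rstrip("/") == self.path:
            limit = self._requested_limit(environ)
            try:
                too_big = limit is not None and int(limit) > self.max_rows
            except (TypeError, ValueError):
                too_big = False  # malformed values are validated downstream
            if too_big:
                payload = json.dumps({
                    "success": False,
                    "error": {"message": "limit may not exceed %d"
                                         % self.max_rows},
                }).encode("utf-8")
                start_response("400 Bad Request", [
                    ("Content-Type", "application/json"),
                    ("Content-Length", str(len(payload))),
                ])
                return [payload]
        return self.app(environ, start_response)
```

Rejecting the request outright, rather than silently lowering the limit, keeps the behaviour visible to API clients, which then have to page through results themselves.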
>>
>> Another approach currently under discussion is to have a
>> streaming/chunked API, so the response is only built as the client consumes
>> it. This is only at the discussion phase, but worth keeping an eye on:
>> https://github.com/ckan/ideas-and-roadmap/issues/128
>>
>> Alice
>>
>>
>> On 06/03/15 00:32, Alex Gartner wrote:
>>
>>   Hi everyone,
>>
>>  I have a question related to the datastore API being used by a user
>> with *bad intentions* to achieve a denial of service of some kind. Since
>> the project that I'm working on plans to have datastore tables with around
>> 1 million rows I'm thinking this might be used against the system.
>> To give a few examples:
>>
>>    - The following request to the "datastore_search_sql" endpoint takes
>>    around 2 minutes to complete on my laptop for a datastore_table with
>>    2 500 rows and limit 250 000. Without the limit I imagine it would take
>>    around 50 mins (there would be 2 500 x 2 500 rows in the response).
>>       - curl -G localhost:5000/api/action/datastore_search_sql \
>>         --data-urlencode "sql=SELECT a.* FROM datastore_table a, datastore_table b LIMIT 250000"
>>    - Accessing the "datastore_search" endpoint with a limit of 250 000
>>    also takes about 2 mins (for a table of around 500 000 rows).
>>       - curl "http://localhost/api/action/datastore_search?resource_id=resource_id&limit=250000"
>>
>>  I imagine that somebody hitting the datastore API with X simultaneous
>> requests for data with a limit of 1 million could block the server (while
>> using all the db connections).
>>
>>  Is there a way to set a hard limit that cannot be overridden by the
>> user for the number of results returned by a query to the datastore (to
>> force pagination, in a way)?
>>
>>  And in more general terms, what would be the best practice for avoiding
>> such issues? Are there some CKAN settings that help with this? Should we
>> set up the web server (nginx, Apache) to use a harsher limit on the number
>> of simultaneous HTTP requests to the datastore API endpoints?
>>
>>  Thanks for the help,
>> Alex
>>
>>
>>  _______________________________________________
>> ckan-dev mailing list
>> ckan-dev at lists.okfn.org
>> https://lists.okfn.org/mailman/listinfo/ckan-dev
>> Unsubscribe: https://lists.okfn.org/mailman/options/ckan-dev
>>
>
>
> ---------- Forwarded message ----------
> From: "Stéphane Guidoin" <stephane at opennorth.ca>
> To: CKAN Development Discussions <ckan-dev at lists.okfn.org>
> Cc:
> Date: Fri, 20 Mar 2015 09:05:55 -0400
> Subject: Re: [ckan-dev] CKAN/OpenSpending integration
> Yes, indeed, I have been in contact with a municipality that uses CKAN and
> would like an OpenSpending-like visualisation.
>
> Thanks Tryggvi!
>
> Stéphane
>
>
>
>
> --
> Stéphane Guidoin
> Director of product and service development
> Open North
> 514-862-0084
> http://opennorth.ca
> Twitter: @opennorth / @hoedic
>
>
> ---------- Forwarded message ----------
> From: Ross Jones <ross at servercode.co.uk>
> To: CKAN Development Discussions <ckan-dev at lists.okfn.org>
> Cc:
> Date: Fri, 20 Mar 2015 13:26:03 +0000
> Subject: Re: [ckan-dev] Previewing resources with Google Docs
> I've flipped it over to the office docs. The PDF is missing in action, but
> is the XLS at
> http://liverpool.servercode.co.uk/dataset/lcc-payments-of-invoices-to-vendors-over-500-april-2012/resource/a03edab7-ccbb-4af8-89e6-032716e56d0c any
> better?
>
> Thanks for your help in checking this - useful to know it isn't *just*
> working (or not) for me.
>
> Ross
>
>
> On 19 Mar 2015, at 20:45, Aaron McGlinchy <
> McGlinchyA at landcareresearch.co.nz> wrote:
>
> Hi, I tried the links below, the pdfs worked great, but the xls gave an
> error and did not load at all.
>
> That was using both Firefox and IE
>
> Cheers
> Aaron
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Wed, 18 Mar 2015 19:09:04 +0000
> From: Ross Jones <ross at servercode.co.uk>
> To: CKAN Development Discussions <ckan-dev at lists.okfn.org>
> Subject: [ckan-dev] Previewing resources with Google Docs
> Message-ID: <FF432145-4B23-420E-87D0-B5C545F5A475 at servercode.co.uk>
> Content-Type: text/plain; charset="us-ascii"
>
> Hi,
>
> After finally getting fed up with watching the JavaScript loading spinner
> when trying to view Excel files (and only seeing the first sheet when it
> did load) I decided to try using the Google Docs preview which you can
> embed into your pages (docs.google.com/preview?url=<URL>&embedded=true
> for reference).
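
For illustration, the embed URL pattern quoted above could be built like this in Python. The function name and the example resource URL are hypothetical, and this is not taken from ckanext-gdoc; it only shows that the resource URL has to be percent-encoded before it is placed in the "url" parameter.

```python
from urllib.parse import urlencode

# Base address as quoted in the message above.
PREVIEW_BASE = "http://docs.google.com/preview"


def preview_url(resource_url):
    """Return the embeddable preview URL for a publicly reachable file."""
    # urlencode percent-encodes the resource URL so that its own
    # '?', '&' and '/' characters do not break the outer query string.
    query = urlencode({"url": resource_url, "embedded": "true"})
    return PREVIEW_BASE + "?" + query


if __name__ == "__main__":
    # Hypothetical resource URL, for demonstration only.
    print(preview_url("http://example.com/data.xls"))
```

The previewer fetches the file itself, so this can only work for resources that are reachable from the public internet.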
>
> So I wrote https://github.com/rossjones/ckanext-gdoc which is an
> IResourceView that does just that - it will allow you to use the Google
> Docs previewer for .doc, .xls, .xlsx and .pdf files. Wrote is a strong
> word, there really isn't that much to it.
>
> I've deployed it on a test site I have been playing with today, and you
> can (hopefully) see the pdf previewing just fine at
> http://liverpool.servercode.co.uk/dataset/lcc-payments-of-invoices-to-vendors-over-500-april-2012/resource/3bdb30bd-afb1-4f68-a680-4bcb1bbf7174
>
> Unfortunately, *and totally defeating the point of the whole exercise*,
> Google Docs Previewer seems to be having a tough day with XLS files today -
> http://liverpool.servercode.co.uk/dataset/lcc-payments-of-invoices-to-vendors-over-500-april-2012/resource/a03edab7-ccbb-4af8-89e6-032716e56d0c
>
> If anyone could find ten minutes to give it a try and see if it works for
> them, I'd be eternally grateful.  Well, for a while anyway, maybe not
> eternally.  If this does eventually work (it has before so I assume Google
> is just having a bad day) then I'll add some config options to allow you to
> specify which formats it will cope with (i.e. it also supports PPT but I
> refuse to believe it is possible to call anything in a PPT data).
>
> Cheers
>
> Ross
>
> p.s. I was impressed on doing my first IResourceView just how easy it was
> to do - much easier than I imagined it would be. Give it a try ....
>
>