[ckan-dev] Performance CKAN / Configuration Parameters

Ian Ward ian at excess.org
Tue Aug 19 11:42:55 UTC 2014


Hello Lothar,

How many datasets does your test instance have? Are you using many
extra fields? tag vocabularies? Which pages are you hitting?

There are lots of places where CKAN can get slow but that are easy to
fix or work around. Most instances don't push CKAN very hard, so we
tend to fix these problems only as they come up. If you're hitting some
of those problem areas, then tweaking server settings isn't going to
help much.

Would you help us to reproduce your test environment?

On Fri, Aug 15, 2014 at 12:47 PM, hotz <hotz at informatik.uni-hamburg.de> wrote:
> Hi Alice,
>
> thank you very much for your reply!
> I'm still checking out things and will later answer in more detail.
>
> For now:
> - we increased the maximum number of Postgres connections from the default
> 100 to 120 on a 2 GB development machine, and to 1000 connections (with
> 24 GB of shared memory) on a 32 GB production machine. This of course
> allowed more concurrent requests.
> (When there are too many connections there is a log entry in CKAN and in
> the DB log, which helped with tuning.)
>
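> (A rough sketch of the corresponding postgresql.conf entries, for
> illustration only; the actual values have to match the machine's RAM:)
>
>   # postgresql.conf
>   max_connections = 120       # default is 100; 1000 on the production machine
>   shared_buffers = 512MB      # see the 25%-of-RAM rule mentioned below
>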
> - we moved from apache/prefork to apache/worker according to
> [http://blog.dscpl.com.au/2009/03/load-spikes-and-excessive-memory-usage.html].
> However, I'm still not sure if this is needed because
> mod_wsgi runs in daemon mode in our case.
>
> - with this configuration (an 8-core machine with 24 GB RAM running the
> Apache server with mod_wsgi, no datastore, no varnish; Postgres is on a
> different machine) and the following scenario, we currently get the data
> shown below [1]:
>
> Scenario: start with 128 concurrent users; each clicks on three pages
> (start, query, detail of one dataset), waits 5 seconds, and starts again.
> This runs for 5 minutes. "Users" are at two locations (Hamburg and Bremen).
> --> Average page load time is between 35s and 75s. :-(
> --> but no failure. :-)
>
> With 16 concurrent users, we got page loads way below 10s.
> With more than 128 users we got failures, which still have to be analyzed.
>
> Next steps: add varnish, and tune the DB, Apache and WSGI according to your
> suggestions.
>
> Perhaps also use uwsgi instead of mod_wsgi (if CKAN allows it..).
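>
> (CKAN is a plain WSGI application, so uwsgi should work in principle; a
> hypothetical invocation, assuming the standard /etc/ckan/default/apache.wsgi
> entry point, might look like:)
>
>   uwsgi --http :8080 --wsgi-file /etc/ckan/default/apache.wsgi \
>         --processes 4 --threads 25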
>
> That's for now, best wishes and thanx again!
> Lothar
>
> [1]
>
> ===================================
> 5 minute run.
> 128 users starting (two locations, HH and HB).
> Three pages: start, query, show detail.
> Two pages (start and query) are reported below.
> "Test" means one run through the scenario.
>
>                HH      HB
> Users          64      64
> Req/sec        37.9    31.3
> #Tests        119      83
> Page Start    38.9s    47.6s   (average)
> Page Query    57.1s    75.8s   (average)
> Failures        0       0
>
> ===================================
>
>
> Apache configuration:
>
> KeepAlive OFF
>
> <IfModule mpm_worker_module>
>     StartServers          50
>     MinSpareThreads      25
>     MaxSpareThreads      2500
>     ThreadLimit          2500
>     ThreadsPerChild      50
>     ServerLimit         1300
>     MaxClients          1300
>     MaxRequestsPerChild  0
> </IfModule>
>
> WSGI:
> processes=2 threads=120
>
> ===> This might be the reason why we get failures when 256 users
> have to be served.
>
> 1300 MaxClients/ 50 ThreadsPerChild = 26 processes
>
> ps -dealf | grep apache | grep www_data | wc
> --> 27 Apache processes (1 wait)
>
> ps -dealf | grep ckan_default | grep www_data | wc
> --> 2 ckan_default processes
>
> ===> This has to be aligned. It looks like 26 Apache processes send
> requests to only 2 ckan_default processes. Main question: what is the
> relation between the Apache worker processes/threads and the mod_wsgi
> daemon processes?
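>
> (A hypothetical adjustment, just to illustrate the idea: more daemon
> processes with fewer threads each, so the daemon side is closer to what
> the 26 Apache processes can hand over:)
>
>   WSGIDaemonProcess ckan_default display-name=ckan_default processes=8 threads=25
>   WSGIProcessGroup ckan_default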
>
>
>
> On 11.08.2014 12:08, Alice Heaton wrote:
>
>> Hello,
>>
>> You may already be aware of these things, but just to throw in some ideas:
>>
>> - A single process will only run on a single CPU, so setting
>> 'processes=2' means your CKAN application will only run on 2 CPUs.
>>   This might be what you want (to reserve the other CPUs for
>> Postgres/Jetty etc.), but it is good to keep in mind;
>>
>> - I don't think ServerLimit affects mod_wsgi daemon mode - though I may be
>> wrong about this;
>>
>> - With processes=2 and threads=30, you will serve at most 2*30 concurrent
>> requests;
>>
>> - It's important to remember that all requests, including those for static
>> files, go through mod_wsgi.
>>   If browsers are firing, say, 8 concurrent requests then that leaves you
>> with 2*30/8 concurrent clients
>>   (this is very approximate, it will depend on the time taken for each
>> request, client caching, etc. however
>>   it's a good way to get an idea of what is happening). The best way to
>> deal with this is to add a caching
>>   server in front (say nginx or varnish) to ensure static files are
>> cached;
>>
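>> (A minimal sketch of one such option - nginx on port 80 proxying to
>> Apache/mod_wsgi on 8080 and caching responses; paths and times are
>> illustrative only:)
>>
>>   proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=ckan:10m max_size=1g;
>>   server {
>>       listen 80;
>>       location / {
>>           proxy_pass http://127.0.0.1:8080;
>>           proxy_cache ckan;
>>           proxy_cache_valid 200 5m;
>>           # note: nginx does not cache responses that set cookies by default
>>       }
>>   }
>>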
>> - You don't mention PostgreSQL settings, and whether you use the datastore
>> (and if so with how many rows).
>>    On our setup (with the datastore and tables with over 3,000,000 rows),
>> PostgreSQL is the slow point.
>>
>>    The default PostgreSQL settings are very conservative. The first thing
>> to do there is to increase shared_buffers -
>>    the recommended value is about 25% of available memory. The next one to
>> set is effective_cache_size, which
>>    should be roughly shared_buffers plus the memory the OS uses for disk
>> caching.
>>
>>    What will make a real difference for a large database is to set
>> work_mem. This has to be tuned carefully, as you are
>>    setting the memory available for each operation in a query - so a query
>> with 12 joins could use up to 12*work_mem.
>>    If you set this too low, then your sorts/joins will happen on disc -
>> which can be very slow. If you set it too high, you might
>>    run out of memory!
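>>
>>    (For illustration, assuming a dedicated database server with 24 GB of
>>    RAM - the exact numbers of course depend on the workload:)
>>
>>      # postgresql.conf
>>      shared_buffers = 6GB           # ~25% of RAM
>>      effective_cache_size = 18GB    # shared_buffers + OS disk cache
>>      work_mem = 32MB                # per sort/hash operation within a query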
>>
>>    The best way to work this out is to enable slow query logging, and look
>> for the slow queries. explain analyze will tell you
>>    how much memory they need, and whether the operations happen on disc or
>> in memory. Increase work_mem to make them
>>    happen in memory (if possible).
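>>
>>    (Roughly, and as an illustration only: slow queries are logged via
>>    postgresql.conf, and EXPLAIN ANALYZE then shows whether a sort spilled
>>    to disc:)
>>
>>      # postgresql.conf - log every statement slower than 500 ms
>>      log_min_duration_statement = 500
>>
>>      -- in psql, on a query taken from the slow query log
>>      EXPLAIN (ANALYZE, BUFFERS) SELECT ... ;
>>      -- "Sort Method: external merge  Disk: ..." means the sort did not
>>      -- fit in work_mem and went to disc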
>>
>> - In postgres, you should also check the number of allowed connections.
>> Depending on your settings/plugins, CKAN may make more
>>   than one connection per request. With 2*30 workers, if each worker makes
>> more than one connection then you will run out.
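>>
>>   (A quick way to watch this, for illustration - compare the configured
>>   limit with what is actually in use while the load test runs:)
>>
>>     -- in psql
>>     SHOW max_connections;
>>     SELECT count(*) FROM pg_stat_activity;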
>>
>> I'm interested in hearing about anything else you find that affects
>> performance, so please let us know !
>>
>> Best Wishes,
>> Alice Heaton
>>
>> On 06/08/14 18:54, hotz wrote:
>>>
>>> Hi all,
>>>
>>> we are doing performance tests with the following setup:
>>> - CKAN 2.1.2
>>> - Search queries via the default ckan-portal and a web-portal
>>> - Ramp tests of 50,100,200...600...1000,...5000 users per second (!)
>>>   (we expect such numbers in the early on-line phase of our portal)
>>> - 24 GB RAM, 8 CPUs
>>>
>>> We use the following parameter configurations:
>>> 1) apache2.conf
>>>  ServerLimit 300
>>>  MaxClients 300
>>>  for all occurrences
>>>
>>> 2) Jetty Java_Options:
>>> -Xms512M -Xmx4g
>>>
>>> 3) virtual host ckan_default:
>>> WSGIDaemonProcess ckan_default display-name=ckan_default processes=2
>>> threads=30
>>>
>>> We get mean response times of about 30 seconds for each of 600 concurrent
>>> users per second, and several errors - which altogether does not feel
>>> good.
>>>
>>> The CPUs are about 50% busy, and only about 5 GB of the 24 GB RAM is used.
>>> The ckan-portal and the web-portal show the same results.
>>>
>>>
>>> Can somebody explain the above parameters, their optimal settings, and
>>> their influence on each other?
>>> E.g. do Apache workers correspond to threads? Are there multiple Jetty
>>> processes, or only one, when CKAN is running?
>>> Does anybody have experience in this area, or hints to further
>>> information?
>>>
>>> Best wishes,
>>> Lothar
>>>
>>>
>>>
>>
>> _______________________________________________
>> ckan-dev mailing list
>> ckan-dev at lists.okfn.org
>> https://lists.okfn.org/mailman/listinfo/ckan-dev
>> Unsubscribe: https://lists.okfn.org/mailman/options/ckan-dev
>
> _______________________________________________
> ckan-dev mailing list
> ckan-dev at lists.okfn.org
> https://lists.okfn.org/mailman/listinfo/ckan-dev
> Unsubscribe: https://lists.okfn.org/mailman/options/ckan-dev


