[ckan-dev] Bootup performance

Alex Corbi a.corbi at gmail.com
Tue Apr 7 17:02:59 UTC 2015


Hi Ian,

I would like to add to my previous email that we are running a customized clone of CKAN, with ONLY the following changes: https://github.com/OpenDevelopmentMekong/ckan/commits/master

Could you please review them and tell me if you see anything strange? Maybe this commit has something to do with the issue: https://github.com/OpenDevelopmentMekong/ckan/commit/1791fe2dc7a65cdd819862ebc2ef7a9309469ebe

Anyway, tomorrow we are going to deploy our code based on a vanilla CKAN 2.3 and see if the issue is still there.

Today we tried disabling auto_commit on Solr (as described at https://github.com/ckan/ckan/wiki/Performance-tips-for-large-imports#solr) and disabling our own theme (http://github.com/OpenDevelopmentMekong/ckanext-odm_theme), with the same result: extremely high CPU and I/O load between the CKAN and PostgreSQL containers on startup.

Thanks in advance for the support, as always!

-- 
Alex Corbi

On 7 April 2015 at 14:13:08, Alex Corbi (a.corbi at gmail.com) wrote:

Hi Ian,
> Web start up time should not depend on the number of datasets, and should be measured in seconds not minutes.
OK, this is key information, because the issue we are having definitely depends on the number of datasets stored.
> Are you running anything else during start up?  
Sometimes we have needed to restart the Solr container as well (running docker stop solr; docker stop ckan; docker start solr; docker start ckan). As mentioned, we have the components of the architecture (postgresql, solr, ckan) separated into different Docker containers. How do you feel in general about using Docker for deploying CKAN?
> Have you tried disabling plugins in your ini file? Have you made any changes to ckan?
Here is the list of plugins that we currently have enabled in the production.ini file:

ckan.plugins = stats text_preview recline_preview pdf_preview datastore datapusher resource_proxy multilingual_dataset multilingual_group multilingual_tag odm_theme pages googleanalytics geojson_preview wms_preview
odm_theme is the theme we developed ourselves for UI customization and some added logic; you can browse the code here: http://github.com/OpenDevelopmentMekong/ckanext-odm_theme Do you see anything weird in the implementation?
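
To narrow down whether a single plugin is responsible, one option is to restart CKAN once per plugin with that plugin left out. Below is a minimal sketch that prints the candidate ckan.plugins lines for the list above. It is a standalone helper, not part of CKAN; the function names parse_plugins and variants_without_one are made up for illustration, and each printed line would still have to be pasted into production.ini by hand before restarting.

```python
# Sketch only: builds "ckan.plugins = ..." lines with one plugin removed at a time.
# The helper names below are hypothetical and not part of CKAN.

PLUGINS_LINE = (
    "ckan.plugins = stats text_preview recline_preview pdf_preview datastore "
    "datapusher resource_proxy multilingual_dataset multilingual_group "
    "multilingual_tag odm_theme pages googleanalytics geojson_preview wms_preview"
)


def parse_plugins(line):
    """Return the plugin names from a 'ckan.plugins = a b c' ini line."""
    key, _, value = line.partition("=")
    if key.strip() != "ckan.plugins":
        raise ValueError("not a ckan.plugins line: %r" % line)
    return value.split()


def variants_without_one(plugins):
    """Yield (skipped_plugin, ini_line) pairs, each omitting exactly one plugin."""
    for skipped in plugins:
        remaining = [p for p in plugins if p != skipped]
        yield skipped, "ckan.plugins = " + " ".join(remaining)


if __name__ == "__main__":
    for skipped, ini_line in variants_without_one(parse_plugins(PLUGINS_LINE)):
        print("# without %s" % skipped)
        print(ini_line)
```

If the slow startup disappears for exactly one of the printed variants, the skipped plugin is the first place to look.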





Ian

-- 
Alex Corbi

On 7 April 2015 at 12:46:21, Alex Corbi (a.corbi at gmail.com) wrote:

Hi Steve,

Thanks for your answer. Our current Docker setup does indeed separate the different components into different containers (1x postgresql, 1x ckan, 1x solr).

In order to compare… could you please tell me a bit about your CKAN instance:
- How many datasets are currently hosted?
- What are the specs of the machines where CKAN runs?
- Approximately how long does it take for the CKAN instance to boot after a reset or shutdown?

-- 
Alex Corbi

On 7 April 2015 at 12:14:14, Alex Corbi (a.corbi at gmail.com) wrote:

Hi Everyone,

I have a performance question about how bootup times depend on the number of stored datasets.

The context is http://data.opendevelopmentmekong.net/, an instance based on CKAN v2.2.1 deployed through uwsgi on a server with 2 cores and 4 GB of memory (using Docker containers), currently with VERY low traffic and 13 datasets hosted. In this scenario, bootup times after a restart of the Docker container for CKAN are quick and do not present any issue.

However, the bootup time and the related issues increase considerably with the number of datasets. On the very same setup, but populated with ~2000 datasets, the CKAN instance takes up to 20 minutes to boot and sometimes shows erratic behaviour after rebooting (Internal Server Errors, random URLs and resources not being loaded).

So, here are my questions:
- Do the characteristics of the described system (number of datasets, traffic) match the "small to medium" instance type mentioned on https://github.com/ckan/ckan/wiki/Hardware-Requirements? Are 2 cores/4 GB of memory OK?

- During the bootup process, activity on the CKAN and PostgreSQL side can be detected. Both components take a big percentage of the CPU during the bootup (~20 minutes). What is supposed to be happening behind the scenes? Is Solr reindexing every time CKAN restarts?

- Is there anything that can be done to reduce the boot time of a restarted CKAN instance? (DB/Solr/CKAN configuration)
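
To put comparable numbers on the bootup time with 13 versus ~2000 datasets, the restart could be timed by polling the instance until it answers. The following is a rough sketch in Python 3, assuming the instance exposes the CKAN action API endpoint /api/3/action/status_show; the URL in the usage comment, the helper names is_up and wait_until_up, and the interval and timeout defaults are all assumptions for illustration.

```python
# Sketch only: time how long a restarted CKAN instance takes to answer HTTP requests.
# is_up and wait_until_up are hypothetical helper names, not CKAN functions.
import time
import urllib.error
import urllib.request


def is_up(url, timeout=5):
    """Return True if a GET on url answers with HTTP 200 within timeout seconds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, 5xx while booting, etc.
        return False


def wait_until_up(probe, interval=5.0, max_wait=1800.0,
                  clock=time.monotonic, sleep=time.sleep):
    """Call probe() every interval seconds until it returns True.

    Returns the elapsed seconds, or None if max_wait passes first.
    clock and sleep are injectable so the loop can be exercised without waiting.
    """
    start = clock()
    while True:
        if probe():
            return clock() - start
        if clock() - start >= max_wait:
            return None
        sleep(interval)


# Example usage right after "docker start ckan" (the address is an assumption):
#   url = "http://localhost:5000/api/3/action/status_show"
#   print(wait_until_up(lambda: is_up(url)))
```

The measured time is only accurate to within one polling interval, which is enough to tell seconds from minutes.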

Thanks in advance, any help is appreciated,

-- 
Alex Corbi