[ckan-dev] Bootup performance

Alex Corbi a.corbi at gmail.com
Thu Apr 9 10:39:42 UTC 2015


Hi Denis,

Thanks for your message.

We have been running lots of tests and are in the process of deploying an instance based on version 2.3, which does not show the problem I am describing. The repo is here: https://github.com/OpenDevelopmentMekong/ckan-2.3

My suspicion is that this change https://github.com/OpenDevelopmentMekong/ckan/commit/1791fe2dc7a65cdd819862ebc2ef7a9309469ebe could be the one making CKAN take so long at bootup. The other changes we made to the vanilla CKAN 2.2.1 (see the commit list) are less risky.

Does anyone have experience with the consequences of re-activating legacy templates that way in order to re-enable the ckan-admin/trash feature on 2.2.1?
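
To compare bootup times between the 2.2.1 fork and the 2.3 instance in a repeatable way, one option is to poll the site after restarting the container and record how long it takes to answer. This is only a minimal sketch: the wait_until_up helper, its parameters and the polling interval are made up for illustration and are not part of CKAN or of our setup.

```python
import http.client
import time
import urllib.error
import urllib.request


def _http_status(url):
    """Return the HTTP status code for url, or None if the site is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (e.g. 500 or 502).
        return exc.code
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, DNS failure, timeout, broken response...
        return None


def wait_until_up(url, timeout=1800.0, interval=5.0,
                  fetch=_http_status, clock=time.monotonic, sleep=time.sleep):
    """Poll url until it answers with HTTP 200.

    Returns the elapsed time in seconds, or None if the timeout is reached.
    fetch, clock and sleep are injectable so the loop can be tested offline.
    """
    start = clock()
    while clock() - start < timeout:
        if fetch(url) == 200:
            return clock() - start
        sleep(interval)
    return None


# Hypothetical usage, right after "docker start ckan":
#   elapsed = wait_until_up("http://data.opendevelopmentmekong.net/")
#   print("CKAN answered after", elapsed, "seconds")
```

Running this once against each version, with the same number of datasets loaded, would give comparable numbers instead of impressions.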

Best,

-- 
Alex Corbi

On 9 April 2015 at 12:24:35, ckan-dev-request at lists.okfn.org (ckan-dev-request at lists.okfn.org) wrote:


Message: 1  
Date: Wed, 8 Apr 2015 10:52:26 -0400  
From: Denis Zgonjanin <deniszgonjanin at gmail.com>  
To: CKAN Development Discussions <ckan-dev at lists.okfn.org>  
Subject: Re: [ckan-dev] Bootup performance  
Message-ID:  
<CAGSzoiOvEaJ6D7xH1akejgzyf0pVzXK1k2fo7n-H2b-tfDUucg at mail.gmail.com>  
Content-Type: text/plain; charset="utf-8"  

Hi Alex,  

I don't see anything too funny in your fork of CKAN, and it runs fine for  
me, though I didn't quite load a thousand datasets into it.  

I suspect it may be an orchestration problem. Can you describe a bit about  
how you are using Docker to run and deploy this? Specifically, Dockerfiles  
and any associated scripts that run on startup would help.  
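
As an illustration of the kind of startup script that can matter here: one common orchestration issue is the application container starting before its dependencies accept connections. Below is a minimal sketch of a wait step, assuming hypothetical container hostnames "db" and "solr" and the default ports 5432 and 8983. The helper names are invented for this example and are not taken from any CKAN or Docker tooling.

```python
import socket
import time


def wait_for_port(host, port, timeout=120.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not reachable yet (refused, timed out, name not resolvable): retry.
            time.sleep(interval)
    return False


def wait_for_all(services, timeout=120.0):
    """services maps a name to a (host, port) pair.

    Returns the list of service names that never became reachable.
    """
    return [name for name, (host, port) in services.items()
            if not wait_for_port(host, port, timeout=timeout)]


# Hypothetical usage in a container entrypoint, before launching uwsgi:
#   missing = wait_for_all({"postgresql": ("db", 5432), "solr": ("solr", 8983)})
#   if missing:
#       raise SystemExit("dependencies not reachable: " + ", ".join(missing))
```

A reachable port only shows that the service is listening, not that it is fully ready, so this is a first check rather than a guarantee.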

- Denis  



On Tue, Apr 7, 2015 at 1:02 PM, Alex Corbi <a.corbi at gmail.com> wrote:  

> Hi Ian,  
>  
> I would like to add to my previous email that we are running a customized
> clone of CKAN, with ONLY the following changes:
> https://github.com/OpenDevelopmentMekong/ckan/commits/master  
>  
> Could you please review them and tell me if you see something strange?  
> Maybe this here has something to do with the Issue:  
> https://github.com/OpenDevelopmentMekong/ckan/commit/1791fe2dc7a65cdd819862ebc2ef7a9309469ebe  
>  
> Anyway, tomorrow we are going to deploy our code based on a vanilla CKAN
> 2.3 and see if the issue is still there.
>  
> Today, we have tried disabling auto_commit on Solr (as specified on
> https://github.com/ckan/ckan/wiki/Performance-tips-for-large-imports#solr)
> and disabling our own theme (
> http://github.com/OpenDevelopmentMekong/ckanext-odm_theme), with the same
> result: extremely high CPU and I/O load between the CKAN and Postgresql
> containers on startup.
>  
> Thanks in advance for the support, as always!  
>  
> --  
> Alex Corbi  
>  
> On 7 April 2015 at 14:13:08, Alex Corbi (a.corbi at gmail.com) wrote:
>  
> Hi Ian,  
>  
> > Web start up time should not depend on the number of datasets, and should be measured in seconds not minutes.
>  
> OK, this is key information, because the issue we are having definitely depends on the number of datasets stored.
>  
> > Are you running anything else during start up?  
>  
> Sometimes we have also needed to restart the Solr container (running docker stop solr; docker stop ckan; docker start solr; docker start ckan;). As mentioned, we have the components of the architecture (postgresql, solr, ckan) separated into different Docker containers. What do you think in general about using Docker for deploying CKAN?
>  
> > Have you tried disabling plugins in your ini file? Have you made any changes to ckan?
>  
> Here is the list of plugins that we currently have enabled on the production.ini file:  
>  
>  
> ckan.plugins = stats text_preview recline_preview pdf_preview datastore datapusher resource_proxy multilingual_dataset multilingual_group multilingual_tag odm_theme pages googleanalytics geojson_preview wms_preview  
>  
> odm_theme is the theme we developed ourselves for UI customization and some added logic. You can browse the code here: http://github.com/OpenDevelopmentMekong/ckanext-odm_theme Do you see anything weird in the implementation?
>  
>  
>  
> Ian  
>  
>  
> --  
> Alex Corbi  
>  
> On 7 April 2015 at 12:46:21, Alex Corbi (a.corbi at gmail.com) wrote:
>  
> Hi Steve,  
>  
> Thanks for your answer. Our current docker setup does indeed separate the
> different instances into different containers (1x postgresql, 1x ckan, 1x
> solr).
>  
> In order to compare, could you please tell me a bit about your CKAN
> instance:  
> - How many datasets are currently hosted?  
> - What are the specs of the machines where CKAN runs?  
> - Approximately how long does it take the CKAN instance to boot after a
> reset or shutdown?
>  
> --  
> Alex Corbi  
>  
> On 7 April 2015 at 12:14:14, Alex Corbi (a.corbi at gmail.com) wrote:
>  
> Hi Everyone,  
>  
> I have a performance question about how bootup time relates to the number
> of stored datasets.
>  
> Our instance, http://data.opendevelopmentmekong.net/, is based on CKAN
> v2.2.1 and deployed through uwsgi on a server with 2 cores and 4GB of
> memory (using docker containers). It currently has VERY low traffic and
> 13 datasets hosted. In this scenario, bootup times after restarting the
> docker container for CKAN are quick and do not present any issue.
>  
> However, the bootup time and the related issues increase considerably with
> the number of datasets. On the very same setup, but populated with
> ~2000 datasets, the CKAN instance takes up to 20 minutes to boot and
> sometimes behaves erratically after rebooting (Internal Server
> Errors, random URLs and resources not being loaded).
>  
> So, here are my questions:
> - Do the characteristics of the described system (number of datasets,  
> traffic) comply with the "small to medium" instance type mentioned on  
> https://github.com/ckan/ckan/wiki/Hardware-Requirements? Are 2 core/4GB  
> mem ok?  
>  
> - During the bootup process, activity can be detected on both the CKAN and
> Postgresql side. Both components take a big percentage of the CPU during
> the bootup (~20 minutes). What is supposed to be happening behind the
> scenes? Is Solr reindexing every time CKAN restarts?
>  
> - Is there any possible action to be done in order to reduce the booting  
> time of a restarted CKAN instance? (DB/Solr/ckan conf.)  
>  
> Thanks in advance, any help is appreciated,  
>  
> --  
> Alex Corbi  
>  
> _______________________________________________  
> ckan-dev mailing list  
> ckan-dev at lists.okfn.org  
> https://lists.okfn.org/mailman/listinfo/ckan-dev  
> Unsubscribe: https://lists.okfn.org/mailman/options/ckan-dev  
>  
>  

------------------------------  

Message: 2  
Date: Wed, 8 Apr 2015 14:23:03 -0400  
From: Stéphane Guidoin <stephane at opennorth.ca>
To: CKAN Development Discussions <ckan-dev at lists.okfn.org>  
Subject: Re: [ckan-dev] Spatial, Harvest and DCAT extensions up to  
speed with CKAN 2.3  
Message-ID:  
<CALoW1wDUx7ptX8QPidWHOyp0dEe1VkXoNxf-tsc64h3dPHh=HQ at mail.gmail.com>  
Content-Type: text/plain; charset="utf-8"  

Very good news! Thank you Adria.  

In a previous mail, you said that some budget was planned to work on
ckanext-dcat, where, among other things, the generation of the dcat is an
important missing piece. Do you know how high it is on your priority list?

Stéphane


On Wed, Apr 8, 2015 at 5:41 AM, Adrià Mercader <adria.mercader at okfn.org>
wrote:  

> Hi all,  
>  
> Just a quick note to let people know that ckanext-spatial,  
> ckanext-harvest and ckanext-dcat have been updated to run with CKAN  
> 2.3 and above.  
>  
> In the ckanext-spatial case, changes in the SQLAlchemy version on CKAN  
> core meant that we had to upgrade the GeoAlchemy requirement to  
> GeoAlchemy2. New deployments of ckanext-spatial will take care of  
> this, but if you are upgrading an existing install you will need to  
> install it manually (it is a separate package):  
>  
> pip install GeoAlchemy2
>  
> For more details check the Troubleshooting section of the docs:  
>  
>  
> http://docs.ckan.org/projects/ckanext-spatial/en/latest/install.html#when-upgrading-the-extension-to-a-newer-version  
>  
> These changes also mean that PostGIS 2.x is fully supported. The
> install instructions have been updated to reflect this as well.
>  
> Thanks to Sol Lee for providing a patch to upgrade the GeoJSON preview.  
>  
> Let us know if you find any issue.  
>  
> Adrià

------------------------------  

Message: 3  
Date: Thu, 9 Apr 2015 11:24:29 +0100  
From: Adrià Mercader <adria.mercader at okfn.org>
To: CKAN Development Discussions <ckan-dev at lists.okfn.org>  
Subject: Re: [ckan-dev] Spatial, Harvest and DCAT extensions up to  
speed with CKAN 2.3  
Message-ID:  
<CAGJR8iJzWas4H3OBQjxFyW7bH91CXNHG+DZ6MHp4qk0-KvOp2w at mail.gmail.com>  
Content-Type: text/plain; charset="utf-8"  

Hi Stéphane,

On 8 April 2015 at 19:23, Stéphane Guidoin <stephane at opennorth.ca> wrote:

> In a previous mail, you said that some budget was planned to work on
> ckanext-dcat, where, among other things, the generation of the dcat is an
> important missing piece. Do you know how high it is on your priority list?


If I'm not mistaken, work on this should start in a couple of weeks. It is
part of a wider set of requirements for a project, but I'm pretty sure DCAT
generation should be part of it, at least starting to spec out how it'd
work.


Adrià

------------------------------  



------------------------------  

End of ckan-dev Digest, Vol 54, Issue 13  
****************************************  


More information about the ckan-dev mailing list