[ckan-dev] Bootup performance

Steven De Costa steven.decosta at linkdigital.com.au
Tue Apr 7 10:38:13 UTC 2015


Hi Alex,

I don't have the direct experience to answer this question in technical
depth, but I can share some info that might help :)

When we build a web application environment we try to separate different
application roles by putting them on different infrastructure instances. We
do this for CKAN too.

So, using AWS as the example, we'd have one instance for the DB, one for
Solr and one or more for the Python web application (CKAN). For persistent
storage of web application files we'd have an EBS volume mounted to the DB
instance.

If you are using Docker, you could build the same segregated server roles
on one 'instance', albeit with some overhead. You'd then be able to tune
the server startup and the applications/DB separately. For example, you
could script the startup of the DB before the CKAN role to ensure it is up
and ready.
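
As a rough illustration of that kind of startup ordering, here is a minimal
Python sketch that blocks until the DB and Solr accept TCP connections
before the CKAN role is started. The function names, host names and ports
are assumptions made for the example, not part of CKAN or of any existing
setup. An open port only shows that the service is listening, not that it
is fully ready, so treat this as a first gate rather than a complete
health check.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused, unreachable or timed out: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


def wait_for_services(services, timeout=60.0, interval=1.0):
    """Check each (name, host, port) in order; return names that never came up."""
    return [
        name
        for name, host, port in services
        if not wait_for_port(host, port, timeout, interval)
    ]
```

A startup script could call wait_for_services([("postgres", "db", 5432),
("solr", "solr", 8983)]) and only launch uwsgi when the returned list is
empty. The host names "db" and "solr" are hypothetical container names.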

All this also allows you to closely monitor performance as different things
happen within the environment.

Having said all that, someone on the list might have a specific response
for you that means you can get the boot up performance you're after with
the setup you are already working with :)

Cheers,
Steven

On Tuesday, April 7, 2015, Alex Corbi <a.corbi at gmail.com> wrote:

> Hi Everyone,
>
> I have a performance question about how bootup times depend on the
> number of stored datasets.
>
> The context is http://data.opendevelopmentmekong.net/, an instance based
> on CKAN v2.2.1 deployed through uwsgi on a server with 2 cores and 4GB of
> memory (using docker containers), currently with VERY low traffic and 13
> datasets hosted. In this scenario, bootup times after a restart of the
> docker container for CKAN are quick and do not present any issue.
>
> However, the bootup time and the resulting issues increase considerably
> with the number of datasets. On the very same setup, but populated with
> ~2000 datasets, the CKAN instance takes up to 20 minutes to boot and
> sometimes shows erratic behaviour after rebooting (Internal Server
> Errors, random URLs and resources not being loaded).
>
> So, here are my questions:
> - Do the characteristics of the described system (number of datasets,
> traffic) comply with the "small to medium" instance type mentioned on
> https://github.com/ckan/ckan/wiki/Hardware-Requirements? Are 2 core/4GB
> mem ok?
>
> - During the bootup process, activity on the CKAN and PostgreSQL side can
> be detected. Both components take a big percentage of the CPU during the
> bootup (~20 minutes). What is supposed to be happening behind the scenes?
> Is Solr reindexing every time CKAN restarts?
>
> - Is there anything that can be done to reduce the boot time of a
> restarted CKAN instance? (DB/Solr/CKAN configuration)
>
> Thanks in advance, any help is appreciated,
>
> --
> Alex Corbi
>


-- 
STEVEN DE COSTA | EXECUTIVE DIRECTOR
www.linkdigital.com.au