[ckan-dev] CKAN performance

David Raznick david.raznick at okfn.org
Mon Feb 25 20:25:49 UTC 2013


Hello David

I am not free tomorrow, as I have family things on.

However, I can give you a few things you could look at. I have CCed ckan-dev
for interest's sake.

*Uploading performance*

A big factor in this is that we do a Solr commit for every single dataset. For
some instances we have stopped this and just do a commit every 30 seconds
or so. In Solr 4 there is a config option for this, so you do not need an
external cron to do it (and there is a soft commit option, which can be done
about every second).
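
As a rough illustration of the commit-every-30-seconds idea (not CKAN's
actual indexing code), the sketch below sends each document without
committing and only issues a commit once an interval has elapsed. The class
name ThrottledCommitter, the add_doc and commit callables, and the injectable
clock are all hypothetical. Where Solr's own auto-commit config is available,
it would do this job on the server side instead.

```python
import time


class ThrottledCommitter:
    """Index documents one by one but commit at most once per interval.

    Hypothetical helper: `add_doc` and `commit` stand in for whatever
    client calls send a document to Solr and issue a commit.
    """

    def __init__(self, add_doc, commit, interval=30.0, clock=time.monotonic):
        self._add_doc = add_doc
        self._commit = commit
        self._interval = interval
        self._clock = clock
        self._last_commit = clock()
        self._pending = 0

    def index(self, doc):
        # Send the document without committing it.
        self._add_doc(doc)
        self._pending += 1
        # Commit only if the interval has elapsed since the last commit.
        if self._clock() - self._last_commit >= self._interval:
            self.flush()

    def flush(self):
        # Commit whatever is pending, then restart the interval.
        if self._pending:
            self._commit()
            self._pending = 0
        self._last_commit = self._clock()
```

A bulk load would call index() for every dataset and flush() once at the
end, so the last few documents are not left uncommitted.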

We were uploading a lot of datasets (500k) for geodatagov and it got very slow
after a while. The following helped a lot.

Stopping some DB constraints:
https://github.com/okfn/ckanext-geodatagov/blob/master/constraints.sql

and changing the following indexes:
https://github.com/okfn/ckanext-geodatagov/blob/master/what_to_alter.sql

These are not in CKAN master yet, but we will be adding some soon.

*Read performance*

As we store the whole package_dict in the search index, it is best to use
that where you can, as it makes things a lot faster. In 2.0, for tag_show and
group_show, instead of package_dictizing every associated dataset we just
do a search query.
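
To make the idea concrete, here is a minimal sketch of reading package dicts
straight out of search results instead of rebuilding them from the database.
It assumes each result document carries the whole package dict as a JSON
string. The function name and the "data_dict" field name are assumptions for
illustration, so check them against your own schema.

```python
import json


def package_dicts_from_search(docs, field="data_dict"):
    """Return package dicts decoded from search result documents.

    Assumes each document stores the whole package dict as a JSON
    string under `field`; the field name here is an assumption.
    Documents without that field are skipped.
    """
    return [json.loads(doc[field]) for doc in docs if field in doc]
```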

Genshi is much slower than Jinja. The old auth model queries are slower too.

When searching, even if you are getting stuff out of the search index, we are
still checking the existence of each dataset. This could be sped up by
doing a bulk check of all packages returned, or by just trusting the search
index to be in sync.
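
The bulk check could look something like the sketch below: one IN query
for all returned ids rather than one lookup per dataset. It uses an in-memory
SQLite database purely for illustration. The function name and the "package"
table with an "id" column are assumptions here, not CKAN's actual code.

```python
import sqlite3


def existing_package_ids(conn, package_ids):
    """Return the subset of `package_ids` present in the `package` table.

    One IN query replaces one lookup per search result. The table and
    column names are assumptions for this sketch.
    """
    ids = list(package_ids)
    if not ids:
        return set()
    placeholders = ",".join("?" for _ in ids)
    rows = conn.execute(
        "SELECT id FROM package WHERE id IN (%s)" % placeholders, ids
    )
    return {row[0] for row in rows}


# Usage: drop search hits whose dataset no longer exists in the database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE package (id TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO package VALUES (?)", [("a",), ("b",)])
hits = ["a", "b", "stale"]
found = existing_package_ids(conn, hits)
live_hits = [h for h in hits if h in found]
```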

That is all I can think of for now.

Good luck

David