[ckan-dev] performance, caching and authz.

Rufus Pollock rufus.pollock at okfn.org
Tue Jun 14 09:58:44 UTC 2011


On 14 June 2011 10:14, David Raznick <kindly at gmail.com> wrote:
> Hello All
>
> I have steadily been looking at performance in ckan and I now have a
> decent sense of what areas need to be looked at.

This is brilliant David :-)

> My testing has mainly been around synchronous calls based around requests
> that have been coming into ckan.net.  I have taken a sample of 1000 of them.
> For the time being I have excluded api requests as they are in essence
> easily cachable.   I have only taken requests from behind the proxy cache,
> this is in order to get a better spread of possible requests.  I want to
> improve performance in the worst case.
>
> Normal run.  No cache.
> 394 seconds

So this is approx 0.4s per request? What do you think we should aim for?

> I could not get any setting with the current cache options to run
> significantly faster even using 'memory' cache.  There were clearly not
> many repeated requests in the sample.
>
> In memory cache.
> 382 seconds
>
> The biggest amount of queries and processing comes from authz.  So I decided
> to remove them to see what would happen.
>
> removing authz only
> 252 seconds
>
> Genshi is not that fast.  I removed any rendering to see what that meant. So
> we return nothing but still do all the background processing.
>
> no rendering and no template level authz
> 190 seconds
> no rendering and no authz
> 161 seconds
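
For reference, a quick sketch of what the totals quoted above work out to per request and relative to the uncached baseline. It only uses the figures in this thread; the labels and variable names are mine, and it assumes the sample is exactly 1000 requests.

```python
# Totals quoted in this thread, in seconds, for the sample of 1000 requests.
SAMPLE_SIZE = 1000
totals = {
    "no cache": 394,
    "in-memory cache": 382,
    "authz removed": 252,
    "no rendering, no template-level authz": 190,
    "no rendering, no authz": 161,
}

baseline = totals["no cache"]

# Average time per request for each run.
per_request = {name: secs / SAMPLE_SIZE for name, secs in totals.items()}

# Fraction of the baseline time saved by each run.
saving = {name: (baseline - secs) / baseline for name, secs in totals.items()}

for name in totals:
    print(f"{name:40s} {per_request[name]:.3f}s/request  {saving[name]:6.1%} saved")
```

On these numbers, removing authz alone saves roughly 36% of the baseline time, while the in-memory cache saves about 3%.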

[...]

> These times will be increased by a fair chunk if you reintroduce authz.

Very interesting.

> Suggestions...
>
> There are some very slow queries (and too many repeated ones) around, which
> would be good to speed up, however the problem currently is with our fairly
> dynamic pages and our very flexible authz system.

Yes.

> It is very difficult to do decent cache invalidation due to this.  I have
> been trying to think of a decent way, but anything I come up with has ended up
> overly complicated and not great to maintain. I also think we should aim for
> a system that does not manually have to be tuned for certain deployments and
> just works and works fast.

Understood. As you say this would also add substantial complexity.

> I have come up with possible ideas that could solve this.
>
> 1. Do not put any dynamic content in our template for content that varies
> from user to user.   We could add all this content with ajax later (this
> would mainly consist of authorization stuff).  This would mean pages would
> be much more cachable and we could even prefill our cache or even have
> static files that we generate.

Great idea. We already do this basic approach for the 'my account /
login / register' buttons. Let's systematize this and make it a policy
:-)
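
A minimal sketch of the split, assuming a package is a plain dict and that `can_edit` stands in for whatever authz check applies. All names here are hypothetical and not existing CKAN functions. The page body is the same for every user, so it can be cached or pre-generated; the per-user part is a small JSON payload that JavaScript would fetch afterwards.

```python
import html
import json


def render_shared_page(package):
    """Return HTML that is identical for every user, so it can be cached."""
    title = html.escape(package["title"])
    # The empty placeholder is filled in client-side from user_actions().
    return "<h1>%s</h1><div id='user-actions'></div>" % title


def user_actions(package, user, can_edit):
    """Return the per-user fragment as JSON, to be fetched via ajax.

    `can_edit` is a hypothetical callable standing in for the authz check.
    """
    return json.dumps({
        "package": package["name"],
        "user": user,
        "can_edit": bool(can_edit(user, package)),
    })
```

Only `user_actions` needs to run the authz queries, and it runs them for one small response rather than during every template render.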

> 2. Have a special cache just for 'visitors'.  We could catch them low down
> the middleware stack and serve what we have cached.  The same pre-filling
> applies to the above.

I like it.
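
Roughly what I imagine for the visitor cache, as a plain WSGI middleware sketch. Assumptions: a visitor is any GET request without a login cookie, the cookie name `auth_tkt` is a guess that would need checking against the deployment, and the store is an in-process dict with no invalidation at all.

```python
class VisitorCache:
    """WSGI middleware that serves cached pages to requests without a login cookie."""

    def __init__(self, app, auth_cookie="auth_tkt"):
        self.app = app
        self.auth_cookie = auth_cookie  # assumed cookie name, not verified
        self.store = {}  # (path, query string) -> (status, headers, body)

    def is_visitor(self, environ):
        """True for GET requests that carry no login cookie."""
        cookies = environ.get("HTTP_COOKIE", "")
        names = [c.split("=", 1)[0].strip() for c in cookies.split(";")]
        is_get = environ.get("REQUEST_METHOD") == "GET"
        return is_get and self.auth_cookie not in names

    def __call__(self, environ, start_response):
        if not self.is_visitor(environ):
            # Logged-in users and non-GET requests bypass the cache entirely.
            return self.app(environ, start_response)

        key = (environ.get("PATH_INFO", ""), environ.get("QUERY_STRING", ""))
        if key in self.store:
            status, headers, body = self.store[key]
            start_response(status, list(headers))
            return [body]

        # Cache miss: run the wrapped app and capture its response.
        captured = {}
        chunks = []

        def capture(status, headers, exc_info=None):
            captured["status"] = status
            captured["headers"] = list(headers)
            return chunks.append  # supports the legacy write() callable

        result = self.app(environ, capture)
        try:
            chunks.extend(result)
        finally:
            if hasattr(result, "close"):
                result.close()
        body = b"".join(chunks)

        status, headers = captured["status"], captured["headers"]
        sets_cookie = any(name.lower() == "set-cookie" for name, _ in headers)
        # Only store plain successful responses that do not set cookies.
        if status.startswith("200") and not sets_cookie:
            self.store[key] = (status, headers, body)

        start_response(status, list(headers))
        return [body]
```

This would sit near the bottom of the middleware stack, e.g. `app = VisitorCache(app)`. Pre-filling would just mean populating `store` ahead of time; invalidation when a package changes is the part this sketch leaves out.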

I also wonder if there is room for a bit of code simplification. I know, for
example, that we look up (or used to look up!) the user obj in about 3
places in the code for each request (e.g. we pass username into authz
system only to look the user obj up again even though we look up user
obj in BaseController __before__ method).
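
Something like the following is what I have in mind: load the user object once per request and hand the object itself to authz, rather than the username. This is a sketch only; `RequestContext`, `load_user` and `is_authorized` are hypothetical names, and the user is modelled as a plain dict.

```python
class RequestContext:
    """Per-request state; the user object is looked up at most once."""

    def __init__(self, username, load_user):
        self.username = username
        self._load_user = load_user  # hypothetical callable, e.g. a DB query
        self._user = None
        self._loaded = False

    @property
    def user(self):
        # Lazy lookup: the first access hits load_user, later ones reuse it.
        if not self._loaded:
            self._user = self._load_user(self.username)
            self._loaded = True
        return self._user


def is_authorized(context, action):
    """Authz check that takes the context instead of looking the user up again."""
    user = context.user
    return user is not None and action in user.get("allowed", ())
```

The controller would build one context in its `__before__` step and pass it down, so repeated authz checks within a request cost no extra user queries.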

> This is just for info, I will make it into a CREP if we decide to go down
> one of these paths.  Feedback and any other views I would be very interested
> in.

IMO this is just ticketing stuff as opposed to needing a full CREP (in
that it's not some big strategic thing and it would be fine for someone
e.g. you :-) to just go ahead and implement).

Rufus



