[ckan-discuss] Test speed, sqlite and sqlalchemy

Rufus Pollock rufus.pollock at okfn.org
Tue Dec 28 10:35:50 GMT 2010


On 26 December 2010 22:57, David Raznick <kindly at gmail.com> wrote:
> I am new to this project, so Hello.  I am very interested in the concept
> behind it and decided to see what my expertise could bring to it.  I know
> database performance and sqlalchemy very well. I have even helped out a
> little with the performance of sqlalchemy.

Wow, David, this is fantastic! BTW: would you like to be added to our
contributors list?

> The first thing I noticed were the slow tests. I wanted to look the tests to
> learn how the project worked.  I was also concerned about the old sqlalchemy
> (0.4). So I decided to see what I could do...
> I have managed to get the tests down to running in about 9 mins on my oldish
> laptop.
> This was done by:
>  * a patch I submitted on bug 868, which stopped the continuing dropping and
> creating of tables as this has a large overhead.  Just simply deleting
> everything in all the tables is much faster.
>  (http://knowledgeforge.net/ckan/trac/ticket/868)

Great, I'd wondered about this myself and it's great you've done it --
and very nicely with the table iterators -- that's a trick I'll learn
from :)

>  * turning off durability in postgres.  You do not need durability in
> testing.  I even tried putting the database in a ramfs but I decided the
> complication was not worth it, the speed up was not significant.
> (http://www.postgresql.org/docs/9.0/static/non-durability.html) works for
> 8.x too.

Another great tip.

>  * upgrading to  sqlalchemy 0.5.7  (this gives a good 15-20% speed up).
>  * sorting out some really slow bits in the tests.  There where almost 3
> mins of unessential sleeping/timeouts.

Again, great!

> The trickiest bit was the upgrade to 0.5.  I have all the tests working
> apart from 2 and I understand why they are not working.  They are not
> critical and do not effect the speed.  This is for another discussion.

Great.

> The functional subdirectory in the tests is by far the slowest (as it should
> be).  Taking up 6.5 mins of the 9.
> A speed per test sheet has been posted on ticket 868.
> There has been some discussion on ticket 867
> (http://knowledgeforge.net/ckan/trac/ticket/876) about using and in memory
> sqlite as a backend.  In those I am clearly against it.  I think the
> complication it adds to the code base is not worth it.  I also do not think
> that it will speed things up as much as hoped ...

I definitely agree with you about complication. It would be 'nice' to
support sqlite but no-one is going to run against it in production and
our main reason for doing so was test speed. I'd be happy for us to
drop ticket #876 (perhaps using the efforts done so far).

@Seb (as the owner of ticket 876): what do you think here?

> I have done some profiling on the whole test suite.  Only about 110 seconds
> of the 9 mins is spent waiting for the database.  So even if you managed to
> get rid of all of that time (which you will not as in memory databases are
> not instantaneous) you may only get a speed up of a third.

Right.

> Around 50% of the time is spent in sqlalchemy.  So the bottleneck is there.
>  I think upgrading to 0.6 may be as fruitful.  sqlalchemy is going under
> some big speed improvements
> (http://techspot.zzzeek.org/2010/12/12/a-tale-of-three-profiles/).  There

Ok, so 0.5 -> 0.6 looks like about a 20% reduction while 0.5 -> 0.7
looks like a 50% reduction (i.e. doubling in performance)!

> may be some benefit from not going through the orm  part for some
> performance critical areas of the code.  However, we will need to analyse
> real user database traffic to determine this.
> As a side note, the really good news is that hardly any time is spent in
> ckan code itself.
> I hope you all had a good seasonal period.
> David
> Please get back to me on any details on the above!

We'll look through in more detail but let me reiterate the massive
thank-you for your work and advice here -- really useful and really
appreciated.

Regards,

Rufus



More information about the ckan-discuss mailing list