[ckan-discuss] Test speed, sqlite and sqlalchemy

Seb Bacon seb.bacon at gmail.com
Tue Dec 28 13:35:14 GMT 2010


Hi,

Yes, thanks very much for all the input so far.

With reference to the below, I've pushed all the work I've done so far
to https://bitbucket.org/sebbacon/ckan.  It's more or less ready to
go, except there are two failures against sqlalchemy 0.6.5 at the
moment which I'll send a separate email about (possibly not today, as
have guests arriving shortly...!)  The fork passes all tests against
sqlalchemy 0.4.x.

On 28 December 2010 10:35, Rufus Pollock <rufus.pollock at okfn.org> wrote:
> On 26 December 2010 22:57, David Raznick <kindly at gmail.com> wrote:
>> I am new to this project, so Hello.  I am very interested in the concept
>> behind it and decided to see what my expertise could bring to it.  I know
>> database performance and sqlalchemy very well. I have even helped out a
>> little with the performance of sqlalchemy.
>
> Wow, David, this is fantastic! BTW: would you like to be added to our
> contributors list?
>
>> The first thing I noticed were the slow tests. I wanted to look at the tests
>> to learn how the project worked.  I was also concerned about the old
>> sqlalchemy (0.4). So I decided to see what I could do...
>> I have managed to get the tests down to running in about 9 mins on my oldish
>> laptop.
>> This was done by:
>>  * a patch I submitted on bug 868, which stopped the continual dropping and
>> creating of tables, as this has a large overhead.  Simply deleting
>> everything in all the tables is much faster.
>>  (http://knowledgeforge.net/ckan/trac/ticket/868)
>
> Great, I'd wondered about this myself and it's great you've done it --
> and very nicely with the table iterators -- that's a trick I'll learn
> from :)

Yes -- this is integrated in my fork with an ini flag
(test_fast_repo_delete_hack).
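For anyone curious what the table-iterator trick looks like, here is a minimal sketch (illustrative schema and function names, not CKAN's actual implementation behind test_fast_repo_delete_hack): create the tables once, then between tests just empty them, walking the tables in reverse dependency order so foreign keys are never violated.

```python
from sqlalchemy import (Column, ForeignKey, Integer, MetaData, String,
                        Table, create_engine)

engine = create_engine("sqlite:///:memory:")
metadata = MetaData()

users = Table("users", metadata,
              Column("id", Integer, primary_key=True),
              Column("name", String(50)))
posts = Table("posts", metadata,
              Column("id", Integer, primary_key=True),
              Column("user_id", Integer, ForeignKey("users.id")))

metadata.create_all(engine)  # done once, not per test


def clean_db(engine, metadata):
    """Empty every table between tests instead of drop/create.

    sorted_tables lists parents before children, so iterating in
    reverse deletes child rows first and keeps FK constraints happy.
    """
    with engine.begin() as conn:
        for table in reversed(metadata.sorted_tables):
            conn.execute(table.delete())


# simulate a test leaving data behind...
with engine.begin() as conn:
    conn.execute(users.insert().values(id=1, name="seb"))
    conn.execute(posts.insert().values(id=1, user_id=1))

# ...then clean up cheaply before the next one
clean_db(engine, metadata)

with engine.connect() as conn:
    remaining = conn.execute(users.select()).fetchall()
print(len(remaining))  # 0
```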

>>  * turning off durability in postgres.  You do not need durability in
>> testing.  I even tried putting the database in a ramfs but I decided the
>> complication was not worth it, the speed up was not significant.
>> (http://www.postgresql.org/docs/9.0/static/non-durability.html) works for
>> 8.x too.
>
> Another great tip.

Yes, I didn't know about this -- fantastic.  Will update documentation
accordingly once we're ready to roll out all these changes.
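For reference, the settings from the non-durability chapter linked above boil down to a few lines in postgresql.conf. This is for a throwaway test cluster only; every one of these trades crash safety for speed:

```
# test clusters ONLY -- these sacrifice durability for speed
fsync = off
synchronous_commit = off
full_page_writes = off
```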

>>  * upgrading to sqlalchemy 0.5.7 (this gives a good 15-20% speed up).

I hopped straight from 0.4.8 to 0.6.5, and got a 20% speed up overall
-- so it doesn't look like the most recent changes have had that much
benefit for our test suite, at least.

>>  * sorting out some really slow bits in the tests.  There were almost 3
>> mins of unnecessary sleeping/timeouts.

I've not done this -- it would be great if you could supply patches
(or a fork, or whatever) for these.

>> The trickiest bit was the upgrade to 0.5.  I have all the tests working
>> apart from 2, and I understand why they are not working.  They are not
>> critical and do not affect the speed.  This is for another discussion.

I wonder if they are the same two that I can't get to work.  Will
start another thread for this.

>> The functional subdirectory in the tests is by far the slowest (as it should
>> be), taking up 6.5 mins of the 9.
>> A speed-per-test sheet has been posted on ticket 868.
>> There has been some discussion on ticket 876
>> (http://knowledgeforge.net/ckan/trac/ticket/876) about using an in-memory
>> sqlite backend.  In that discussion I am clearly against it.  I think the
>> complication it adds to the code base is not worth it.  I also do not think
>> that it will speed things up as much as hoped ...
>
> I definitely agree with you about complication. It would be 'nice' to
> support sqlite but no-one is going to run against it in production and
> our main reason for doing so was test speed. I'd be happy for us to
> drop ticket #876 (perhaps using the efforts done so far).
>
> @Seb (as the owner of ticket 876): what do you think here?

My opinion is that it's up to the developer how they want to run the
tests: give them the option and make them aware of the risks/benefits.
Being able to offer the *option* of doing it in sqlite is a good
thing for me, at least: while for David the tests run in 9 minutes
with postgres, on my laptop they take twice as long to run (*with* the
suggested durability changes).  By contrast, against sqlite, the tests
run in just over 6 minutes for me.
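Since SQLAlchemy takes a database URL, offering that option need not touch much code. A rough sketch of the idea (the config key and default here are illustrative, not CKAN's actual behaviour):

```python
from sqlalchemy import create_engine, text


def engine_from_config(config):
    # "sqlalchemy.url" mirrors the usual ini key; the in-memory sqlite
    # default here is purely illustrative
    url = config.get("sqlalchemy.url", "sqlite:///:memory:")
    return create_engine(url)


# a developer's quick local cycle: in-memory sqlite
fast_engine = engine_from_config({})

# before pushing, the same code path against postgres, e.g.:
# pg_engine = engine_from_config(
#     {"sqlalchemy.url": "postgresql://ckan:pass@localhost/ckan_test"})

with fast_engine.connect() as conn:
    print(conn.execute(text("select 1")).scalar())  # 1
```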

I suspect the discrepancy is mostly due to the known (but not
understood) issues others have found running the tests on Lucid
(notwithstanding skipped search-related tests for sqlite).  So others
might find less benefit in using sqlite than I do.

The integration server should certainly always test against postgres,
of course.  The goal is to make the local develop/test *cycle* easier,
because right now many people don't do it.  I would still expect to
run tests against postgres before a push, but not during development
cycles.  I believe that David's objection to supporting sqlite as an
option is that it "unnecessarily adds complication to production code"
(David, is that right?), but I don't believe it does. I had to change
quite a lot of the *tests* because they assumed persistent fixtures,
but none of the main codebase.  All the changes I did make to the main
codebase were related to the sqlalchemy upgrade, which is desirable in
its own right.  The only problem I personally see with the sqlite
backend is that it doesn't include search-related tests, though, as
discussed elsewhere, these will hopefully be factored out soon anyway.

In short, I think it's absolutely desirable to support sqlite
in-memory as an option for quick test/develop cycles, as long as it's
clear that they're not a replacement for a full postgres test.

Furthermore, as a minor side-effect, it also takes us a step towards
supporting other backends in the future.

>> I have done some profiling on the whole test suite.  Only about 110 seconds
>> of the 9 mins is spent waiting for the database.  So even if you managed to
>> get rid of all of that time (which you will not, as in-memory databases are
>> not instantaneous) you may only get a speed up of a third.
>
> Right.

But considering the sqlalchemy upgrade gives a speed up of a fifth,
that's not to be sneezed at :)
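For anyone wanting to repeat David's measurement locally, the shape of it is simple: run a workload under cProfile and see how much cumulative time lands in database calls versus everything else. A toy sketch (illustrative only; the real measurement was over the whole CKAN suite against postgres):

```python
import cProfile
import pstats
import sqlite3


def workload():
    """Stand-in for a test run: some inserts and a query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (x INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?)",
                     [(i,) for i in range(10000)])
    count, = conn.execute("SELECT count(*) FROM t").fetchone()
    conn.close()
    return count


profiler = cProfile.Profile()
rows = profiler.runcall(workload)

# sort by cumulative time; filtering on "execute" isolates DB calls
stats = pstats.Stats(profiler).sort_stats("cumulative")
stats.print_stats("execute")

print(rows)  # 10000
```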

>> Around 50% of the time is spent in sqlalchemy.  So the bottleneck is there.
>>  I think upgrading to 0.6 may be as fruitful.  sqlalchemy is undergoing
>> some big speed improvements
>> (http://techspot.zzzeek.org/2010/12/12/a-tale-of-three-profiles/).  There
>
> Ok, so 0.5 -> 0.6 looks like about a 20% reduction while 0.5 -> 0.7
> looks like a 50% reduction (i.e. doubling in performance)!

Not what I've found in our test suite.  Against my fork, on my laptop:

- the postgres changes give a 5x speedup
- upgrading from sqlalchemy 0.4.8 to 0.6.5 gives a 20% speedup on top of this
- sqlite in-memory gives a 3x speedup on top of this

Once again, this could be an artefact of the mysterious Lucid issues.
I'd be interested to get comparative numbers from someone else.

>> may be some benefit from not going through the ORM part for some
>> performance-critical areas of the code.  However, we will need to analyse
>> real user database traffic to determine this.

I think this will be worthwhile.  Apart from the issue of getting a
fast-running test suite, I believe there are reported performance
issues with CKAN.  Others know more about this than me.
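"Not going through the ORM" usually means dropping to SQLAlchemy Core for hot paths: plain table constructs and connection-level execution, skipping the ORM's object construction, identity map and unit-of-work. A sketch with a made-up table (not CKAN's schema):

```python
from sqlalchemy import (Column, Integer, MetaData, String, Table,
                        create_engine, select)

engine = create_engine("sqlite:///:memory:")
metadata = MetaData()
packages = Table("package", metadata,
                 Column("id", Integer, primary_key=True),
                 Column("name", String(100)))
metadata.create_all(engine)

with engine.begin() as conn:
    # Core executemany insert: no ORM objects ever built
    conn.execute(packages.insert(),
                 [{"id": i, "name": "pkg%d" % i} for i in range(3)])
    # Core select: rows come back as lightweight tuples, not mapped objects
    names = [row.name for row in
             conn.execute(select(packages.c.name)
                          .order_by(packages.c.id))]

print(names)  # ['pkg0', 'pkg1', 'pkg2']
```

As David says, whether this is worth the loss of convenience depends on real traffic, which is why profiling actual user load should come first.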

>> As a side note, the really good news is that hardly any time is spent in
>> ckan code itself.
>> I hope you all had a good seasonal period.
>> David
>> Please get back to me on any details on the above!
>
> We'll look through in more detail but let me reiterate the massive
> thank-you for your work and advice here -- really useful and really
> appreciated.

And from me too!

Seb



More information about the ckan-discuss mailing list