[ckan-dev] [ckan-discuss] Test speed, sqlite and sqlalchemy

Thu Dec 30 22:06:09 UTC 2010

On Thu, Dec 30, 2010 at 2:09 PM, Seb Bacon <seb.bacon at gmail.com> wrote:

> Hi,
>
> Again with reference to below, have pushed further changes to
> https://bitbucket.org/sebbacon/ckan.
> Have cc'd ckan-dev for general dev interest and/or feedback!
>
> On 30 December 2010 03:02, David Raznick <kindly at gmail.com> wrote:
> > Hello
> > I am emailing off the list as I think it would be too detailed for it.
> >
> >>
> >>  I believe that David's objection to supporting sqlite as an
> >> option is that it "unnecessarily adds complication to production code"
> >> (David, is that right?), but I don't believe it does. I had to change
> >> quite a lot of the *tests* because they assumed persistent fixtures,
> >> but none of the main codebase.
> >
> >
> > You could be right,  I expected more conversion errors for date, unicode
> and numeric types.  If you can get round these by changing the tests like
> harvesting unicode conversions then great. The extra speed may be worth it.
>
> FYI I've seen no errors relating to date or numeric types at all.


> >> With reference to the below, I've pushed all the work I've done so far
> >> to https://bitbucket.org/sebbacon/ckan.
> >
> >
> > I have done a little code review, as we have covered similar ground.   I
> have used a codereview tool.  Its a nice little app to stop the email from
> getting too much code in it.  The main link is
> http://codereview.appspot.com/3772048/ , however I will post the
> individual links.
>
> That tool is very nice.

Its actually a version of the one Guido uses in google.  Its got a very nice
script which works for hg and git.  For the above all you have to do is run

python upload.py --rev=c892749f74f0:f061979c7d46.    It has a bug though, it
does not deal with deleted files and that's why I made the diff the wrong
way.



> > **WARNING the diffs are in the wrong direction, the changes are on the
> left or are marked "-" in diffs.  Also click on the comments to see them
> properly**
> > I will just do a list of points vaguely in order of importance.  I
> understand you are mid work on this but thought it might be some help.
>
> Indeed -- two brains are much better than one :)  Especially as you
> are clearly far better acquainted with sqlalchemy than me.  Shame we
> weren't collaborating more earlier, but at least we know for next
> time.
>
> >
> > * I seriously think that we should *not* change the models.  I think the
> unique constraints should stay. If we do remove them we will need a
> migration script.  Removing them causes lots of tests to fail.  e.g
> http://codereview.appspot.com/3772048/patch/1/9
>
> Doesn't cause any tests to fail for me... can you specify?  I had
> already created a migration script which I'd forgotten to commit (now
> in there).
>
The errors only occur when there is a unique constraint on the columns. i.e
I get unique constraint errors.

It seems a shame to need any migration at all.  On the other thread people
did not seem too worried about it.  However, this could complicate (at leats
slow)  upgrading any production databases out there.  Maybe it will be fine,
but it feels me with unease.  Also it makes sense for the tags table, for
example, to keep the unique constraint.

>
> > *
> http://codereview.appspot.com/3772048/diff/1/ckan/model/domain_object.py.
>  Postgres is slow at doing anything to do with table creation, dropping
> and inspecting.  The profiler says the  by_name function is run over 20,000
> times during the tests.  This causes at least a 100 seconds slowdown in
> postgres.  The original code did not need to be changed.   As a side note,
> the calls to init_db also slow postgres down due to the repeated
> metadata.create_all(engine) calls.  Both these cause some of
> the discrepancy between our postgres timings.  My posgres takes about 15
> mins with your branch.
>
> OK, have reverted that change.  It was one of the earlier changes I
> introduced to fix test failures -- I imagine it was one that has
> subsequently been fixed by overriding the autoflush madness.  Either
> way, the tests now pass without it.
>
> Re init_db, I am not sure what the alternative to calling it in the
> setup of most tests, if we want tests to run independently of each
> other.
>

You could check if metadata has the tables already loaded (by checking
metadata.sorted_tables or metadata.table_iterator).  The metadata (and
model) is global and persists between tests. There is no way to run
tests independently anyway because of this.

>
> > * The new indexing stuff does not appear to work for me yet. I think you
> just have not added the new index.txt file.
> http://codereview.appspot.com/3772048/diff/1/ckan/model/__init__.py
>
> Yes, this was unfinished.  The reason I paused here was because I was
> getting a very odd "index already exists" error (odd in that it
> appeared to be an exception thrown from the "self.metadata.create_all"
> call in create_db).  I just rewrote it a little more cleanly and
> uncommented the sample index in indexes.txt; try rerunning the tests
> and let me know if it's obvious to you why it's failing.
>
> Incidentally, I believe that the vast majority of the indexes that
> were being created by 021_postres_upgrade.sql were redundant as they
> were primary or foreign keys, so they should already *be* indexed.
>
> > * The patch I sent was supposed to use delete not drop.  See comments in
> http://codereview.appspot.com/3772048/diff/1/ckan/model/__init__.py.
>
> The reason I changed this to 'drop' was to accurately replace the
> semantics of the clean_db method we're overriding in vdm's tools.py
> (which does a metadata.drop_all())
>
> On the other hand, I suppose a drop is much slower than a delete all,
> and this is test-specific code, so perhaps I shouldn't worry about
> that.  My concern is that any other tests which might be added in the
> future may depend on the "drop" semantics of vdm's clean_db().
>

I could only see "drop" being useful if someone was testing migrations i.e
schema changes in the tests.  Doing such things you would explicitly make
sure you would drop and recreate anyway between tests.

>
> > * http://codereview.appspot.com/3772048/patch/1/2.  The attached patch
> authz.patch  fixes your only major error.  The authz query is wrong in new
> versions of sqlalchemy. You are no longer allowed to use the parent
> table model.UserObjectRole in the filter clause as it results in a cross
> join and returns ALL the rolls!.  You have to use UserRole or GroupRole etc.
>  There is a way to get out these classes from the entity you put in as shown
> in the patch.
>
> Great - I already knew about this error as noted in the tests but
> wasn't sure of the right way to fix it.  Thanks!
>
> >
> > *
> http://codereview.appspot.com/3772048/diff/1/ckan/controllers/package.py
> http://codereview.appspot.com/3772048/diff/1/ckan/lib/package_saver.py.
>  I solved the detached from session errors by adding a few
> Session().autoflush = False.   I am not sure your other fixes in these files
> are needed.
> > The new version of sqlalchemy more aggressively autoflush. It does it
> when you try looping through a one to many collection i.e pkg.groups.   I
> personally hate autoflush.  Attached in autoflush.diff are where I put the
> stop autoflushes to stop the errors.
>
> I hate autoflush too :)  It's horribly magical behaviour.
>
> It doesn't fix the need to do the pkg.groups hack, though; without it
> the test at functional.test_package.test_edit_bad_log_message still
> fails.
>
I have not had this error yet in maybe a 0.6 thing.

>
> >
> > * the alphabet paging hack can be done properly in 0.6
> http://codereview.appspot.com/3772048/patch/1/4.  by using
> http://www.sqlalchemy.org/docs/orm/query.html#sqlalchemy.orm.query.Query.column_descriptions
>
> Thanks.  So I've tried a solution based on column_descriptions, but it
> still looks hacky.  You'll see what I mean at
>
> https://bitbucket.org/sebbacon/ckan/changeset/98198a6b0879#chg-ckan/lib/alphabet_paginate.py
>
> Am I still missing a more elegant way of getting the necessary
> entities via the API?
>

There is not I am afraid. The column descriptions was added for that purpose
and at least it is not private like the other ways.

>
> > * Attached is less_polling.diff which means you do not have to wait for a
> whole second each request.
>
> Great.
>
> The only other error left when running the test suite against
> sqlalchemy 0.6.5 (in addition to the duplicate index issue mentioned
> above) is the "stale association proxy" error that is commented in
> functional.test_package.test_calculate_etag_hash.  I have to admit
> that as a sqlalchemy newbie I'm rather baffled by it -- any ideas?
>

I have not got that error.  I have only tested on 0.57 though. Not got any
ideas I am afraid.

>
> Cheers,
>
> Seb
>
A point that I forgot to mention in the last post was the making the
test.ini is a symlink and the original got deleted.
http://codereview.appspot.com/3772048/diff/1/test.ini.    I thought for
development you just changed your personal develoment.ini and the test.ini
was fixed.
Also the pip requirements where changed to your repo.
http://codereview.appspot.com/3772048/patch/1/52.

I have got another cheeky patch.  It causes the call to the fsf website to
fail more often but can speed up the tests by over a minute. There is a
group of tests that test connection failure (TestGeoNetworkClientSiteDown in
test_cswclient).  There are 8 tests and they each connect 2 times, so you
waste 80 seconds if the timeout is 5.   You may want to experiment with
reducing the timeout, probably not as low as 0.5 as in the patch, even
though that worked.

diff -r 7f2239b0f743 ckan/tests/lib/test_cswclient.py

--- a/ckan/tests/lib/test_cswclient.pyFri Dec 17 10:34:47 2010 +0000
+++ b/ckan/tests/lib/test_cswclient.pyMon Dec 27 02:00:10 2010 +0000
@@ -11,7 +11,7 @@
 from mock_cswclient import MockGeoNetworkClient

   import socket
   -socket.setdefaulttimeout(5)
   +socket.setdefaulttimeout(0.5)

     class CswRequestTestCase(CheckMethods):

I have read over the original thread on the test upgrades now (I would have
saved some time by reading it).  I think it is a shame that the sqlite
upgrade is not independent of sqlalchemy one. I was able to to do the
sqlalchemy to 0.57 on its own.  I think it would be too messy to merge back
now though.  I can see your rationale though, tests taking 40mins a pop
would be like wading through treacle.  Anyway, I have gone through all the
changes I made so it may still be possible.

I have not got too much time to work on this any more.  I will review
anything in particular that comes up and of course will answer any
questions.

David
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.okfn.org/pipermail/ckan-dev/attachments/20101230/271d3eda/attachment-0001.html>