[ckan-dev] tests progress

Seb Bacon seb.bacon at gmail.com
Wed Dec 22 17:35:02 UTC 2010


Hi,

On 22 December 2010 16:37, Rufus Pollock <rufus.pollock at okfn.org> wrote:
> Hi Seb,
>
> Great work! I'm sorry this has proved a bit painful but I'm sure we
> will benefit massively from it.
>
> On 22 December 2010 16:04, Seb Bacon <seb.bacon at gmail.com> wrote:
> [...]
>> Looking at the tests profiles, it was clear that 99% of the time was
>> spent in database I/O (including the overhead of setting up
>> connections), so the biggest win appeared to be moving the tests to a
>> SQLite in-memory database.
>
> I also think for postgres, it may be much cheaper to do mass delete
> than to drop/create table (but that is another matter).

Yes, I have a note to do that for the postgres case separately.

>> In order to do this, I quickly found out I needed to upgrade the
>> venerable SQLAlchemy packaged with CKAN, from 0.4.8 to 0.6.5.
>
> So to confirm: sqlite will not work with sqlalchemy 0.4.8 (what about
> 0.5?). If so we should up requirement in setup.py ...

Right, though hopefully postgres will work with old and new versions.
So do we definitely want to introduce a requirement for sqlite (for
running tests quickly) or not?

I had been assuming we'll just remove the <0.4.99 requirement for
sqlalchemy from setup.py and leave a note in the docs for people who
want to use sqlite.

>> So the task became two parts: upgrade SQLAlchemy, and get the tests
>> running against a SQLite in-memory database.
>>
>> Rufus sensibly cautioned that these would be best attempted
>> separately; however, despite my best efforts, this turned out to be
>> practically impossible.  The SQLAlchemy upgrade broke a number of
>> tests, and the run-tests fix-tests cycle was just too painful.  So the
>> approach I've taken is: fix the tests with the sqlite in-memory
>> database, *then* try running and fixing them against postgres and the
>> upgraded SQLAlchemy.
>
> Understood.
>
>> It's proved much harder than I expected, partly because of the sheer
>> number of tests that need to be modified in minor ways, and partly
>> because of a couple of particularly gnarly bugs.  The work still isn't
>> finished but this is an update on progress.
>>
>> Currently, the code lives in a branch at
>>
>>   https://bitbucket.org/sebbacon/ckan/src/9da772927de6
>>
>> To try it, you will need to alter your test.ini:
>>
>>  [app:main]
>>  sqlalchemy.url = sqlite:///
>>
>> You'll also need to apply sqlalchemy.patch (in the root of the
>> software checkout) to the installed sqlalchemy.
>
> Hmmm. That's nasty. Is there any way round that. Especially if we want
> to deploy on production instances we'll need to work out a fix for
> that.

I've not got round to checking but I think it's a bug in sqlalchemy,
so the fix is to submit the patch (plus test) to sqlalchemy.  Until
then I guess we monkeypatch.

>> Summary of changes
>> ------------------
>>
>> 1) Assumption of persistent fixtures
>>
>> In many cases, the tests only work because in previous tests, various
>> fixtures had been set up.  Obviously, when running against an
>> in-memory database, this becomes a problem.  It's a problem anyway if
>> you want to be able to run individual tests, which we should be able
>> to do if we're to make running tests easier.
>
> That's a bug, so good we have found and fixed these.
>
>> Generally speaking, the fix has been to add "model.repo.init_db()" or
>> "CreateTestData" calls to test setup methods.
>
> Great.
>
>> 2) Skip postgres-specific functionality
>>
>> Postgres-specific functionality that won't work in sqlite includes:
>>
>>  - text indexing / searching
>
> searching we are refactoring right now and i think you are right this
> should move out of core (certainly for the full-text stuff)
>
>>  - regex matching (used in alphabetic batching)
>
> Alphabetic batching should be removed. We could do this as part of
> this work (or probably better: disable tests for time being and once
> sqlite/sqlalchey upgrade is done we remove, or (possibly even better):
> remove these now in default and then merge into your branch)
>
>>  - setting up indexes
>
> Those should be changed to be generic. Should be trivial: just use
> sqlalchemy's create index ...

It was in a migration... will have a look.

>> For now, tests relating to the first two above are conditionally
>> skipped, and the indexes are simply not added for sqlite.  The
>> searching should probably be better implemented in future via
>> pluggable searches including a null search plugin, perhaps?
>
> Agree with both.
>
> [...]
>
>> 4) Crazy non-deterministic revision-based errors
>>
>> I lost lots of time on this.  I found that running tests involving new
>> revisions multiple times sometimes failed, sometimes didn't.  I
>> eventually deduced that this would sometimes fail:
>>
>>  group = model.Group()
>>  model.Session.add(group)
>>  group.name = "b0rked"
>>  model.repo.new_revision()
>>  model.Session.flush()
>>
>> But this would never fail:
>>
>>  model.repo.new_revision()
>>  group = model.Group()
>>  model.Session.add(group)
>>  group.name = "b0rked"
>>  model.Session.flush()
>>
>> It turned out that the non-deterministic failures were related to
>> changes in sqlalchemy's topological sort, which is called to decide in
>> which order to execute outstanding actions on a flush.
>>
>> The current (0.6.5) implementation or sorting ultimately relies on
>> both dictionary values and sets, which are of course unordered
>> collections (see file topological.py, which relies on sets to order;
>> also, the variable "allitems" is ultimately derived from dictionary
>> values, which are also unordered).  So I assume this is a bug -- it
>> would certainly explain the non-deterministic nature of the errors.
>>
>> I'd wasted enough time on this already without trying to work out the
>> correct fix in sqlalchemy, so I just changed the tests so that any
>> call to new_revision() happened *before* an changes to an object.
>> This seemed to sort out any ambiguity in terms of ordering (as the
>> changes set up by new_revision() where always flushed by the time any
>> other changes happened).
>
> OK, sounds very sensible. I'm sorry for your pain here :)
>
>> However, this feels brittle: is it by design that the order of setting
>> up a new_revision and then changing an object previously didn't
>> matter?  Were we previously relying on SQLAlchemy to sort out the
>> right dependency tree for us, and if so on what basis?  I am a little
>> hazy about this.
>
> I should say I'm planning a big overhaul of vdm (to be simpler). So i
> think we live with this 'brittle' ordering requirement for the time
> being.

OK, good to have that confirmed.   The magick in the way it's designed
to be as transparent as possible makes it very hard for this newbie to
get to grips with -- so, simple will be good :)

>> 5) Form validation and database integrity
>>
>> With a SQLAlchemy upgrade, tests which checked that users aren't
>> allowed to set empty fields (e.g. package names) started throwing
>> IntegrityErrors, because the underlying column (i.e. 'name') had
>> unique=True in their table definitions (and the tests were resetting
>> names to blank multiple times).
>>
>> The sqla upgrade seems to allow IntegrityErrors to bubble up earlier than
>> previously -- I didn't get round to working out why.
>>
>> *Should* name-uniqueness be enforced at the database level?
>
> I'm easy either way here. Happy to disable this but it will probably
> require an upgrade script to remove the requirement.

Right

>
>> 6) Other random errors
>>
>> I'm getting there and the tests all run in about 6 minutes; however,
>> there's a few tests that still need fixing and examining.  If you want
>> to have a look, they are basically all the tests with XXX near them in
>> the changes since the start of my branch.  Most of them are in
>> ckan/tests/functional/test_package.py
>
> OK. I'll see what I can do.

Have just pushed some further updates to this.

The current huge mystery for me is described in the comments around here:
https://bitbucket.org/sebbacon/ckan/src/d6ccc763d66b/ckan/tests/functional/test_package.py#cl-1407

when accessing self.anna.extras, we get a "InvalidRequestError: stale
association proxy, parent object has gone out of scope".

As the comments say, right now, I'm unclear even how the "extras"
attribute is provided to a Package.  The error clearly says it's an
association proxy, but if I grep through the code I can't find where
it's set, anywhere.

I also don't understand the error message anyway; does this mean that
self.anna has gone out of scope?  Is that because nothing's
referencing it and it's been GCd, or what?

Thanks,

Seb




More information about the ckan-dev mailing list