[ckan-dev] test failures

Seb Bacon seb.bacon at gmail.com
Fri Feb 4 13:15:09 UTC 2011


On 4 February 2011 11:21, David Read <david.read at okfn.org> wrote:
> On 4 February 2011 10:40, Seb Bacon <seb.bacon at gmail.com> wrote:
>>
>> Hi,
>>
>> On 4 February 2011 09:04, David Read <david.read at okfn.org> wrote:
>> >
>> >
>> > On 3 February 2011 22:35, David Raznick <kindly at gmail.com> wrote:
>> >>
>> >>
>> >> On Thu, Feb 3, 2011 at 9:50 PM, David Read <david.read at okfn.org> wrote:
>> >>>
>> >>> Hi all,
>> >>>
>> >>> I've changed buildbot to run ckan tests twice now: firstly with sqlite
>> >>> and then again with postgres in the same buildbot script (I should
>> >>> split it
>> >>> into two tasks, but that is for another day). I've fixed a few
>> >>> problems with
>> >>> the tests under sqlite and they pass now, but there are still problems
>> >>> with
>> >>> tests in postgres.
>> >>> Seb changed lots of 'rebuild_db' statements to 'clean_db', which then
>> >>> requires 'init_db' at the start of the next test. So there are now
>> >>> 'init_db'
>> >>> statements at the start of a great many tests. Was there any reason
>> >>> for
>> >>> this?
>>
>> Yes, see below.  My intent was for them to be at the start of *all*
>> tests.  Clearly for whatever reason the fact they weren't was being
>> hidden by another bug.
>>
>> >> This may not be necessary any more with the fix we did the other day on
>> >> migrate.  However, the in-memory sqlite database gets lost mysteriously
>> >> sometimes and it's hard to figure out where.  That is why I thought the
>> >> init_dbs were there.
>> >
>> > These problems are with postgres in this case, and there is no mystery
>> > about
>> > losing the d.b. - we call clean_db!
>> > Note to Seb: Things were probably working in the past because the
>> > 'faster_db_hacks' config option wasn't being read properly, which David
>> > and
>> > I fixed last Friday, so the full init_db was happening even with
>> > postgres,
>> > whereas it's not now.
>>
>> Ah, I wondered about that, seeing as the tests all passed before.
>> Apologies -- at least now I know about the asbool converter :)  As
>> below, my intent was always that the database *should* be recreated
>> with each test.
>>
>> > Also the init_db parameter 'conditional' (BTW what a
>> > terrible name for a parameter...!) is not being used.
>>
>> It's not the best, no, but let's not start picking on each other's
>> individual variable name errors :)  It's an accidental artefact of an
>> earlier refactoring, just get rid of it...
>
> Ok, but it's used a few times and because the name gives no idea of the
> intention it's hard to know whether the init_db calls that use this param
> can be removed too. Hence my suggestion for a tidy up.
>
>>
>> > I think basically this
>> > all needs a bit of a review / tidy up to get it to make sense and work
>> > too.
>>
>> Either there's something I don't fundamentally get, or we are talking
>> at cross purposes.  Let me try to untangle at least my own intent:
>
> Yes, that's what I'm after - thanks.
>
>>
>> >>> It seems a right faff to remember this - is there no better way?
>> >>> Indeed I
>> >>> believe the current test failures are caused by a test not having an
>> >>> init_db.
>>
>> When refactoring the tests, I noticed that some called "clean_db" and
>> "init_db" and others didn't.  This was inconsistent.
>
> Looking at cset:c892749f74f0 (before you made changes) there were three
> calls to clean_db in the tests. In all cases they are immediately
> followed by a create_db or a rebuild_db. So the tests *were* consistently
> expecting the tables to be created.

To be clear, when I say "some called clean_db and init_db" I included
rebuild_db in my meaning.

I should have said, "some needed to clear and recreate the fixtures,
but others assumed the fixtures already existed".

>> Most of the
>> tests assumed test fixtures had been set up,
>
> I don't believe that is true. All test classes assume the tables are empty.
> All over the place you see setup methods creating test data and teardowns
> deleting it. When the teardown is lazy, they call rebuild_db, which in the
> past was slower, but sure.

I've not got time to go back through all the changesets, but I don't
believe you're right.  There were several test classes that didn't
assume empty tables: they assumed that VDM plus a sample User (what
init_db does) had been set up, yet had no call to init_db at the
start.  It wasn't always clear from looking at the tests which ones
definitely needed init_db, because so many of them are functional
tests which rely on package and VDM stuff deep in the call tree.
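
For a flavour of why it's hard to tell: a functional test typically
looks something like the sketch below (the base class name and the
URL are from memory and may not be exactly right), and nothing in it
says whether the controller it drives ends up needing VDM or the
default users:

    from ckan.tests import TestController

    class TestPackageIndex(TestController):
        def test_index(self):
            # Nothing here mentions VDM or users, but the controller
            # code this exercises may well assume init_db has run.
            res = self.app.get('/package')
            assert res.status == 200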

Indeed, some tests set up and tore down their own specific fixtures
*in addition* to that.

>> but some had to assume they'd been set up but not reset.
>>
>> This state of affairs is much
>> *more* of a faff when trying to debug problems, than knowing that each
>> test is responsible for setting up the fixtures it needs to run.
>> Previously, some (but not all) of the tests had a dependency on some
>> of the others.
>
> I agree there shouldn't be dependencies between tests. Certainly there have
> been a few test methods which relied on state left by a previous method in
> the class. The ones I've come across I've either decided are a reasonable
> compromise and labelled very clearly, or I've gone to the trouble to correct
> them.

I'm not talking about such cases; that all seems reasonable, of course!

> Yes there are a few tests which remain 'at large', and we should focus
> on stamping them out, rather than throwing the baby out with the bathwater
> to delete and recreate the tables with every test class.

I partially agree, though I am not sure what baby is being thrown out
with the bathwater.  At the worst, there are a few init_db() calls
that don't need to be there (and evidently a few others that should be
added in, which were obscured by a bug in the tests).

>> For example there were loads of tests that called "rebuild_db()" in
>> their teardown methods (which just called "clean_db" and then
>> "init_db").  More explicit (and standalone) for them to call init_db
>> in their setup and clean_db in their teardown.  Less faff when trying
>> to debug problems.
>
> So you replaced about 60 rebuild_db calls with clean_db and mandated that
> every single test starts with init_db.

Yes, the former part of the statement is correct.  The latter isn't
quite correct (though admittedly I just carelessly said something to
that effect in the previous email :/ ).  I would say I want to mandate
that "every single test *that needs the data set up by init_db* must
start with init_db".  I am sure I did put in too many init_dbs because
it was a pain trying to understand every one that actually needed it
and I figured it was harmless and could be pruned later.  But the
state before I did that was not harmless because a successful test
suite needed its tests to be run in a particular order.
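
Concretely, the pattern I am after is roughly this (a sketch only --
I am assuming the model.repo.init_db / model.repo.clean_db helpers
and the CreateTestData loader we already have; the test class itself
is invented):

    from ckan import model
    from ckan.tests import CreateTestData

    class TestPackageEditing(object):
        @classmethod
        def setup_class(cls):
            # This class needs VDM plus the default users, so it asks
            # for them explicitly rather than hoping an earlier test
            # left them behind.
            model.repo.init_db()
            # ...plus whatever specific fixtures it needs on top.
            CreateTestData.create()

        @classmethod
        def teardown_class(cls):
            # It changed the database, so it ends by emptying it --
            # no assumptions about what the next test wants.
            model.repo.clean_db()

        def test_package_exists(self):
            assert model.Package.by_name(u'annakarenina')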

I'll try explaining again -- we clearly still don't understand each
other (I wouldn't be surprised if I have all this absolutely wrong,
but I want to understand why, if so):

I found that many tests couldn't run independently while other tests
*did* pass.  The difference was usually that the tests which worked
were the ones preceded in the test run by tests that ended with
clean_db *and* init_db.

One perfectly reasonable interpretation of this situation is that
*all* tests should be able to expect a particular database state to
run, and that this state must be reset after each test that changes
the state. I believe this is your interpretation (?)

However, the interpretation I made was that each test should be
responsible for setting up the state that it needs and shouldn't
assume *anything* about the database other than its existence.

If you take that approach, it follows that a tearDown that does
setUp stuff (i.e. rebuild_db, which calls init_db) is doing
unnecessary setting up.  Worse (seen from the point of view that each
test should be responsible for its own environment, at least), it
encourages test writers to *expect* a particular database
configuration that they didn't have to set up themselves.  That is
then surprising for people who want to write tests (/pace/ your point
about documenting test assumptions, below).

I appreciate that this is only bad if you take a particular design
decision about the test suite overall.  If, instead, we want to say
test writers *should* expect a particular database configuration, then
it's OK to end your test by setting stuff up.

I maintain it's better to assume nothing at all, because it means
fewer opportunities for bugs in your test code.  Instead of having to
tear down individual fixtures one-by-one in the code, you can just end
your test by clearing the database out.  There is also a cognitive
dissonance to a test ending by setting stuff up, but that hardly
matters much, I suppose.
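
To put the same point in code, the difference in teardown style is
roughly this (schematic, not lifted from the suite; the package name
and classes are invented, and the one-by-one cleanup is simplified):

    from ckan import model

    class TestOldStyle(object):
        # Before: undo each fixture by hand, then rebuild_db
        # re-creates state for whichever test happens to run next.
        @classmethod
        def teardown_class(cls):
            pkg = model.Package.by_name(u'test_pkg')
            if pkg:
                model.Session.delete(pkg)
                model.Session.commit()
            model.repo.rebuild_db()

    class TestNewStyle(object):
        # After: assume nothing about what the next test wants;
        # just leave the database empty.
        @classmethod
        def teardown_class(cls):
            model.repo.clean_db()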

> There is no help to debugging by
> making sure the tables are dropped in addition to deleting the data.

We delete from the tables rather than drop them because it was
quicker; hence the name of the flag, "faster_db_test_hacks".  I agree
that there is no particular advantage to either approach, though
dropping them also lets us test migrations etc.  But I don't much
care whether we drop or delete the data, as long as we start with no
data.
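
In other words, the flag picks between two strategies that look
roughly like this in SQLAlchemy terms (a sketch of the idea, not the
actual Repository code):

    def clean_db(engine, metadata, faster_db_test_hacks=True):
        if faster_db_test_hacks:
            # 'Clean': keep the schema and just empty every table.
            # Children are deleted before parents so foreign keys
            # don't complain.
            for table in reversed(metadata.sorted_tables):
                engine.execute(table.delete())
        else:
            # 'Rebuild': drop the schema entirely; the next init_db
            # (or a migrate run) has to create the tables again.
            metadata.drop_all(bind=engine)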

Ahhh.  I *think* I see a source of the confusion.

My intent was always to fix the test dependencies so that the tests
are completely independent and assume no state (not that the way I
had done it was a good way).  David and I were pinging things
backwards and forwards to speed up the database.  In the case of
Postgres it turns out deletes are way faster than drops.

I wasn't really paying attention to that as I'm not a postgres or
indeed SQL expert.  Maybe the error lies there somewhere -- the
postgres version isn't cleanly resetting the database on a clean_db
operation?
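
A quick way to check that would be something like the following
throwaway (assuming model.meta exposes the engine and metadata -- I
haven't checked exactly how that is wired up):

    from ckan import model

    model.repo.clean_db()
    leftovers = {}
    for table in model.meta.metadata.sorted_tables:
        count = model.meta.engine.execute(table.count()).scalar()
        if count:
            leftovers[table.name] = count
    # Anything left in here is data that survived clean_db.
    print leftovers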

> I
> maintain that replacing rebuild_db added faff and for the sake of ease of
> writing tests and reducing code to maintain we should revert this.

Hopefully I've explained a bit better why I think it does help: I
think a test should assume nothing about the database other than its
existence.

rebuild_db encoded within it an assumption that a test should end by
*adding data to the database*.

I think it makes tests easier to assume we can end just by emptying
the database.  Then we can lose a lot of code to maintain which
previously, carefully and one by one, deleted the packages it added
during the tests.

The only faff added by my approach is that you have to take
responsibility for setting up the database correctly for your tests at
the start, and all that means is calling "init_db" if you want VDM
and a couple of default users set up for you.

> However I think this shows that someone new to our tests needs to be able to
> find our test conventions and intentions documented, so that when there are
> issues (and it is true there are probably some non-independent tests
> lurking) then we can all pull in the same direction to solve them.

I agree with that!

Still, I would prefer that tests that need VDM and a sample User
call init_db() explicitly, and that tests that alter the database
state call clean_db() explicitly.  That's all my intention was with
all this stuff; I don't care whether it's implemented with table
drops or database drops or whatever.

Thanks,

Seb

>>  https://bitbucket.org/okfn/ckan/changeset/e60cf442ad5b#chg-ckan/tests/functional/test_authorization_group.py_oldline27
>>
>> For me, a principle of good unit testing is that it should be possible
>> to run each test independently of the others.  The previous setup
>> required you to run all the tests at once (or a hard-to-discover
>> subset -- some tests ran stand-alone, others didn't).
>>
>> So, instead of setting up one test fixture for all the tests, and then
>> requiring each test to reset the data (sometimes by deleting
>> individual rows, sometimes by recreating the entire fixture set), it
>> felt much more explicit to move to the pattern of every test being
>> responsible for setting up the data it needs.  (Ideally this would
>> mean we'd actually factor out many of the bits of init_db so that
>> individual tests wouldn't have to load the whole lot)
>>
>> That's the thinking!
>>
>> >>> Note for everyone: Seb, James and I have agreed to set in test.ini to
>> >>> use
>> >>> sqlite.
>> >>
>> >> I disagree with this.  You should specify what database you want in your
>> >> own
>> >> development.ini.  The template development.ini could have sqlite as the
>> >> default instead.  This is the only sane way to do it in my opinion.
>> >
>> > Yes this might be better. Anyone else?
>>
>> Ideally I'd go for neither, and just expect people to read the
>> documentation, and choose to run tests against the DB of their choice
>> as they desire.  Really, it's not sane either to develop against a
>> database we never use in production; sqlite was always a hack to
>> encourage people to run the tests *at all*.
>>
>> However, if we are optimising for
>> making-sure-at-least-a-subset-of-the-tests-are-being-run-at-all,
>> putting it in test.ini is preferable (if not entirely "sane";),
>> because people never read the documentation, and getting the tests to
>> run fast in postgres does require people to read the documentation.
>>
>>
>> >>> We think it is a good idea to default tests to being run quickly.
>> >>> However, I can't see a good way to then run tests under postgres - if
>> >>> you
>> >>> comment out the sqlite in test.ini (as the buildbot currently does),
>> >>> 9/10
>> >>> times you end up checking it in by mistake. Maybe someone has a better
>> >>> idea,
>> >>> since passing parameters to nose is a no no.
>> >>
>> >> Why is this a no no?
>> >
>> > From what I read, nose doesn't believe in passing parameters onto tests
>> > -
>> > they should be stateless. You can fudge it by hacking argv, but then you
>> > lose flexibility of the many useful nose command-line params.
>>
>> Have a separate buildbot.ini?
>>
>> Seb



-- 
skype: seb.bacon
mobile: 07790 939224
land: 0207 183 9618
web: http://baconconsulting.co.uk



