[ckan-dev] test performance and sqlalchemy complete

David Read david.read at okfn.org
Tue Jan 11 16:59:28 UTC 2011


On 11 January 2011 13:51, Seb Bacon <seb.bacon at gmail.com> wrote:
> On 11 January 2011 13:43, David Read <david.read at okfn.org> wrote:
>> I've now merged Seb and David's code into CKAN core, so we all now can
>> use SQLAlchemy 0.6, SQLite and ultra fast testing. My tests run now in
>> under 3 minutes - super work chaps :-)
>
> Great!
>
>> (I used David's test init improvements, and removed the monkey patch
>> for SQLite accordingly.)
>
> Not that it matters, but FYI (for next time there's a pull) I had
> merged them to my fork already.

Yes I got the first batch of David's vdm changesets from your
bitbucket, but the more recent changes to vdm and ckan I can't see in
your bitbucket - no changes from you there since the afternoon of the
6th, and David provided the patches that evening. Or maybe you didn't
push them?

>> To enable the quick tests, edit your development.ini to use:
>>    sqlalchemy.url = sqlite:///
>
> Everyone note that this *isn't a replacement* for running the tests
> against postgres, because (a) we deploy on postgres, and (b) the
> sqlite test suite doesn't cover sql-based search.  The intended
> workflow is to use the sqlite version during a development cycle, but
> postgres before a push.  You may find the postgres tests pass fast
> enough for your develop cycle tastes on their own (if you turn off
> durability, that is -- see "Test" section in the README.txt)

I assumed the test improvements had in mind Rufus' and James' concerns
about the problems we encountered at the OGConf spring with devs not
running all tests before pushing, causing gridlock. I think the chance
that a breakage that affects Postgres and not Sqlite is going to be
quite rare, so we might as well leave it to the buildbot to check for
these every few hours.

>> If any issues crop up then please do say. Cheers,
>
> Is the next step to deploy to ckan.org and see how it goes?

I've done this now. It looks good.

Dave

>
> Thanks
>
> Seb
>
>
>> On 7 January 2011 12:14, David Raznick <kindly at gmail.com> wrote:
>>>
>>>
>>> On Fri, Jan 7, 2011 at 10:42 AM, Seb Bacon <seb.bacon at gmail.com> wrote:
>>>>
>>>> Great.
>>>>
>>>> No time today to look into it further, but here's some questions
>>>> (patches applied in any case):
>>>>
>>>>  - As I understand it, by turning autoflush off, the UOW code gets the
>>>> chance to collapse the two deletes into one? Is that how the patch
>>>> works?
>>>
>>> Yep.  Within a flush, sqlalchemy makes sure every object is unique, so only
>>> doing one thing for each object. A delete may happen that you do not know
>>> about as its cascaded which only adds the the confusion.
>>>
>>>>
>>>>  - So sqlite exposed rather than caused the error,
>>>>
>>>> because these
>>>> should get collapsed even between flushes?
>>>
>>> Yes.
>>>
>>>>  Does this mean there's
>>>> actually a bug in the postgres sqlalchemy dialect?
>>>
>>> Well they obviously think the dialect is not 'sane' as monkeypatch flag was
>>> supports_sane_multi_row_count.  It looks like all the posgres dialects do
>>> not support it.  So its a postgres thing.  They do support sane_row_count.
>>> The difference is that multi_row_count is on dbapi execute commands and
>>> sane_multi_row_count is for executemany commands.  executemany is faster as
>>> it batches the commands.  Executemany is only used on deletes at the moment,
>>> however in sqlalchemy 0.7 it will be used on inserts too.
>>>
>>>>
>>>>  - Sqlalchemy's autoflush behaviour increasingly seems evil.  Could we
>>>> consider turning it off by default?
>>>
>>>
>>> This would make me happy as they are a pain to debug and slower too. There
>>> are about 40 times in ckan (found using grep) where it is explicitly
>>> switched off as it was causing problems.  However its not as trivial as I
>>> hoped, there is fair bit of code relying on it.  If I get the time I will
>>> have a go.
>>>>
>>>> Re. the postgres drop table thing: I thought I'd already done that
>>>> myself (I'd done something like it, backed out the changes to compare
>>>> speed, and forgot to reinstate them) so thanks for spotting that :)
>>>> The postgres tests now run in 10 minutes for me.
>>>>
>>>> Out of interest, I also did a proper like-for-like test comparing
>>>> postgres and sqlite against the same tests (e.g. skipping all the
>>>> search-related tests for both).  Sqlite is still about 20% faster,
>>>> though this difference stops being so significant when test times are
>>>> well under 10 minutes.
>>>>
>>>> Thanks,
>>>>
>>>> Seb
>>>>
>>>> On 7 January 2011 01:14, David Raznick <kindly at gmail.com> wrote:
>>>> > I have looked at the error and now think the monkeypatch is not the
>>>> > right
>>>> > choice as the errors are correct.
>>>> >
>>>> > The errors are due to us deleting the same row in the database twice.
>>>> > i.e
>>>> > doing  "delete from package where package.id = 1"  twice.  We should not
>>>> > be
>>>> > doing this.  In postgres, this does not cause an error because it does
>>>> > not
>>>> > report accurately how many rows it deleted to sqlalchemy. So sqlalchemy
>>>> > has
>>>> > no way of knowing if a delete actually did not work (i.e it already
>>>> > happened).  Sqlite does report this accurately and it shows that the
>>>> > tests
>>>> > and code are doing a lot of deleting the same row twice. The monkeypatch
>>>> > just turns this checking off.
>>>> >
>>>> > Anyway in ticket 868 I have attached patches for the vdm and ckan so the
>>>> > tests pass without the monkeypatch.  The issue were mainly due to
>>>> > autoflushes deleting from a cascade and then later we did the delete
>>>> > again.
>>>> > This patch has the benefit of working on sqlalchemy 0.5, it also means
>>>> > we do
>>>> > less flushing/work so things are (very slightly) faster.
>>>> >
>>>> > Also....
>>>> >
>>>> > I have attached another new patch to 868  that brings the postgres times
>>>> > down from 14 mins to 9 mins on my laptop that works off Sebs branch.
>>>> >
>>>> > David
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > On Thu, Jan 6, 2011 at 5:17 PM, James Gardner <james at 3aims.com> wrote:
>>>> >>
>>>> >> I'm personally fine with this particular monkey patch and in this
>>>> >> specific case think it is less confusing than installing a forked
>>>> >> dependency.
>>>> >>
>>>> >> James
>>>> >>
>>>> >>
>>>> >> On Thu, 2011-01-06 at 16:56 +0000, Seb Bacon wrote:
>>>> >> > On 6 January 2011 16:12, William Waites <ww at eris.okfn.org> wrote:
>>>> >> > > * [2011-01-06 14:28:03 +0000] Seb Bacon <seb.bacon at gmail.com>
>>>> >> > > écrit:
>>>> >> > >
>>>> >> > > ] I've commited a monkeypatch now.
>>>> >> > >
>>>> >> > > please roll that back and just put a patched sqlalchemy up and
>>>> >> > > refer
>>>> >> > > to it in the pip requirements.
>>>> >> > >
>>>> >> > > third party apps monkey patching libraries wreak havoc with testing
>>>> >> > > local changes, etc, to those libraries.
>>>> >> > >
>>>> >> > > please do not do this.
>>>> >> >
>>>> >> > While I agree with the general point about monkeypatches:
>>>> >> >
>>>> >> > (a) this patch, which only involves changing a flag from True to
>>>> >> > False, is only applied when running tests against a sqlite backend.
>>>> >> >  I
>>>> >> > can't think of a scenario whereby this would wreak any havoc?
>>>> >> >
>>>> >> > (b) putting up a patched tarball of sqlalchemy isn't painless either.
>>>> >> > It creates a bit more cognitive-maintenance overhead for when we do
>>>> >> > sqlalchemy updates (e.g. how will whoever does an update in a year
>>>> >> > know why we have our own version? should we maintain the patch next
>>>> >> > to
>>>> >> > it somewhere for that? etc)
>>>> >> >
>>>> >> > The point in (b) isn't a big deal, but I maintain neither is the
>>>> >> > potential issue with a monkeypatch, *in this specific case*.  Plus, I
>>>> >> > haven't got time to think about it again until next week ;-)
>>>> >> >
>>>> >> > Have a look at the patch
>>>> >> > (https://bitbucket.org/sebbacon/ckan/changeset/ad9f51642166), and if
>>>> >> > you're still convinced it might screw things up I'll happily change
>>>> >> > it
>>>> >> > next week.
>>>> >> >
>>>> >> > Thanks,
>>>> >> >
>>>> >> > Seb
>>>> >> >
>>>> >> > _______________________________________________
>>>> >> > ckan-dev mailing list
>>>> >> > ckan-dev at lists.okfn.org
>>>> >> > http://lists.okfn.org/mailman/listinfo/ckan-dev
>>>> >>
>>>> >>
>>>> >>
>>>> >> _______________________________________________
>>>> >> ckan-dev mailing list
>>>> >> ckan-dev at lists.okfn.org
>>>> >> http://lists.okfn.org/mailman/listinfo/ckan-dev
>>>> >
>>>> >
>>>> > _______________________________________________
>>>> > ckan-dev mailing list
>>>> > ckan-dev at lists.okfn.org
>>>> > http://lists.okfn.org/mailman/listinfo/ckan-dev
>>>> >
>>>> >
>>>>
>>>>
>>>>
>>>> --
>>>> skype: seb.bacon
>>>> mobile: 07790 939224
>>>> land: 0207 183 9618
>>>> web: http://baconconsulting.co.uk
>>>>
>>>> _______________________________________________
>>>> ckan-dev mailing list
>>>> ckan-dev at lists.okfn.org
>>>> http://lists.okfn.org/mailman/listinfo/ckan-dev
>>>
>>>
>>> _______________________________________________
>>> ckan-dev mailing list
>>> ckan-dev at lists.okfn.org
>>> http://lists.okfn.org/mailman/listinfo/ckan-dev
>>>
>>>
>>
>
>
>
> --
> skype: seb.bacon
> mobile: 07790 939224
> land: 0207 183 9618
> web: http://baconconsulting.co.uk
>




More information about the ckan-dev mailing list