[ckan-dev] test performance and sqlalchemy complete

Seb Bacon seb.bacon at gmail.com
Tue Jan 11 13:51:17 UTC 2011


On 11 January 2011 13:43, David Read <david.read at okfn.org> wrote:
> I've now merged Seb and David's code into CKAN core, so we all now can
> use SQLAlchemy 0.6, SQLite and ultra fast testing. My tests run now in
> under 3 minutes - super work chaps :-)

Great!

> (I used David's test init improvements, and removed the monkey patch
> for SQLite accordingly.)

Not that it matters, but FYI (for next time there's a pull) I had
merged them to my fork already.

> To enable the quick tests, edit your development.ini to use:
>    sqlalchemy.url = sqlite:///

Everyone note that this *isn't a replacement* for running the tests
against postgres, because (a) we deploy on postgres, and (b) the
sqlite test suite doesn't cover sql-based search.  The intended
workflow is to use the sqlite version during a development cycle, but
postgres before a push.  You may find the postgres tests pass fast
enough for your develop cycle tastes on their own (if you turn off
durability, that is -- see "Test" section in the README.txt)

> If any issues crop up then please do say. Cheers,

Is the next step to deploy to ckan.org and see how it goes?

Thanks

Seb


> On 7 January 2011 12:14, David Raznick <kindly at gmail.com> wrote:
>>
>>
>> On Fri, Jan 7, 2011 at 10:42 AM, Seb Bacon <seb.bacon at gmail.com> wrote:
>>>
>>> Great.
>>>
>>> No time today to look into it further, but here's some questions
>>> (patches applied in any case):
>>>
>>>  - As I understand it, by turning autoflush off, the UOW code gets the
>>> chance to collapse the two deletes into one? Is that how the patch
>>> works?
>>
>> Yep.  Within a flush, sqlalchemy makes sure every object is unique, so only
>> doing one thing for each object. A delete may happen that you do not know
>> about as its cascaded which only adds the the confusion.
>>
>>>
>>>  - So sqlite exposed rather than caused the error,
>>>
>>> because these
>>> should get collapsed even between flushes?
>>
>> Yes.
>>
>>>  Does this mean there's
>>> actually a bug in the postgres sqlalchemy dialect?
>>
>> Well they obviously think the dialect is not 'sane' as monkeypatch flag was
>> supports_sane_multi_row_count.  It looks like all the posgres dialects do
>> not support it.  So its a postgres thing.  They do support sane_row_count.
>> The difference is that multi_row_count is on dbapi execute commands and
>> sane_multi_row_count is for executemany commands.  executemany is faster as
>> it batches the commands.  Executemany is only used on deletes at the moment,
>> however in sqlalchemy 0.7 it will be used on inserts too.
>>
>>>
>>>  - Sqlalchemy's autoflush behaviour increasingly seems evil.  Could we
>>> consider turning it off by default?
>>
>>
>> This would make me happy as they are a pain to debug and slower too. There
>> are about 40 times in ckan (found using grep) where it is explicitly
>> switched off as it was causing problems.  However its not as trivial as I
>> hoped, there is fair bit of code relying on it.  If I get the time I will
>> have a go.
>>>
>>> Re. the postgres drop table thing: I thought I'd already done that
>>> myself (I'd done something like it, backed out the changes to compare
>>> speed, and forgot to reinstate them) so thanks for spotting that :)
>>> The postgres tests now run in 10 minutes for me.
>>>
>>> Out of interest, I also did a proper like-for-like test comparing
>>> postgres and sqlite against the same tests (e.g. skipping all the
>>> search-related tests for both).  Sqlite is still about 20% faster,
>>> though this difference stops being so significant when test times are
>>> well under 10 minutes.
>>>
>>> Thanks,
>>>
>>> Seb
>>>
>>> On 7 January 2011 01:14, David Raznick <kindly at gmail.com> wrote:
>>> > I have looked at the error and now think the monkeypatch is not the
>>> > right
>>> > choice as the errors are correct.
>>> >
>>> > The errors are due to us deleting the same row in the database twice.
>>> > i.e
>>> > doing  "delete from package where package.id = 1"  twice.  We should not
>>> > be
>>> > doing this.  In postgres, this does not cause an error because it does
>>> > not
>>> > report accurately how many rows it deleted to sqlalchemy. So sqlalchemy
>>> > has
>>> > no way of knowing if a delete actually did not work (i.e it already
>>> > happened).  Sqlite does report this accurately and it shows that the
>>> > tests
>>> > and code are doing a lot of deleting the same row twice. The monkeypatch
>>> > just turns this checking off.
>>> >
>>> > Anyway in ticket 868 I have attached patches for the vdm and ckan so the
>>> > tests pass without the monkeypatch.  The issue were mainly due to
>>> > autoflushes deleting from a cascade and then later we did the delete
>>> > again.
>>> > This patch has the benefit of working on sqlalchemy 0.5, it also means
>>> > we do
>>> > less flushing/work so things are (very slightly) faster.
>>> >
>>> > Also....
>>> >
>>> > I have attached another new patch to 868  that brings the postgres times
>>> > down from 14 mins to 9 mins on my laptop that works off Sebs branch.
>>> >
>>> > David
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> > On Thu, Jan 6, 2011 at 5:17 PM, James Gardner <james at 3aims.com> wrote:
>>> >>
>>> >> I'm personally fine with this particular monkey patch and in this
>>> >> specific case think it is less confusing than installing a forked
>>> >> dependency.
>>> >>
>>> >> James
>>> >>
>>> >>
>>> >> On Thu, 2011-01-06 at 16:56 +0000, Seb Bacon wrote:
>>> >> > On 6 January 2011 16:12, William Waites <ww at eris.okfn.org> wrote:
>>> >> > > * [2011-01-06 14:28:03 +0000] Seb Bacon <seb.bacon at gmail.com>
>>> >> > > écrit:
>>> >> > >
>>> >> > > ] I've commited a monkeypatch now.
>>> >> > >
>>> >> > > please roll that back and just put a patched sqlalchemy up and
>>> >> > > refer
>>> >> > > to it in the pip requirements.
>>> >> > >
>>> >> > > third party apps monkey patching libraries wreak havoc with testing
>>> >> > > local changes, etc, to those libraries.
>>> >> > >
>>> >> > > please do not do this.
>>> >> >
>>> >> > While I agree with the general point about monkeypatches:
>>> >> >
>>> >> > (a) this patch, which only involves changing a flag from True to
>>> >> > False, is only applied when running tests against a sqlite backend.
>>> >> >  I
>>> >> > can't think of a scenario whereby this would wreak any havoc?
>>> >> >
>>> >> > (b) putting up a patched tarball of sqlalchemy isn't painless either.
>>> >> > It creates a bit more cognitive-maintenance overhead for when we do
>>> >> > sqlalchemy updates (e.g. how will whoever does an update in a year
>>> >> > know why we have our own version? should we maintain the patch next
>>> >> > to
>>> >> > it somewhere for that? etc)
>>> >> >
>>> >> > The point in (b) isn't a big deal, but I maintain neither is the
>>> >> > potential issue with a monkeypatch, *in this specific case*.  Plus, I
>>> >> > haven't got time to think about it again until next week ;-)
>>> >> >
>>> >> > Have a look at the patch
>>> >> > (https://bitbucket.org/sebbacon/ckan/changeset/ad9f51642166), and if
>>> >> > you're still convinced it might screw things up I'll happily change
>>> >> > it
>>> >> > next week.
>>> >> >
>>> >> > Thanks,
>>> >> >
>>> >> > Seb
>>> >> >
>>> >> > _______________________________________________
>>> >> > ckan-dev mailing list
>>> >> > ckan-dev at lists.okfn.org
>>> >> > http://lists.okfn.org/mailman/listinfo/ckan-dev
>>> >>
>>> >>
>>> >>
>>> >> _______________________________________________
>>> >> ckan-dev mailing list
>>> >> ckan-dev at lists.okfn.org
>>> >> http://lists.okfn.org/mailman/listinfo/ckan-dev
>>> >
>>> >
>>> > _______________________________________________
>>> > ckan-dev mailing list
>>> > ckan-dev at lists.okfn.org
>>> > http://lists.okfn.org/mailman/listinfo/ckan-dev
>>> >
>>> >
>>>
>>>
>>>
>>> --
>>> skype: seb.bacon
>>> mobile: 07790 939224
>>> land: 0207 183 9618
>>> web: http://baconconsulting.co.uk
>>>
>>> _______________________________________________
>>> ckan-dev mailing list
>>> ckan-dev at lists.okfn.org
>>> http://lists.okfn.org/mailman/listinfo/ckan-dev
>>
>>
>> _______________________________________________
>> ckan-dev mailing list
>> ckan-dev at lists.okfn.org
>> http://lists.okfn.org/mailman/listinfo/ckan-dev
>>
>>
>



-- 
skype: seb.bacon
mobile: 07790 939224
land: 0207 183 9618
web: http://baconconsulting.co.uk




More information about the ckan-dev mailing list