[ckan-dev] [#2963] Timeout on tag pages with lots of datasets

Vitor Baptista vitor at vitorbaptista.com
Fri Jan 11 23:50:37 UTC 2013


Hi,

I was looking into this ticket today. Basically, if a tag has many datasets
(e.g. http://thedatahub.org/tag/lod), its page times out. Looking into the
code, I found a few N+1 query problems (log attached), but could only fix one
so far (
https://github.com/vitorbaptista/ckan/commit/424ac7703fb58202260e4a8cb7ee31cd01a3962d
).

The main source of these queries (~13 per package) is
model_dictize.package_dictize. I tried to refactor it a bit so I could
start optimizing, but ran into a few problems. The main one is that the
relationships (e.g. Tag.packages) are created manually, so it's not
possible to eager load them without changing every method. Is there any
specific reason it was done like this, instead of using SQLAlchemy's
relationships?
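
To make the N+1 pattern concrete, here is a minimal sketch. It does not use
CKAN's model or SQLAlchemy. It uses the standard-library sqlite3 module, and
the schema, QueryCounter, dictize_lazy and dictize_eager are all hypothetical
names invented for illustration. It only shows how per-package lookups add up
and how one batched query replaces them, which is roughly what eager loading
would do.

```python
import sqlite3


def make_db(n_packages=5):
    """Build a tiny in-memory database (hypothetical schema, not CKAN's)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE package (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE package_tag (package_id INTEGER, tag_id INTEGER);
        CREATE TABLE package_extra (package_id INTEGER, key TEXT, value TEXT);
    """)
    conn.execute("INSERT INTO tag VALUES (1, 'lod')")
    for i in range(1, n_packages + 1):
        conn.execute("INSERT INTO package VALUES (?, ?)", (i, "pkg-%d" % i))
        conn.execute("INSERT INTO package_tag VALUES (?, 1)", (i,))
        conn.execute(
            "INSERT INTO package_extra VALUES (?, 'license', 'odc')", (i,))
    return conn


class QueryCounter:
    """Thin wrapper that counts how many SELECTs are issued through it."""

    def __init__(self, conn):
        self.conn = conn
        self.count = 0

    def query(self, sql, params=()):
        self.count += 1
        return self.conn.execute(sql, params).fetchall()


PACKAGES_FOR_TAG = (
    "SELECT p.id, p.name FROM package p "
    "JOIN package_tag pt ON pt.package_id = p.id "
    "JOIN tag t ON t.id = pt.tag_id "
    "WHERE t.name = ? ORDER BY p.id"
)


def dictize_lazy(db, tag_name):
    """N+1 version: one query for the packages, then one more per package."""
    packages = db.query(PACKAGES_FOR_TAG, (tag_name,))
    result = []
    for pkg_id, name in packages:
        extras = db.query(
            "SELECT key, value FROM package_extra WHERE package_id = ?",
            (pkg_id,))
        result.append({"name": name, "extras": dict(extras)})
    return result


def dictize_eager(db, tag_name):
    """Batched version: fetch the extras of all packages in a single query."""
    packages = db.query(PACKAGES_FOR_TAG, (tag_name,))
    ids = [pkg_id for pkg_id, _ in packages]
    extras_by_pkg = {}
    if ids:
        # Fine for a sketch; very long IN lists can hit SQLite's variable limit.
        placeholders = ",".join("?" * len(ids))
        rows = db.query(
            "SELECT package_id, key, value FROM package_extra "
            "WHERE package_id IN (%s)" % placeholders, ids)
        for pkg_id, key, value in rows:
            extras_by_pkg.setdefault(pkg_id, {})[key] = value
    return [{"name": name, "extras": extras_by_pkg.get(pkg_id, {})}
            for pkg_id, name in packages]


conn = make_db(5)
lazy_db, eager_db = QueryCounter(conn), QueryCounter(conn)
same = dictize_lazy(lazy_db, "lod") == dictize_eager(eager_db, "lod")
print(same, lazy_db.count, eager_db.count)
```

With 5 packages, the lazy version issues 6 queries (1 + N) and the batched
version issues 2, for the same result. With ~13 related lookups per package
instead of one, the lazy count grows to roughly 1 + 13N, which is why a tag
with many datasets becomes slow.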

Thanks,
Vítor.

P.S.: Next time, should I comment here, on Trac, or on a GitHub issue?
-------------- next part --------------
A non-text attachment was scrubbed...
Name: loading_tags_datasets.log
Type: application/octet-stream
Size: 17626 bytes
Desc: not available
URL: <http://lists.okfn.org/pipermail/ckan-dev/attachments/20130111/009d973c/attachment-0002.obj>

