[ckan-dev] Large and slow CKAN instance

Milica Knezevic knezevic.milica at gmail.com
Mon Oct 27 15:43:19 UTC 2014


Hi all,

I have a CKAN instance (2.2) with ~90,000 datasets. Most of these datasets
(~80,000) belong to one organization. With an increase of number of
datasets, everything becomes utterly slow. What happens is that postgres
becomes very busy for a long period of time. I set up db logging (queries
that takes more than 2 seconds to execute are logged) and it seems that
this query is executed each time I try to access, well, anything, even the
About page. It takes more than 7 seconds to execute and problematic are
external sorts and merge. I don't think I should increase Postgres work_mem
that much... Here's the logged query and the result of explain analyze. If
anyone has any suggestion it would be great.

 duration: 7731.616 ms  statement: SELECT anon_1.anon_2_activity_id AS
anon_1_anon_2_activity_id, anon_1.anon_2_activity_timestamp AS
anon_1_anon_2_activity_timestamp, anon_1.anon_2_activity_user_id AS
anon_1_anon_2_activity_user_id, anon_1.anon_2_activity_object_id AS
anon_1_anon_2_activity_object_id, anon_1.anon_2_activity_revision_id AS
anon_1_anon_2_activity_revision_id, anon_1.anon_2_activity_activity_type AS
anon_1_anon_2_activity_activity_type, anon_1.anon_2_activity_data AS
anon_1_anon_2_activity_data
        FROM (SELECT anon_2.activity_id AS anon_2_activity_id,
anon_2.activity_timestamp AS anon_2_activity_timestamp,
anon_2.activity_user_id AS anon_2_activity_user_id,
anon_2.activity_object_id AS anon_2_activity_object_id,
anon_2.activity_revision_id AS anon_2_activity_revision_id,
anon_2.activity_activity_type AS anon_2_activity_activity_type,
anon_2.activity_data AS anon_2_activity_data
        FROM (SELECT activity.id AS activity_id, activity.timestamp AS
activity_timestamp, activity.user_id AS activity_user_id,
activity.object_id AS activity_object_id, activity.revision_id AS
activity_revision_id, activity.activity_type AS activity_activity_type,
activity.data AS activity_data
        FROM activity
        WHERE activity.user_id = '8842c6bf-52be-450f-a615-ba4e5091925b'
UNION SELECT activity.id AS activity_id, activity.timestamp AS
activity_timestamp, activity.user_id AS activity_user_id,
activity.object_id AS activity_object_id, activity.revision_id AS
activity_revision_id, activity.activity_type AS activity_activity_type,
activity.data AS activity_data
        FROM activity
        WHERE activity.object_id = '8842c6bf-52be-450f-a615-ba4e5091925b')
AS anon_2 UNION SELECT anon_3.anon_4_activity_id AS
anon_3_anon_4_activity_id, anon_3.anon_4_activity_timestamp AS
anon_3_anon_4_activity_timestamp, anon_3.anon_4_activity_user_id AS
anon_3_anon_4_activity_user_id, anon_3.anon_4_activity_object_id AS
anon_3_anon_4_activity_object_id, anon_3.anon_4_activity_revision_id AS
anon_3_anon_4_activity_revision_id, anon_3.anon_4_activity_activity_type AS
anon_3_anon_4_activity_activity_type, anon_3.anon_4_activity_data AS
anon_3_anon_4_activity_data
        FROM (SELECT anon_4.activity_id AS anon_4_activity_id,
anon_4.activity_timestamp AS anon_4_activity_timestamp,
anon_4.activity_user_id AS anon_4_activity_user_id,
anon_4.activity_object_id AS anon_4_activity_object_id,
anon_4.activity_revision_id AS anon_4_activity_revision_id,
anon_4.activity_activity_type AS anon_4_activity_activity_type,
anon_4.activity_data AS anon_4_activity_data
        FROM (SELECT activity.id AS activity_id, activity.timestamp AS
activity_timestamp, activity.user_id AS activity_user_id,
activity.object_id AS activity_object_id, activity.revision_id AS
activity_revision_id, activity.activity_type AS activity_activity_type,
activity.data AS activity_data
        FROM activity
        WHERE 0=1 UNION SELECT activity.id AS activity_id,
activity.timestamp AS activity_timestamp, activity.user_id AS
activity_user_id, activity.object_id AS activity_object_id,
activity.revision_id AS activity_revision_id, activity.activity_type AS
activity_activity_type, activity.data AS activity_data
        FROM activity
        WHERE 0=1) AS anon_4 UNION SELECT activity.id AS activity_id,
activity.timestamp AS activity_timestamp, activity.user_id AS
activity_user_id, activity.object_id AS activity_object_id,
activity.revision_id AS activity_revision_id, activity.activity_type AS
activity_activity_type, activity.data AS activity_data
        FROM activity
        WHERE 0=1) AS anon_3) AS anon_1 ORDER BY
anon_1.anon_2_activity_timestamp DESC
         LIMIT 31


QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=451353.67..451353.74 rows=31 width=200) (actual
time=8247.617..8247.631 rows=31 loops=1)
   ->  Sort  (cost=451353.67..451836.39 rows=193089 width=200) (actual
time=8247.615..8247.623 rows=31 loops=1)
         Sort Key: public.activity."timestamp"
         Sort Method: top-N heapsort  Memory: 56kB
         ->  Unique  (cost=439812.55..443674.33 rows=193089 width=200)
(actual time=7747.816..8173.816 rows=205989 loops=1)
               ->  Sort  (cost=439812.55..440295.27 rows=193089 width=200)
(actual time=7747.814..8022.633 rows=205989 loops=1)
                     Sort Key: public.activity.id,
public.activity."timestamp", public.activity.user_id,
public.activity.object_id, public.activity.revision_id,
public.activity.activity_type, public.activity.data
                     Sort Method: external sort  Disk: 161680kB
                     ->  Append  (cost=257882.42..385900.39 rows=193089
width=200) (actual time=4257.847..5240.706 rows=205989 loops=1)
                           ->  Unique  (cost=257882.42..261744.14
rows=193086 width=770) (actual time=4257.847..5186.553 rows=205989 loops=1)
                                 ->  Sort  (cost=257882.42..258365.13
rows=193086 width=770) (actual time=4257.843..5005.656 rows=205993 loops=1)
                                       Sort Key: public.activity.id,
public.activity."timestamp", public.activity.user_id,
public.activity.object_id, public.activity.revision_id,
public.activity.activity_type, public.activity.data
                                       Sort Method: external merge  Disk:
161688kB
                                       ->  Append  (cost=0.00..42932.04
rows=193086 width=770) (actual time=0.032..507.962 rows=205993 loops=1)
                                             ->  Seq Scan on activity
(cost=0.00..40984.11 rows=193082 width=770) (actual time=0.031..451.479
rows=205989 loops=1)
                                                   Filter: (user_id =
'8842c6bf-52be-450f-a615-ba4e5091925b'::text)
                                             ->  Index Scan using
idx_activity_object_id on activity  (cost=0.00..17.07 rows=4 width=770)
(actual time=0.109..0.144 rows=4 loops=1)
                                                   Index Cond: (object_id =
'8842c6bf-52be-450f-a615-ba4e5091925b'::text)
                           ->  HashAggregate  (cost=120294.44..120294.47
rows=3 width=390) (actual time=0.056..0.056 rows=0 loops=1)
                                 ->  Append  (cost=80196.21..120294.39
rows=3 width=390) (actual time=0.053..0.053 rows=0 loops=1)
                                       ->  Unique  (cost=80196.21..80196.25
rows=2 width=770) (actual time=0.051..0.051 rows=0 loops=1)
                                             ->  Sort
(cost=80196.21..80196.21 rows=2 width=770) (actual time=0.049..0.049 rows=0
loops=1)
                                                   Sort Key:
public.activity.id, public.activity."timestamp", public.activity.user_id,
public.activity.object_id, public.activity.revision_id,
public.activity.activity_type, public.activity.data
                                                   Sort Method: quicksort
Memory: 25kB
                                                   ->  Append
(cost=0.00..80196.20 rows=2 width=770) (actual time=0.005..0.005 rows=0
loops=1)
                                                         ->  Result
(cost=0.00..40098.09 rows=1 width=770) (actual time=0.002..0.002 rows=0
loops=1)
                                                               One-Time
Filter: false
                                                               ->  Seq Scan
on activity  (cost=0.00..40098.09 rows=1 width=770) (never executed)
                                                         ->  Result
(cost=0.00..40098.09 rows=1 width=770) (actual time=0.000..0.000 rows=0
loops=1)
                                                               One-Time
Filter: false
                                                               ->  Seq Scan
on activity  (cost=0.00..40098.09 rows=1 width=770) (never executed)
                                       ->  Result  (cost=0.00..40098.09
rows=1 width=770) (actual time=0.001..0.001 rows=0 loops=1)
                                             One-Time Filter: false
                                             ->  Seq Scan on activity
(cost=0.00..40098.09 rows=1 width=770) (never executed)
 Total runtime: 8383.038 ms
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.okfn.org/pipermail/ckan-dev/attachments/20141027/3fa16d96/attachment-0002.html>


More information about the ckan-dev mailing list