[ckan-dev] Only specific field in a search result.

Alex (Maxious) Sadleir maxious at gmail.com
Thu Jan 22 03:11:46 UTC 2015


Just so I don't lose it; this is a patch to make the ckan datastore
COUNT(*) approximate and only recalculated by ANALYZE; you could also
patch to do this yourself when the data is updated but it's done by
autovacuum anyway.
I got this from https://wiki.postgresql.org/wiki/Slow_Counting

diff --git a/ckanext/datastore/db.py b/ckanext/datastore/db.py
index bec642c..ac6175c 100644
--- a/ckanext/datastore/db.py
+++ b/ckanext/datastore/db.py
@@ -920,8 +920,8 @@ def search_data(context, data_dict):

     sort = _sort(context, data_dict, field_ids)

-    sql_string = u'''SELECT {select}, count(*) over() AS "_full_count" {rank}
-                    FROM "{resource}" {ts_query}
+    sql_string = u'''SELECT {select},c.reltuples  AS "_full_count" {rank}
+                    FROM "{resource}",(SELECT reltuples FROM pg_class
WHERE relname = '{resource}') as c {ts_query}
                     {where} {sort} LIMIT {limit} OFFSET {offset}'''.format(
         select=select_columns,
         rank=rank_column,

On Thu, Jan 8, 2015 at 8:32 PM, Alice Heaton <a.heaton at nhm.ac.uk> wrote:
> Hi,
>
> Unfortunately not. CKAN datastore is quite slow on large datasets.
>
> Is this your own CKAN server? If so you can try a few things:
>
> - Make sure your datastore database is vacuumed;
> - Use PostgreSQL 9.3 (or even 9.4) which will give better performance;
> - Look into indexes if you are doing searches.
>
> But this only goes so far. Annoyingly the main thing that causes the CKAN datastore queries to be so slow is
> the COUNT(*) included in every query. This is down to the way PostgreSQL works, and has been improved
> in PostgreSQL 9.2 (I think - I can confirm it's better on 9.3!).
>
> Even with PostgreSQL 9.3 though the COUNT(*) has a significant costs. This is annoying because in most cases a
> COUNT(*) is actually not needed. At most it is needed once - not for every page requests, and in most cases
> you don't need it at all, you just loop requesting pages until there's no more data.
>
> We have a patch somewhere that allows CKAN plugin to remove the COUNT(*) - we use this for internal queries
> that use the datastore API, and it makes a significant performance improvement. I can send you the
> patch (we'll submit it to CKAN when I get a minute!). With the patch you will need to write your own plugin implementing
> the IDatastore interface to decide when a COUNT(*) should or shouldn't be included.
>
> Best Wishes,
> Alice
>
>
> ________________________________________
> From: ckan-dev [ckan-dev-bounces at lists.okfn.org] on behalf of BLANC Antoine [blanc at kdl.co.jp]
> Sent: 08 January 2015 08:27
> To: ckan-dev at lists.okfn.org
> Subject: [ckan-dev] Only specific field in a search result.
>
> Hello everyone,
>
> I'm developing an web application that use the CKAN API. I make some
> search through the API to get datasets corresponding to my query. And it
> works perfectly.
> But I need only a few fields from each dataset.
> So I filter the result in a kind of middleware to get only the fields
> that  I need to send to my JavaScript part.
> This CKAN is very large (1.8 million datasets), the result can be very
> slow. So I am trying to remove some treatments and reduce the network
> traffic.
>
> Is it possible get only those fields directly from the API ?
>
> And of course, my web application isn't the only application that use
> this CKAN, so I can't modify deeply its configuration
>
> Thanks in advance for your help
>
> Best regards
>
> Antoine.
>
> --
> ____________________________________________
> アントワーヌ ブラン [Antoine BLANC]
> 株式会社 神戸デジタル・ラボ
> 〒650-0033 神戸市中央区江戸町93番 栄光ビル5F
> TEL:078-327-2280
> FAX:078-327-2278
> @ : blanc at kdl.co.jp
> ____________________________________________
>
> _______________________________________________
> ckan-dev mailing list
> ckan-dev at lists.okfn.org
> https://lists.okfn.org/mailman/listinfo/ckan-dev
> Unsubscribe: https://lists.okfn.org/mailman/options/ckan-dev
> _______________________________________________
> ckan-dev mailing list
> ckan-dev at lists.okfn.org
> https://lists.okfn.org/mailman/listinfo/ckan-dev
> Unsubscribe: https://lists.okfn.org/mailman/options/ckan-dev



More information about the ckan-dev mailing list