[ckan-dev] Spatial search limits

David Read david.read at hackneyworkshop.com
Wed Apr 18 14:39:08 UTC 2012


On 18 April 2012 15:18, David Read <david.read at hackneyworkshop.com> wrote:
> Just to update on this: the limit we're hitting is exactly 1024
> dataset results. The SOLR logs provide the full error:
>
> too many boolean clauses
> Caused by: org.apache.lucene.search.BooleanQuery$TooManyClauses:
> maxClauseCount is set to 1024
>
> So I've changed the value of maxBooleanClauses in solrconfig.xml from
> 1024 to a higher number. But having restarted Jetty, the error remains
> and still reports 1024, so I'm trying to find out why.

I've solved this now. It appears that this setting cannot differ
between SOLR cores, since it is a global setting of the underlying
(shared) Lucene. The value in the first core's solrconfig.xml seems to
be the one that is used.
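
A minimal sketch for spotting this situation, assuming each core's
solrconfig.xml keeps the value at the usual config/query/maxBooleanClauses
path. The helper names and the inline sample configs are hypothetical, and
which core SOLR loads first is something to confirm for your own setup:

```python
import xml.etree.ElementTree as ET


def max_boolean_clauses(solrconfig_xml):
    """Return maxBooleanClauses from a solrconfig.xml string, or None if absent."""
    root = ET.fromstring(solrconfig_xml)
    node = root.find("./query/maxBooleanClauses")
    if node is None or node.text is None:
        return None
    return int(node.text.strip())


def find_mismatches(configs):
    """Given (core_name, xml) pairs in load order, list the cores whose
    value differs from the first core's value."""
    values = [(name, max_boolean_clauses(xml)) for name, xml in configs]
    if not values:
        return []
    first_value = values[0][1]
    return [(name, value) for name, value in values if value != first_value]


# Hypothetical sample configs, trimmed to the one relevant element.
CORE_A = "<config><query><maxBooleanClauses>1024</maxBooleanClauses></query></config>"
CORE_B = "<config><query><maxBooleanClauses>5000</maxBooleanClauses></query></config>"

print(find_mismatches([("core-a", CORE_A), ("core-b", CORE_B)]))
```

Any core reported here has a value that, going by the behaviour described
above, would probably be ignored in favour of the first core's.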

> Maybe I'll hit the 64kb Jetty request limit once this is solved...

We only have 1100 datasets, but reducing the Jetty headerBufferSize
back to the default of 4k doesn't cause any problems. So
headerBufferSize must be unrelated to the request size limit.
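
A rough back-of-envelope sketch of the sizes involved, assuming each
clause looks like `id:<uuid> OR ` as in the query in the original report
below, and ignoring URL-encoding and the other query parameters (so the
real numbers will differ somewhat):

```python
UUID_LEN = 36  # e.g. ff4cc143-c00a-46b7-81bb-5095372847b6
CLAUSE_LEN = len("id:") + UUID_LEN + len(" OR ")  # 43 characters per id


def max_ids(limit_bytes):
    """Approximate number of id clauses that fit in limit_bytes."""
    return limit_bytes // CLAUSE_LEN


def query_bytes(n_ids):
    """Approximate size of the q parameter for n_ids ids
    (the last clause has no trailing ' OR ')."""
    if n_ids <= 0:
        return 0
    return n_ids * CLAUSE_LEN - len(" OR ")


print(max_ids(64 * 1024))   # 1524
print(max_ids(4 * 1024))    # 95
print(query_bytes(1100))    # 47296
```

Under these assumptions a 64k limit allows roughly 1500 ids, matching the
earlier estimate, while a query for 1100 ids would be far larger than 4k.
That is consistent with headerBufferSize not being the limit in play here.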

David

>
> Dave
>
>
> On 17 April 2012 21:05, David Read <david.read at hackneyworkshop.com> wrote:
>> I've hit a limit with the spatial search provided in ckanext-spatial.
>> If you do a search over a geographical area that has too many results,
>> you get a SOLR exception:
>>
>> Dataset search error: ('SOLR returned an error running
>> query: {\'sort\': \'score desc, name asc\', \'fq\': \'
>> +site_id:"dgu-shazam" +state:active\', \'facet.mincount\': 1,
>> \'rows\': 11, \'facet.field\': [\'groups\', \'tags\',
>> \'res_format\', \'license\', \'resource-type\', \'UKLP\',
>> \'license_id-is-ogl\', \'publisher\'], \'wt\': \'json\',
>> \'facet.limit\': \'50\', \'facet\': \'true\',
>> \'q\': u\'(id:ff4cc143-c00a-46b7-81bb-5095372847b6 OR
>> id:2e279200-0ec7-4f15-a2a5-cdc68e277944 OR
>> id:5a705e99-5425-4337-abbc-195516e32ae4 OR
>> id:0fe66473-8eca-4114-9642-20fb8cc11391 OR ...snip...
>>
>> Jetty has a limit on the size of the request that is sent to SOLR. It
>> can be increased to 64k (http://drupal.org/node/443980), but by my
>> calculations that sets the limit at about 1500 results, which is still
>> too low for our site.
>>
>> Have you experienced this, and do you have any thoughts on how to overcome it?
>>
>> (BTW the relevant ckanext-spatial tests are currently broken due to
>> the new API schema check process. I strongly believe that CKAN should
>> only check that every parameter is provided when submitting by form -
>> it is way too inconvenient in the API and logic layer. This is a form
>> specific thing.)
>>
>> David



