[ckan-changes] commit/ckan: amercader: [search] Use a unique index_id for each indexed document (See #1430)

Bitbucket commits-noreply at bitbucket.org
Fri Nov 4 14:17:11 UTC 2011


1 new commit in ckan:


https://bitbucket.org/okfn/ckan/changeset/855f5a452f60/
changeset:   855f5a452f60
branch:      defect-1430-mixed-docs-in-search-index
user:        amercader
date:        2011-11-04 15:08:27
summary:     [search] Use a unique index_id for each indexed document (See #1430)
affected #:  3 files

diff -r 97e1e90d66d745f44f4ce4a902640efef659d695 -r 855f5a452f603ac1a4bd9890901f656213a10c08 ckan/config/schema.xml
--- a/ckan/config/schema.xml
+++ b/ckan/config/schema.xml
@@ -91,6 +91,7 @@
 
 
 <fields>
+    <field name="index_id" type="string" indexed="true" stored="true" required="true" />
     <field name="id" type="string" indexed="true" stored="true" required="true" />
     <field name="site_id" type="string" indexed="true" stored="true" required="true" />
     <field name="title" type="text" indexed="true" stored="true" />
@@ -138,7 +139,7 @@
     <dynamicField name="*" type="string" indexed="true"  stored="false"/>
 </fields>
 
-<uniqueKey>id</uniqueKey>
+<uniqueKey>index_id</uniqueKey>
 <defaultSearchField>text</defaultSearchField>
 <solrQueryParser defaultOperator="AND"/>
 


diff -r 97e1e90d66d745f44f4ce4a902640efef659d695 -r 855f5a452f603ac1a4bd9890901f656213a10c08 ckan/lib/search/index.py
--- a/ckan/lib/search/index.py
+++ b/ckan/lib/search/index.py
@@ -139,6 +139,10 @@
         # mark this CKAN instance as data source:
         pkg_dict['site_id'] = config.get('ckan.site_id')
         
+        # add a unique index_id to avoid conflicts
+        import hashlib
+        pkg_dict['index_id'] = hashlib.md5('%s%s' % (pkg_dict['id'],config.get('ckan.site_id'))).hexdigest()
+
         # send to solr:  
         try:
             conn.add_many([pkg_dict])


diff -r 97e1e90d66d745f44f4ce4a902640efef659d695 -r 855f5a452f603ac1a4bd9890901f656213a10c08 ckan/tests/lib/test_solr_search_index.py
--- a/ckan/tests/lib/test_solr_search_index.py
+++ b/ckan/tests/lib/test_solr_search_index.py
@@ -1,3 +1,4 @@
+import hashlib
 import socket
 import solr
 from pylons import config
@@ -47,6 +48,9 @@
     def teardown(self):
         # clear the search index after every test
         search.index_for('Package').clear()
+    
+    def _get_index_id(self,pkg_id):
+        return hashlib.md5('%s%s' % (pkg_id,config['ckan.site_id'])).hexdigest()
 
     def test_index(self):
         pkg_dict = {
@@ -57,6 +61,7 @@
         search.dispatch_by_operation('Package', pkg_dict, 'new')
         response = self.solr.query('title:penguin', fq=self.fq)
         assert len(response) == 1, len(response)
+        assert response.results[0]['index_id'] == self._get_index_id (pkg_dict['id'])
         assert response.results[0]['title'] == 'penguin'
 
     def test_no_state_not_indexed(self):
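
The changeset above derives index_id by hashing the package id together with ckan.site_id, so that two CKAN instances sharing one Solr core no longer overwrite each other's documents. A minimal standalone sketch of that idea follows. It is not code from the commit: the helper name make_index_id and the sample ids are hypothetical, and the string is encoded to bytes first on the assumption that it runs under Python 3 (the committed code passes a str directly, as Python 2 allowed).

```python
import hashlib


def make_index_id(pkg_id, site_id):
    """Hypothetical helper mirroring the commit's formula:
    md5 of the package id concatenated with the site id."""
    raw = '%s%s' % (pkg_id, site_id)
    # hashlib.md5 requires bytes on Python 3, so encode explicitly.
    return hashlib.md5(raw.encode('utf-8')).hexdigest()


# The same package id indexed from two different sites (sample values).
id_a = make_index_id('penguin-pkg', 'site-a')
id_b = make_index_id('penguin-pkg', 'site-b')

print(id_a != id_b)   # distinct keys, so neither document replaces the other
print(len(id_a))      # an md5 hex digest is always 32 characters
```

Because the two values are concatenated without a separator, the pairs ('ab', 'c') and ('a', 'bc') would hash to the same key; that is unlikely to matter if package ids are UUIDs, but it is worth keeping in mind.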

Repository URL: https://bitbucket.org/okfn/ckan/

--

This is a commit notification from bitbucket.org. You are receiving
this because you have the service enabled, addressing the recipient of
this email.



