[wdmmg-dev] Solr Schema changes / replace solrpy with sunburnt?

Carsten Senger senger at rehfisch.de
Sat Apr 2 13:45:46 UTC 2011


Schema changes
--------------

I've pushed the features-64-refactor-distinct-queries branch for
<http://trac.openspending.org/ticket/64>.

It required one change to the solr schema so far:
<https://bitbucket.org/okfn/wdmmg/changeset/8d0f659e6ed6>
The change indexes all additional fields as complete strings, not as split
(tokenized) text. You have to update the schema XML file in your solr
instance and reindex all datasets.

We should add release notes where we can document bigger changes, especially
backward-incompatible ones like this.

The content of the fields is also copied to the 'text' field, so it is still
available tokenized for full text search. Can this schema change cause any
problems?
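
To make the difference between the two field types concrete, here is a toy
model in plain Python. It is only a sketch: index_string and index_text are
made-up helpers, the sample value is invented, and lowercasing plus
whitespace splitting merely stands in for whatever analyzer the 'text' field
really uses. It is not how solr works internally.

```python
# Toy model of the two field types. All names here are hypothetical.

def index_string(value):
    # A string field keeps the whole value as one single term.
    return {value}

def index_text(value):
    # A text field is split into terms. Lowercasing and whitespace
    # splitting stand in for a real analyzer chain (an assumption).
    return set(value.lower().split())

value = "Department of Health"
string_terms = index_string(value)
text_terms = index_text(value)

print("health" in string_terms)  # False: only the complete value matches
print(value in string_terms)     # True
print("health" in text_terms)    # True: single words match in the copy
```

So exact-value queries and distinct values work on the string field, while
word searches have to go through the tokenized copy.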

I've some more requests to change the schema:

* wdmmg.model.Base.to_index_dict()
  <https://bitbucket.org/okfn/wdmmg/src/1b5c1339637e/wdmmg/model/mongo.py#cl-118>
  rewrites "._id" to ".id". Is that necessary?

  Much of this might be hidden from the user in .logic (or whatever it will
  be named in the end), but from time to time it might be necessary to
  write queries by hand, and such differences cause nasty little bugs.

  I'd be happy if we could pass dicts with unchanged keys to solr.

* is_aggregate is a boolean value in mongodb, but a string in solr
  (u'true'/u'false'). I'd like to change it to a boolean field in solr.

* The schema seems to define some fields that are not used (like fields
  ending with '_facet'). Can we clean up the schema a bit?
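
To illustrate the kind of key mismatch meant in the first point, here is a
minimal sketch. The functions flatten and rename_ids and the sample entry are
invented for illustration. They assume that nested documents are flattened
into dotted keys before indexing, and they are not the code in
wdmmg/model/mongo.py.

```python
# Hypothetical sketch of the "._id" -> ".id" rewrite. Not the wdmmg code.

def flatten(doc, prefix=""):
    """Flatten nested dicts into dotted keys: {"a": {"b": 1}} -> {"a.b": 1}."""
    flat = {}
    for key, value in doc.items():
        full_key = prefix + key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key + "."))
        else:
            flat[full_key] = value
    return flat

def rename_ids(flat):
    """Rewrite keys ending in '._id' so that they end in '.id' instead."""
    renamed = {}
    for key, value in flat.items():
        if key.endswith("._id"):
            key = key[:-len("._id")] + ".id"
        renamed[key] = value
    return renamed

# Invented sample entry, shaped like a mongodb document.
entry = {"_id": "e1", "dataset": {"_id": "d1", "name": "cra"},
         "is_aggregate": False}

unchanged = flatten(entry)      # keys as stored in mongodb: "dataset._id"
renamed = rename_ids(unchanged) # keys after the rewrite: "dataset.id"
```

A query written by hand against "dataset._id" would silently match nothing
if the index only knows "dataset.id". Passing the unchanged dict avoids that,
and a boolean like is_aggregate could stay a boolean on both sides.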


Use sunburnt
------------

I think we should use sunburnt instead of solrpy. I've added a comment about
that in the ticket: <http://trac.openspending.org/ticket/64>.

..Carsten



