[wdmmg-dev] Denormalization and model.mongo.Base.to_ref_dict()

Thu May 12 21:10:39 UTC 2011

Hi,

On Thu, May 12, 2011 at 7:22 PM, Carsten Senger <senger at rehfisch.de> wrote:
> I've written code to do the aggregations for the new APIs which saves
> information on the dataset document after the views were applied, and that
> broke a test where to_ref_dict() was used to query an entry. I fixed it
> here: <https://bitbucket.org/okfn/wdmmg/changeset/2547a562f441> and did not
> find it used that way in real code. This makes me wonder if we can reduce
> the amount of data we generate in to_ref_dict() to '_id', 'name', 'label'
> and 'ref' (maybe 'description' too). This reduces the number of cases where
> we would have to update the entries with new values if e.g. an entity
> changes, and it reduces the amount of data stored, indexed and
> serialized/deserialized. Looking through the code I found no place where we
> would need more information, but maybe someone knows more.
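
To make the proposal concrete, a reduced to_ref_dict() might look roughly
like the sketch below. It is only an illustration: the standalone function,
REF_KEYS and the include_description flag are made up here and are not the
actual model.mongo.Base code.

```python
# Hypothetical sketch, not the real model.mongo.Base.to_ref_dict().
# REF_KEYS and include_description are assumed names for illustration.
REF_KEYS = ('_id', 'name', 'label', 'ref')


def to_ref_dict(doc, include_description=False):
    """Return only the small set of keys that is embedded in entries."""
    keys = REF_KEYS
    if include_description:
        keys = keys + ('description',)
    # Keys missing from the source document are simply skipped.
    return dict((key, doc[key]) for key in keys if key in doc)


# Made-up entity document, for illustration only.
entity = {
    '_id': 'e1',
    'name': 'acme',
    'label': 'Acme Ltd',
    'description': 'An example entity',
    'opencorporates_uri': 'http://example.org/acme',
}
reduced = to_ref_dict(entity)
```

With this shape, a change to any field outside the listed keys no longer
requires touching the entries that embed the reference.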

Big +1 from me, with one added proposal: dereference the entities and
classifiers in the Solr indexer so that the index still receives a fully
denormalized form. That way people can still search on fields such as
"to.opencorporates_uri", which I don't think is an unrealistic use case at all.
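
A rough sketch of what dereferencing at index time could look like, under
some assumptions: the function names, the lookup callable and the default
ref_fields ('from', 'to') are hypothetical and not taken from the existing
indexer, and the result is a plain dict rather than a real Solr document.

```python
# Hypothetical sketch of dereferencing reduced references before indexing.
# None of these names come from the actual wdmmg Solr indexer.

def dereference(ref_dict, lookup):
    """Return the full document for a reduced reference, if it is found."""
    full = lookup(ref_dict['_id'])
    return full if full is not None else ref_dict


def flatten(doc, prefix=''):
    """Flatten nested dicts into dotted keys such as 'to.label'."""
    flat = {}
    for key, value in doc.items():
        name = prefix + key
        if isinstance(value, dict):
            flat.update(flatten(value, name + '.'))
        else:
            flat[name] = value
    return flat


def build_index_doc(entry, lookup, ref_fields=('from', 'to')):
    """Expand the reduced references on an entry, then flatten it."""
    doc = dict(entry)  # shallow copy, the stored entry is left untouched
    for field in ref_fields:
        value = doc.get(field)
        if isinstance(value, dict) and '_id' in value:
            doc[field] = dereference(value, lookup)
    return flatten(doc)


# Made-up data; a dict stands in for the entity collection.
entities = {
    'e1': {'_id': 'e1', 'name': 'acme', 'label': 'Acme Ltd',
           'opencorporates_uri': 'http://example.org/acme'},
}
entry = {'_id': 'x1', 'amount': 100,
         'to': {'_id': 'e1', 'name': 'acme', 'label': 'Acme Ltd'}}
index_doc = build_index_doc(entry, entities.get)
```

Here index_doc carries a 'to.opencorporates_uri' key even though the stored
entry only embeds the reduced reference, so the extra lookup cost is paid
once at index time instead of on every entity update.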

What do you think?

- Friedrich