[wdmmg-dev] Denormalization and model.mongo.Base.to_ref_dict()

Fri May 13 11:02:06 UTC 2011

--On Thursday, May 12, 2011 23:10:39 +0200 Friedrich Lindenberg
<friedrich.lindenberg at okfn.org> wrote:

> Hi,
>
> On Thu, May 12, 2011 at 7:22 PM, Carsten Senger <senger at rehfisch.de>
> wrote:
>> I've written code to do the aggregations for the new APIs, which saves
>> information on the dataset document after the views were applied, and
>> that broke a test where to_ref_dict() was used to query an entry. I
>> fixed it here: <https://bitbucket.org/okfn/wdmmg/changeset/2547a562f441>
>> and did not find it used that way in real code. This makes me wonder if
>> we can reduce the amount of data we generate in to_ref_dict() to '_id',
>> 'name', 'label' and 'ref' (maybe 'description' too). This reduces the
>> number of cases where we would have to update the entries with new
>> values if e.g. an entity changes, and it reduces the amount of data
>> stored, indexed and serialized/deserialized. Looking through the code I
>> found no place where we would need more information, but maybe someone
>> knows more.
>
> I'm a big +1, with the added proposal of dereferencing the entities and
> classifiers in the Solr indexer so that a fully denormalized form still
> gets into the index. This means people can still search for
> "to.opencorporates_uri", which I don't think is unrealistic at all.
>
> What do you think?

Oh, you're right. And it's a good suggestion. We can then reduce the data
to '_id', 'name', 'label' and 'ref' (and 'color' as long as we have this
data stored and want to use it for the bubblechart).
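
To make the combined proposal concrete, here is a rough sketch of how it
could look. It is not the current wdmmg code: documents are plain dicts,
the shape of 'ref' (a dict with 'collection' and 'id'), the helpers
dereference() and flatten(), the lookup callable and the sample entity
with its example.org URI are all assumptions made up for illustration.

```python
# Keys kept in the reduced reference dict (the list proposed in this thread).
REF_KEYS = ("_id", "name", "label", "color")


def to_ref_dict(doc, collection):
    """Return the reduced reference that gets embedded in an entry."""
    ref = {key: doc[key] for key in REF_KEYS if key in doc}
    # Assumed shape of 'ref': enough to find the full document again.
    ref["ref"] = {"collection": collection, "id": doc["_id"]}
    return ref


def dereference(entry, lookup):
    """Replace embedded references with the full documents (indexer side)."""
    expanded = {}
    for key, value in entry.items():
        if isinstance(value, dict) and "ref" in value:
            pointer = value["ref"]
            full = lookup(pointer["collection"], pointer["id"])
            # Fall back to the reduced reference if the target is missing.
            expanded[key] = dict(full) if full is not None else value
        else:
            expanded[key] = value
    return expanded


def flatten(doc, prefix=""):
    """Turn nested dicts into dotted field names such as 'to.name'."""
    flat = {}
    for key, value in doc.items():
        path = prefix + key
        if isinstance(value, dict):
            flat.update(flatten(value, path + "."))
        else:
            flat[path] = value
    return flat


# Made-up data standing in for the MongoDB collections.
store = {
    ("entity", "e1"): {
        "_id": "e1",
        "name": "acme",
        "label": "Acme Ltd",
        "opencorporates_uri": "http://example.org/acme",
    }
}
entity = store[("entity", "e1")]
entry = {"_id": "entry1", "amount": 100, "to": to_ref_dict(entity, "entity")}

# What the indexer would hand to Solr: fully denormalized again.
index_doc = flatten(dereference(entry, lambda coll, _id: store.get((coll, _id))))
```

With this split the stored entry carries only the small reference, so a
change to an entity touches fewer entries, while the index document still
has a "to.opencorporates_uri" field to search on.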

..Carsten