[wdmmg-discuss] Aggregated vs per-capita statistics

Wed Apr 14 14:27:41 UTC 2010

(For interest only).

One of the new requirements for the aggregator is to calculate 
spending-per-capita figures. This is a two-step process:

1. *Sum* spending and population over all irrelevant variables (that is, 
   the ones the result is not being broken down by).

2. *Divide* spending by population.

It is important to do the sum before the divide, because a sum of ratios 
is not in general the ratio of the sums. After doing the division, no 
further aggregation is possible, unless the divisor (population) is 
constant along the axis you're summing over. I have made a little 
spreadsheet (attached) to help me understand what's going on.
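
To make the ordering issue concrete, here is a minimal Python sketch. The 
figures, region names and function names are all invented for illustration 
and are not taken from the attached spreadsheet. It compares summing first 
with dividing first, and then shows the special case where the divisor is 
constant along the axis being summed over.

```python
# Hypothetical figures, invented purely for illustration.
# Each row is (region, spending, population).
by_region = [
    ("North", 100.0, 10),
    ("South", 300.0, 50),
]

# Sum first, then divide: total spending over total population.
total_spending = sum(spending for _, spending, _ in by_region)
total_population = sum(population for _, _, population in by_region)
sum_then_divide = total_spending / total_population  # 400 / 60

# Divide first, then sum: adds up per-region per-capita figures,
# which is not a per-capita figure for the combined regions.
divide_then_sum = sum(spending / population
                      for _, spending, population in by_region)  # 10 + 6

# Special case: summing over spending functions within ONE region.
# The population (assumed here to be 50) is the same for every row,
# so per-capita figures can safely be added together.
region_population = 50
by_function = [("health", 100.0), ("education", 200.0)]
summed_per_capita = sum(s / region_population for _, s in by_function)
per_capita_of_sum = sum(s for _, s in by_function) / region_population

print(sum_then_divide, divide_then_sum)
print(summed_per_capita, per_capita_of_sum)
```

With these numbers the two orderings disagree when summing over regions, 
but agree when summing over functions within a single region, because 
there the population does not vary.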

This phenomenon means the aggregation sometimes has to be done exactly 
right first time by the data store; there's no later opportunity to do 
further custom aggregation of any sort in the presentation layer.

The question I'm battling with is whether this forces the data store to 
understand anything it doesn't already understand. My main worry is 
hierarchically organised keys such as COFOG, POG and NUTS (region). I 
really wanted to confine the hierarchical structure to the presentation 
layer as much as possible. However, if we want to partially aggregate over 
some key, by which I mean aggregate up to some depth in the tree, then 
maybe the store needs to understand the tree structure too...
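
As a rough illustration of what "aggregate up to some depth in the tree" 
could mean, here is a small Python sketch. It assumes, purely for the sake 
of the example, that a hierarchical key is represented as a tuple path and 
that all rows share one population figure; the key labels, the numbers and 
the function name `aggregate_to_depth` are hypothetical and are not part 
of the actual store.

```python
from collections import defaultdict

# Hypothetical rows: (hierarchical key as a tuple path, spending).
# The labels are made up and are not real COFOG, POG or NUTS codes.
spending_rows = [
    (("health", "hospitals"), 100.0),
    (("health", "clinics"), 50.0),
    (("education", "schools"), 200.0),
]

def aggregate_to_depth(rows, depth):
    """Sum spending over everything below `depth` levels of the key."""
    totals = defaultdict(float)
    for path, amount in rows:
        # Truncating the path groups all descendants of one ancestor.
        totals[path[:depth]] += amount
    return dict(totals)

# Step 1: sum up to the top level of the tree.
by_top_level = aggregate_to_depth(spending_rows, 1)

# Step 2: divide by population only after the sum is finished.
population = 50  # assumed constant across all rows in this example
per_capita = {key: total / population
              for key, total in by_top_level.items()}

print(by_top_level)
print(per_capita)
```

In this sketch the only thing the aggregation needs to know about the tree 
is how to truncate a key to a given depth. Whether the real keys can be 
handled that simply is exactly the open question.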

There are so many directions in which Dave and I can push this particular 
part of the design. I hope we'll find a way to keep it simple. I'm not 
sure what it is yet. :-(

 	Alistair
-------------- next part --------------
A non-text attachment was scrubbed...
Name: per-capita-experiment.ods
Type: application/vnd.oasis.opendocument.spreadsheet
Size: 16742 bytes
Desc: OpenOffice.org spreadsheet
URL: <http://lists.okfn.org/pipermail/openspending/attachments/20100414/2fd303ed/attachment-0001.ods>