[openspending-dev] Sanity checking numbers on Minas Gerais' satellite site

Stefan Wehrmeyer stefan.wehrmeyer at okfn.org
Tue Feb 19 00:06:14 UTC 2013


Good work and thanks for working through this!

Cheers
Stefan

On 19.02.2013, at 00:43 , Vitor Baptista <vitor at vitorbaptista.com> wrote:

> Hey,
> 
> I guess I've finally found the real issue. This is a very subtle bug. What was happening is that, in Aggregator.addBreakdown, line 250-251, it calls
> 
> 
> breakdown_value = entry[breakdown]
> self.toNode(breakdown_value)
> 
> And, inside toNode, it builds the node from that value. But, in some cases, it will simply change the value passed as argument and return it back (line 278)
> 
> 
> node = value
> 
> Some objects points to the same region object, not sure why. So, when the line 274 is run:
> 
> 
> breakdown_node[self.config.measure] = breakdown_node[self.config.measure] + entry[self.config.measure];
> 
> It's changing the region for a bunch of entries, unrelated to each other. This is what's screwing up with our data.
> 
> The fix is to change line 278 to:
> 
> 
> $.extend(true, node, value);
> 
> So, instead of changing value itself, we're using a new object.
> 
> 
> > var total = 0; for (var i in nodes.root.breakdowns) { total += nodes.root.breakdowns[i].amount; };
> 61554809596.11002
> > nodes.root.amount
> 61554809596.11001
> 
> We're losing some precision with all these floating-point arithmetic (this should be fixed, IMO), but at least this problem is solved.
> 
> Cheers,
> Vítor.
> 
> 2013/2/15 Vitor Baptista <vitor at vitorbaptista.com>
> Hey Stefan,
> 
> I've found where's the problem is, but not sure what the fix would look like yet. Any ideas?
> 
> Cheers,
> Vítor.
> 
> 
> 2013/2/15 Stefan Wehrmeyer <stefan.wehrmeyer at okfn.org>
> Hi Vitor,
> 
> thanks for tracking that down.
> I also noticed the issue a while ago when sanity checking some treemaps but couldn't get to the bottom of it.
> There should definitely be (regression) tests for this kind of stuff.
> 
> Cheers
> Stefan
> 
> On 14.02.2013, at 23:15 , Vitor Baptista <vitor at vitorbaptista.com> wrote:
> 
> > Hey Friedrich,
> >
> > Not sure what you mean. But I guess I'm a bit closer founding the culprit.
> >
> > I've added a counter into Aggregator.addBreakdown call, and it's called once per breakdown or drilldown. In Minas Gerais' case, 4 times. There, we end up double (or triple) counting amounts that end up being added to the root node. If I comment this line, https://github.com/openspending/openspendingjs/blob/master/lib/aggregator.js#L224, the problem disappears, but I'm still wondering what's the use of that, and what are the complications.
> >
> > Cheers,
> > Vítor.
> >
> > 2013/2/14 Friedrich Lindenberg <friedrich.lindenberg at okfn.org>
> > Did you consider the geographic breakdown? Maybe the aggregator only includes sums which are allocated to a region or otherwise discards some of that dimension?
> >
> > - Friedrich
> >
> >
> > On Thu, Feb 14, 2013 at 8:34 PM, Vitor Baptista <vitor at vitorbaptista.com> wrote:
> > If you open http://minasgerais.playground.openspending.org/bubblemap.html, and set a breakpoint into aggregator.js, line 183, you can see:
> >
> >
> > > nodes.root.amount
> > 61554809596.11001
> > > total = 0; for (i in nodes.root.breakdowns) { total += nodes.root.breakdowns[i].amount; }
> > 73659290793.70004
> > > total = 0; for (i in nodes.root.children) { total += nodes.root.children[i].amount; }
> > 61554809596.11003
> >
> > The difference between nodes.root.amount and the sum of nodes.root.children amounts seems to be just a precision error. But the breakdowns have a huge difference. I'm trying to understand what's happening, but haven't had much luck.
> >
> > My first hypothesis was that we're counting a few values twice. I thought about it because, if you check:
> >
> >
> > > data.drilldown[0].region.id
> >
> > "1"
> >
> > I thought that Aggregator.toNode could change that to "root__1", which would conflict with a cofog1 ID. But doesn't seems to be the case.
> >
> > Any ideas?
> >
> > Cheers,
> > Vítor.
> >
> >
> >
> > --
> > Friedrich Lindenberg | OpenSpending & OKFN Labs | @pudo
> > http://openspending.org | http://okfnlabs.org | http://pudo.org
> >
> > _______________________________________________
> > openspending-dev mailing list
> > openspending-dev at lists.okfn.org
> > http://lists.okfn.org/mailman/listinfo/openspending-dev
> > Unsubscribe: http://lists.okfn.org/mailman/options/openspending-dev
> 
> 
> 





More information about the openspending-dev mailing list