[OpenSpending-discuss] How Spending Stories Spots Errors in Spending Data

Wed Dec 7 23:25:05 UTC 2011

Hi,

On Tue, Dec 6, 2011 at 11:31 AM, Alex (Maxious) Sadleir
<maxious at gmail.com> wrote:
> Definitely more on the data mining side! I think there are some
> algorithms/statistical techniques that any financial dataset could
> benefit from, like Gini coefficient/Pareto distribution/Benford's Law.
> This could also serve to introduce people to a slightly more advanced
> world of data science if it's presented nicely.
>
> Javascript sounds interesting especially thinking about the things
> people have managed to do with CouchDB views. The wisdom of crowds via
> sharing snippets would be good too - I could spend a lot of time
> writing inefficient heuristics on my own ;)

I've hacked up a really basic version that uses the loading server to
run arbitrary scripts in a sandboxed V8 (Chrome's JS engine):

https://github.com/pudo/osmine (Example: http://i.imgur.com/mjOT8.png)
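
A shared snippet of the kind Alex describes could be as small as the
following Benford's Law check. This is only an illustrative sketch, not
code from the osmine repo: the function names (leadingDigit,
benfordDeviation) and the assumption that a script receives the entry
amounts as a plain array of numbers are made up for the example.

```javascript
// Sketch only, not part of osmine. Both function names and the input
// shape (a plain array of numeric amounts) are hypothetical.

// Return the first significant digit (1-9) of a number, or null for
// zero, NaN, infinities and non-numeric input.
function leadingDigit(amount) {
  var x = Math.abs(amount);
  if (!isFinite(x) || x === 0) {
    return null;
  }
  // toExponential() yields e.g. "1.234e+3", so the first character
  // is always the first significant digit.
  return Number(x.toExponential().charAt(0));
}

// Compare the observed share of each leading digit with the share
// Benford's Law predicts: log10(1 + 1/d).
function benfordDeviation(amounts) {
  var counts = [0, 0, 0, 0, 0, 0, 0, 0, 0];
  var total = 0;
  amounts.forEach(function (amount) {
    var digit = leadingDigit(amount);
    if (digit !== null) {
      counts[digit - 1] += 1;
      total += 1;
    }
  });
  var result = [];
  for (var d = 1; d <= 9; d++) {
    var expected = Math.log(1 + 1 / d) / Math.LN10;
    var observed = total > 0 ? counts[d - 1] / total : 0;
    result.push({
      digit: d,
      observed: observed,
      expected: expected,
      deviation: observed - expected
    });
  }
  return result;
}
```

Calling benfordDeviation(amounts) on a dataset's amounts gives one row
per digit. Large deviations are only a hint that the data deserves a
closer look, not proof of an error, and small datasets will deviate by
chance alone.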

The fatal flaw in this prototype is that we don't have any way of
storing the results, so at the moment everything ends up on the
server console. We should really add "bookmark lists" for entries and
store the results in those. We should also decide whether the code
lives as fixed snippets or as GitHub repo links (which is better?).
We already have a log reporting UI (for load errors), and that should
carry over to this.

- Friedrich