[openspending-dev] Overview Diagram for OpenSpending Tech Work

Friedrich Lindenberg friedrich at pudo.org
Mon Oct 20 08:28:50 UTC 2014

Hey Rufus, 

nice diagram! Basically, the main change seems to be the data store of flat-file budget data packages? Just for fun and tradition’s sake, let me play the devil’s advocate :)

While I understand that a repository of uniformly formatted datasets is kind of a cool asset, I wonder if this actually lies on any relevant user path? What is there to prevent this from becoming yet another data catalogue with few or no use cases?

As it stands right now, I see the following main challenges for OpenSpending: 

(1) Operating the platform without an explicit revenue stream from commercial activity or providing a way for OpenSpending apps (e.g. bubbles) to run without the core platform. 

(2) Keeping the data up-to-date: I don’t want to find out who uploaded which dataset when and with which budget document, I just want to have access to the latest budget data for my country. 

(3) Developing new visualisations and modes of analysis (e.g. comparisons between city budgets, budgets over time). 

The diagram color-codes (3) to be someone else’s problem, which is fair enough. The budget data spec *could* play a significant role in this, by aligning the classification schemes used across a set of budgets. I’m not sure whether this is intended to be addressed in the “Data Package Creation” API, but my guess is that this would really be a service on its own (I think Mark Brough is building something like this for aid sector spines). It would certainly be a great way of adding value to the data stored in OpenSpending.
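To make the alignment idea a bit more concrete, here is a rough sketch of what such a service might do at its core: map each budget's local codes onto one shared classification and report whatever it cannot map. Everything in it (the budget ids, the codes, the shared categories, the `align` function) is made up for illustration and is not part of the BDP spec or of any existing OpenSpending code.

```python
# Hypothetical crosswalk tables: local budget codes -> shared classification.
# None of these ids or codes come from a real dataset.
CROSSWALKS = {
    "city_a": {"H01": "health", "E10": "education"},
    "city_b": {"Gesundheit": "health", "Bildung": "education"},
}


def align(budget_id, lines):
    """Re-key (local_code, amount) pairs onto the shared classification.

    Returns the totals per shared category plus the list of local codes
    that have no mapping yet, so a human can review them.
    """
    crosswalk = CROSSWALKS[budget_id]
    totals = {}
    unmapped = []
    for code, amount in lines:
        shared = crosswalk.get(code)
        if shared is None:
            unmapped.append(code)
            continue
        totals[shared] = totals.get(shared, 0) + amount
    return totals, unmapped


totals_a, unmapped_a = align("city_a", [("H01", 10), ("H01", 5), ("X99", 3)])
totals_b, unmapped_b = align("city_b", [("Gesundheit", 7), ("Bildung", 2)])
```

Once two budgets are keyed on the same categories, comparing them is a plain dictionary lookup; the hard part is maintaining the crosswalk tables, which is why I suspect this is a service in its own right.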

(2) is also something where BDP metadata could be useful, but I’d be surprised if a well-sorted S3 bucket was really all that was needed to solve this challenge. It seems to me that this is really more of a community management issue, where we need people not to duplicate work, to publish recipes for data extraction, to have a clear schedule for supplying updates, a review process, etc. In any case, I don’t feel like we should try to approach it as a file system problem, but instead think about the types of processes necessary to get it done.

(1) is the hardest part, but also the most urgent. The data store doesn’t worsen this problem (running it would probably be Heroku-level/free), but the proposed architecture also doesn’t do away with the need for the expensive bits: the API and search system.

I’ve argued before that we probably wouldn’t lose much by just turning off the FTS index (perhaps with a grace period for DGU). As for the API, the question for me is whether we can flat-file the aggregator output in some standard way. The BDP can help here, but there would still need to be some sort of drilldown generator UI that builds a data package into an in-memory OLAP cube, generates lots of JSON snippets on S3, and then lets us go drink.
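As a very rough sketch of what I mean by flat-filing the aggregator output (not a proposal for the actual format): read the rows of a data package, pre-compute the sum for every combination of dimensions in memory, and write one JSON file per drilldown. The column names, the sample rows, the file naming and the JSON layout are all assumptions for illustration, and a local directory stands in for the S3 bucket.

```python
import csv
import io
import itertools
import json
from collections import defaultdict
from pathlib import Path

# Made-up sample data standing in for the CSV inside a budget data package.
SAMPLE_CSV = """year,function,amount
2013,health,100
2013,education,50
2014,health,120
2014,education,70
"""


def build_cube(rows, dimensions, measure="amount"):
    """Sum the measure for every subset of the given dimensions."""
    cube = {}
    for n in range(len(dimensions) + 1):
        for dims in itertools.combinations(dimensions, n):
            totals = defaultdict(float)
            for row in rows:
                key = tuple(row[d] for d in dims)
                totals[key] += float(row[measure])
            cube[dims] = dict(totals)
    return cube


def write_snippets(cube, out_dir):
    """Write one JSON file per drilldown into out_dir (stand-in for S3)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for dims, totals in cube.items():
        name = "-".join(dims) if dims else "total"
        cells = [
            {**dict(zip(dims, key)), "amount": amount}
            for key, amount in sorted(totals.items())
        ]
        path = out / f"{name}.json"
        path.write_text(json.dumps({"drilldown": list(dims), "cells": cells}, indent=2))
        paths.append(path)
    return paths


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
cube = build_cube(rows, ["year", "function"])
```

Calling `write_snippets(cube, "snippets")` would then produce `total.json`, `year.json`, `function.json` and `year-function.json`, which a visualisation could fetch as static files. The number of files grows exponentially with the number of dimensions, so a real generator would need a UI or config to choose which drilldowns are worth materialising.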

If you plan to re-arrange OpenSpending around the BDP, you need to demo that it is actually useful in solving these types of challenges - proving that you can encode data in this format alone will not sell it :) 

In any case, I guess my main point is that I would find it more helpful to discuss a diagram of user activities rather than this one where all the verbs are on the fringes (Acquisition, Clean, Load, Analysis, Presentation) and the center is all nouns (API, DataStore, Write UI) :)

Don’t take this the wrong way, it’s meant with the best of intentions and I’m very excited to see BDP becoming an important tool! 


- Friedrich 

> On 15 Oct 2014, at 21:27, Rufus Pollock <rufus.pollock at okfn.org> wrote:
> Hi All,
> Based on discussion with Tryggvi here is an overview of OpenSpending tech work that summarises some of the points in OSEP 1 and OSEP 2. Would be great to get people's thoughts.
> I've also booted an issue in OSEP repo which may be best place to discuss:
> https://github.com/openspending/osep/issues/2
> Rufus
