[openspending-dev] [wdmmg-dev] Revised model proposal
friedrich.lindenberg at okfn.org
Sun Nov 13 08:48:47 UTC 2011
What we're doing is so similar, it's really funny. Is there really no
way we can collaborate on at least the data modeling format (see
original message) and the APIs (see
http://localhost:5000/help/api-aggregator.html)? It just seems like a
waste of resources to do the exact same thing twice and never get to
do the cool stuff (http://wiki.openspending.org/Future_plans).
On Sat, Nov 12, 2011 at 12:42 PM, Thiago Rondon <thiago.rondon at gmail.com> wrote:
> On Fri, Nov 11, 2011 at 6:15 AM, Friedrich Lindenberg
> <friedrich.lindenberg at okfn.org> wrote:
>> Hi Thiago,
>> nice to hear from you, how are you?
> I'm fine, and you ?
> // I now see that I sent the email only to you, not to the list. :-)
>> On Fri, Nov 11, 2011 at 2:24 AM, Thiago Rondon <thiago.rondon at gmail.com> wrote:
>>> I have one question about the schema of openspending. I saw in the
>>> git repository that you use Solr, is that correct? Do you use a
>>> tree schema? For example, nested sets or an adjacency list?
>> Yes and no. We use Solr exclusively for full-text search, because
>> whatever the Postgres people say, SQL-based FTS smells. But the
>> primary data store is PostgreSQL, this is where all visualizations are
>> generated from.
>> But overall, we're not storing a tree-like structure at all, but
>> simply the source transactions. Trees are generated by querying this
>> with GROUP BY aggregations, and WHERE filters: e.g. give me the SUM of
>> all transactions, grouped by primary classification, then grouped by
>> the responsible government department. This means we don't have to fix
>> the breakdowns we want to do at data load time, but can get this
>> dynamically (see http://wiki.openspending.org/API).
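A minimal sketch of the kind of query described above, using Python's built-in sqlite3 module as a stand-in for PostgreSQL. The table name `entry`, its columns, and the sample rows are hypothetical and chosen only for illustration; they are not the actual OpenSpending schema.

```python
import sqlite3

# Hypothetical flat table of source transactions (not the real OpenSpending schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entry (classification TEXT, department TEXT, year INTEGER, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO entry VALUES (?, ?, ?, ?)",
    [
        ("Health", "Dept A", 2010, 100),
        ("Health", "Dept A", 2010, 50),
        ("Health", "Dept B", 2010, 70),
        ("Education", "Dept C", 2010, 200),
        ("Education", "Dept C", 2009, 999),  # excluded by the WHERE filter below
    ],
)


def aggregate(conn, year):
    """SUM of amounts for one year, grouped by classification, then department."""
    sql = """
        SELECT classification, department, SUM(amount) AS total
        FROM entry
        WHERE year = ?
        GROUP BY classification, department
        ORDER BY classification, department
    """
    return conn.execute(sql, (year,)).fetchall()


rows = aggregate(conn, 2010)
for row in rows:
    print(row)
```

Because the breakdown lives in the GROUP BY clause rather than in the stored data, a different drilldown only needs a different column list, not a reload.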
> The API has nice documentation. :-) It's what we are trying to do here now, for
> paraondefoiomeudinheiro "version 2". Now we build an API for everyone
>> Generating a tree from a result set of multiple group by statements is
>> relatively easy - the treemaps do it via a core plugin, but the
>> bubbletree is already doing it client-side in the browser - and I hope
>> the next iteration of the treemap will also do this.
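As a rough illustration of turning such a result set into a tree, here is a sketch in Python. The function name `build_tree`, the node layout, and the sample rows are assumptions made for this example; this is not the code of the treemap plugin or the bubbletree.

```python
def build_tree(rows):
    """Nest (key_1, ..., key_n, amount) rows into a tree of dicts.

    Each node holds its label, the sum of all amounts below it, and its children.
    """
    root = {"label": "total", "amount": 0, "children": {}}
    for *keys, amount in rows:
        node = root
        node["amount"] += amount
        for key in keys:
            # Create the child on first sight, then descend into it.
            node = node["children"].setdefault(
                key, {"label": key, "amount": 0, "children": {}}
            )
            node["amount"] += amount
    return root


# Hypothetical result set of a GROUP BY over two dimensions.
rows = [
    ("Education", "Dept C", 200),
    ("Health", "Dept A", 150),
    ("Health", "Dept B", 70),
]
tree = build_tree(rows)
print(tree["amount"], tree["children"]["Health"]["amount"])
```

The order of the grouping columns decides the nesting order, so the same rows can be re-nested by a different dimension without touching the database.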
>>> Because today I use a tree schema as the backend, based on DBIx::Class
>>> (ORM) with MySQL.
>>> But I'm looking for another way to store data. For example:
>>> * Solr with tree schema ; (It can make our life easier)
>> As I said, I like Solr a lot for search but its aggregation
>> capabilities (stats, not necessarily facets) break down a bit
>> somewhere just north of 2mn documents on our 4G core. I'd be very
>> careful with using solr as the primary backend...
> Ok, I got it.
>>> * PostgreSQL with an adjacency list (dbms); (We can have more control)
>> I'd like to learn more about how this would look schema-wise: are you
>> going to use PG as a full graph store? How would you put monetary
>> information into that best?
> I'm trying to do something like an OLAP cube, or a schema with a fact table and
> dimensions, to be fast. I think monetary information can be stored with the
> "money" datatype of PostgreSQL, what do you think about this?
> I'll receive a lot of information with different schemas, but
> all the datasets need to have a hierarchy (adjacency list), timestamp (period),
> category and value (in money).
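One possible shape for such a schema, sketched with Python's sqlite3 module instead of PostgreSQL so that it runs anywhere. All table and column names here are hypothetical. Storing amounts as integer cents is only a simplification for this sketch, not a statement about the PostgreSQL money type. The recursive query uses WITH RECURSIVE, which PostgreSQL also supports.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension stored as an adjacency list: each row points at its parent.
    CREATE TABLE category (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT);
    -- Fact table: one row per category and period; amount kept in integer cents.
    CREATE TABLE fact (category_id INTEGER, period TEXT, amount_cents INTEGER);

    INSERT INTO category VALUES (1, NULL, 'Budget'), (2, 1, 'Health'),
                                (3, 2, 'Hospitals'), (4, 1, 'Education');
    INSERT INTO fact VALUES (3, '2011', 12550), (2, '2011', 1000), (4, '2011', 30000);
""")


def subtree_total(conn, category_id, period):
    """Sum the facts of a category and all of its descendants for one period."""
    sql = """
        WITH RECURSIVE subtree(id) AS (
            SELECT id FROM category WHERE id = ?
            UNION ALL
            SELECT c.id FROM category c JOIN subtree s ON c.parent_id = s.id
        )
        SELECT COALESCE(SUM(f.amount_cents), 0)
        FROM fact f JOIN subtree s ON f.category_id = s.id
        WHERE f.period = ?
    """
    return conn.execute(sql, (category_id, period)).fetchone()[0]


print(subtree_total(conn, 2, "2011"))  # Health plus Hospitals, in cents
```

With this layout every dataset can bring its own hierarchy depth, since the tree is encoded in `parent_id` rows rather than in a fixed number of columns.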
>>> * Elasticsearch with stateless meta-data. (It can be hard to manage data)
>> I love elastic search and there's a kind of long-term plan to push all
>> our solr stuff at OKF over to elastic, but in terms of functionality
>> for this kind of thing it seems roughly equivalent to Solr...
> Yes, I understand.
> We're working on version 2 of our website,
> To simplify deployment, we are now using Jenkins, look:
> Ps.: You can reply to the list, if you want.
> -Thiago Rondon
Open Knowledge Foundation
Promoting Open Knowledge in a Digital Age
http://www.okfn.org/ - http://blog.okfn.org/