[wdmmg-dev] Proposal: Architectural Changes - Part I. (long)

Stefan Urbanek stefan.urbanek at gmail.com
Fri Jul 8 13:50:53 UTC 2011


On 8.7.2011, at 0:48, Martin Keegan wrote:

> 
>> * main OpenSpending packages should focus on "core business" - that is providing analytical insight into
>>  spending data, either through web based interface, search engine or API. 
>> * analytical dataset is read-only for web application
> 
> Ergo, we need a writable db for annotations, flags, etc, and unique ids for entries.

Sure, this is kept in mind. The writable DB is part of the web application. And as we discussed, to make it work we need proper DB object identification with reconstructable keys.
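
To illustrate what a reconstructable key could look like, here is a minimal sketch. It assumes an entry can be identified by its dataset name plus a fixed set of fields from the source row. The function name, the field names and the choice of SHA-1 are all illustrative assumptions, not something agreed on the list.

```python
import hashlib

def entry_key(dataset, row, key_fields):
    """Derive a stable id from the dataset name and selected source fields.

    Re-loading the same source row yields the same key, so annotations
    kept in the writable DB can still point at the entry after the
    read-only analytical dataset has been rebuilt.
    """
    parts = [dataset] + [str(row[f]) for f in key_fields]
    # The unit separator keeps ("ab", "c") distinct from ("a", "bc").
    raw = "\x1f".join(parts)
    return hashlib.sha1(raw.encode("utf-8")).hexdigest()

# Hypothetical entry; field names are made up for the example.
row = {"from": "Dept A", "to": "Supplier B", "date": "2010-04-01", "amount": 1200}
key = entry_key("example-dataset", row, ["from", "to", "date", "amount"])
```

The key depends only on source values, so it can be recomputed at any time without consulting the analytical store. The obvious weakness is that two genuinely identical source rows would collide, so a real scheme would need a tie-breaker.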


> 
>> * most of master data sources (classifications, lists, enumeration) should be available as CKAN packages as
>>  well, mainly list of entities, classifications - *Reasons*: same as reasons for dataset source being stored
>>  in CKAN; introduces better reusability of classifications at source level - potential data providers can be
>>  pointed to open and public existing classifications to make their data comply with OpenSpending
>>  requirements.
> 
> I do worry how we're going to deal with exchange rate datasets: we'll want a list of (currency, currency, rate, date) 
> tuples, but only for the dates we care about. Does that count as a real dataset?
> 

Exchange rates are a kind of master data. We can have them as a separate CKAN package, or we can have some "OpenSpending Core Master Data" package: master data not related to any particular dataset, but needed in general by CKAN. What do you think?

We can discuss it when we discuss master data in more detail.
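
As a starting point for that discussion, a rough sketch of how the (currency, currency, rate, date) tuples could be held and queried. The rates below are made-up placeholder values, and the lookup rule (use the most recent rate on or before the requested date) is my assumption, not a decision.

```python
from bisect import bisect_right
from datetime import date

# (from_currency, to_currency, rate, date) tuples.
# Placeholder values only, not real exchange rates.
RATES = [
    ("GBP", "EUR", 1.10, date(2011, 1, 1)),
    ("GBP", "EUR", 1.20, date(2011, 7, 1)),
]

def build_index(rates):
    """Group rates by currency pair, sorted by date."""
    index = {}
    for src, dst, rate, day in rates:
        index.setdefault((src, dst), []).append((day, rate))
    for series in index.values():
        series.sort()
    return index

def rate_on(index, src, dst, day):
    """Return the most recent rate on or before `day`, or None."""
    series = index.get((src, dst), [])
    days = [d for d, _ in series]
    pos = bisect_right(days, day)
    return series[pos - 1][1] if pos else None

index = build_index(RATES)
```

With a rule like this the package only needs rows for the dates we care about, since any date in between falls back to the previous known rate.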

>> Note: search engine might or might not be part of OLAP module, recommended is indexing out-of OLAP module
>> with references to OLAP objects (multidimensional aggregates, detailed facts)
> 
> Yes - what are your (and other people's) views on Solr and how it fits in?
> 

No thoughts on that yet from my side. I haven't used Solr, just Sphinx. Sphinx has a couple of limitations on index attributes and also returns rather limited information about results. For example, if you index multiple fields (columns) in a record (row), it does not tell you which field matched. Therefore, for a hierarchical multidimensional store I had to create a special index with dimensional information, and that dimension index was then indexed by Sphinx.
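
Roughly, the dimension index idea can be sketched as below. This is not the code I used, only an illustration: the data is hypothetical, and a naive substring match stands in for the full-text engine, so no Sphinx API is involved.

```python
def dimension_index_rows(dimensions):
    """Yield one index row per attribute value.

    Each row carries the dimension, level and field it came from, so a
    match on `value` already says where in the hierarchy it was found.
    """
    for dim_name, levels in dimensions.items():
        for level_name, members in levels.items():
            for member in members:
                for field, value in member.items():
                    if field == "key":
                        continue  # the key identifies the member, it is not searched
                    yield {
                        "dimension": dim_name,
                        "level": level_name,
                        "key": member["key"],
                        "field": field,
                        "value": str(value),
                    }

# Hypothetical dimension with two levels.
dimensions = {
    "function": {
        "top": [{"key": "07", "label": "Health"}],
        "sub": [{"key": "07.3", "label": "Hospital services",
                 "description": "Health care in hospitals"}],
    }
}

rows = list(dimension_index_rows(dimensions))

def search(rows, term):
    # Naive substring match standing in for the search engine.
    term = term.lower()
    return [r for r in rows if term in r["value"].lower()]

hits = search(rows, "health")
```

Because every indexed row is a single field value, each hit comes back with its dimension, level and field, which is exactly the information missing when several columns are indexed together in one record.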

> Mk

s.




