[wdmmg-discuss] CRA 2010: progress report [was: CRA 2010: description and questions]
William Waites
william.waites at okfn.org
Wed Aug 18 12:33:30 UTC 2010
On 10-08-18 12:30, Anna Powell-Smith wrote:
> Hi all,
>
> I'm now going through the two CRA datasheets, preparatory to loading
> them into the datastore. Here are my findings so far - mostly good news.
Some very good news!
> Question for Will (or Lisa): I need to make a few changes to the
> cofog_map.json file in the CRA bitbucket package that maps COFOG
> classifiers. Is it OK just to edit this file and check it back in?
I think as long as the changes aren't incompatible
with what we need for 2009 it should be just fine.
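One rough way to check that an edited map stays compatible with 2009 is to diff the old and new versions. The sketch below assumes cofog_map.json is a flat JSON object of key/value pairs, which may not match the real file. The function name and the sample entries are made up for illustration.

```python
import json


def incompatible_changes(old_map, new_map):
    """Return entries of old_map that were removed or changed in new_map.

    Entries that exist only in new_map are treated as compatible,
    since the 2009 load never looks them up.
    """
    problems = {}
    for key, old_value in old_map.items():
        if key not in new_map:
            problems[key] = ("removed", old_value, None)
        elif new_map[key] != old_value:
            problems[key] = ("changed", old_value, new_map[key])
    return problems


# Hypothetical sample data, not taken from the real cofog_map.json.
old = json.loads('{"Defence": "02", "Health": "07"}')
new = json.loads('{"Defence": "02", "Health": "07.1", "Education": "09"}')
report = incompatible_changes(old, new)
```

An empty report would mean every mapping the 2009 load relies on is untouched.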
For my part, I'll not be able to make the meeting
today, need to go meet the family arriving from
Cambridge at the train station...
What I've done so far is make an RDF representation
of the CRA for 2009; you can see the main entry point
and navigate around at http://purl.org/okfn/dataset/cra/2009
In preparation for the 2010 data, and to help separate
the data from the wdmmg application, I've started
factoring out the CRA loading machinery from the wdmmg
repository. See http://bitbucket.org/ww/ukgov_treasury_cra
As I do this, I'm also documenting things in a more
formal way, see http://packages.python.org/ukgov_treasury_cra/
In particular, hopefully the CRAReader module can be
subclassed or adapted to handle the 2010 data. I
am imagining that the SQL wdmmg code would just depend
on this package and would have a module that just takes
cleaned rows as dictionaries and passes them into the
model creation stuff.
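As a minimal sketch of that split, assuming a CSV-like source: a reader yields cleaned rows as dictionaries, a subclass adapts it to a new year's layout, and a thin loader hands each row to whatever creates the model objects. None of the names here (RowReader, RowReader2010, load, create_entry) nor the column headers come from the actual package. They are placeholders standing in for CRAReader and the wdmmg model code.

```python
import csv
import io


class RowReader:
    """Hypothetical stand-in for a CRAReader-like class (not the real API)."""

    # Maps source column headers (invented here) to cleaned key names.
    columns = {"Dept": "dept", "COFOG": "cofog", "2008-09": "amount"}

    def __init__(self, stream):
        self.stream = stream

    def clean(self, raw):
        # Rename columns, trim whitespace, and convert the amount to a float.
        row = {key: raw[header].strip() for header, key in self.columns.items()}
        row["amount"] = float(row["amount"])
        return row

    def __iter__(self):
        for raw in csv.DictReader(self.stream):
            yield self.clean(raw)


class RowReader2010(RowReader):
    """Only the column layout differs; the cleaning logic is inherited."""

    columns = {"Department": "dept", "COFOG": "cofog", "2009-10": "amount"}


def load(rows, create_entry):
    """Pass each cleaned row to the model-creation callable; return the count."""
    count = 0
    for row in rows:
        create_entry(**row)
        count += 1
    return count


# Usage with an in-memory sample instead of the real datasheet.
sample = io.StringIO("Department,COFOG,2009-10\nDfT , 04.5,12.5\n")
created = []
n = load(RowReader2010(sample), lambda **fields: created.append(fields))
```

The loader only sees dictionaries, so the SQL side would not need to know which year's reader produced them.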
This is a special Python package that doubles as an egg-format
datapkg index (see recent changes to datapkg).
If you get datapkg from mercurial you should be able
to do, for example::

    pip install hg+http://bitbucket.org/ww/ukgov_treasury_cra#egg=ukgov_treasury_cra
    datapkg list egg://ukgov_treasury_cra
    datapkg install egg://ukgov_treasury_cra/cra2009 file://tmp

The CRALoader class takes care of installing the
data package to wherever the cache is configured
in the config file...
That's all for now.
Cheers,
-w
--
William Waites <william.waites at okfn.org>
Mob: +44 789 798 9965 Open Knowledge Foundation
Fax: +44 131 464 4948 Edinburgh, UK
RDF Indexing, Clustering and Inferencing in Python
http://ordf.org/