[openspending-dev] Diff for spending CSVs

Anders Pedersen anders.pedersen at okfn.org
Thu Jun 27 20:12:20 UTC 2013


Hi David,

Great work, and thanks for sharing! Making it easier to load updates to
transactional spending datasets into OpenSpending is a very useful improvement.

Anders

On 27 June 2013 12:55, Friedrich Lindenberg <friedrich at pudo.org> wrote:

> This is really cool, David!
>
> After a quick look, it looks to me like there's nothing really
> spend-specific in there: have you considered pinging @onyxfish about
> pushing this into csvkit? Would make a valuable contribution!
>
> - Friedrich
>
>
>
> On Thu, Jun 27, 2013 at 6:50 PM, David Read
> <david.read at hackneyworkshop.com> wrote:
>
>> I've written a tool to run in the OpenSpending ETL that discards the
>> parts of the spending-transaction CSV that are already loaded.
>> This is useful for the data.gov.uk work, where the CSV is 4 GB and is
>> updated daily from source data, but only a tiny number of
>> new/changed rows need loading into the OpenSpending database each
>> day.
>>
>> Making this was a suggestion of Pudo's:
>>
>> > find out how to make diff emit the only lines that have been added and
>> > use that to generate incremental spendingsource files.
>>
>> The code is in our ETL here:
>> https://github.com/openspending/dpkg-uk25k/blob/master/spend_diff.py -
>> feel free to put it into the core OpenSpending code if that makes
>> sense.
>>
>> David
>>
>
>
>
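
The idea David describes in the quoted message (drop the rows that were already loaded and keep only new or changed ones) can be sketched roughly as below. This is not the code in spend_diff.py, which may work quite differently; the function names row_digest and new_rows are hypothetical, and the sketch assumes both CSVs share the same header and that a changed row simply shows up as a row not seen before.

```python
import csv
import hashlib
import io


def row_digest(row):
    """Return a compact fingerprint of a CSV row (a list of strings)."""
    # The unit separator keeps ["ab", "c"] distinct from ["a", "bc"].
    joined = "\x1f".join(row)
    return hashlib.sha1(joined.encode("utf-8")).digest()


def new_rows(old_file, new_file):
    """Yield the header of new_file, then each row not present in old_file."""
    old_reader = csv.reader(old_file)
    next(old_reader, None)  # skip the old header
    # Store 20-byte digests rather than whole rows to keep memory use down.
    seen = {row_digest(row) for row in old_reader}

    new_reader = csv.reader(new_file)
    header = next(new_reader, None)
    if header is not None:
        yield header
    for row in new_reader:
        if row_digest(row) not in seen:
            yield row


if __name__ == "__main__":
    # Tiny in-memory example; real input would be opened with newline="".
    old = io.StringIO("id,amount\n1,10\n2,20\n")
    new = io.StringIO("id,amount\n1,10\n2,25\n3,30\n")
    for row in new_rows(old, new):
        print(row)
```

Keeping only a digest per previously loaded row is one way to avoid holding a multi-gigabyte file in memory. A limitation of this sketch is that it treats rows as a set: if the same transaction line legitimately appears twice in the new file but once in the old one, the extra copy is dropped.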


-- 
Anders Pedersen

Community Coordinator  |  skype: anpehej  |  @anpe <https://twitter.com/>

The Open Knowledge Foundation <http://okfn.org/>

Empowering through Open Knowledge

http://okfn.org/  |  @okfn <http://twitter.com/OKFN>  |
OKF on Facebook <https://www.facebook.com/OKFNetwork>  |
Blog <http://blog.okfn.org/>  |  Newsletter <http://okfn.org/about/newsletter>

OpenSpending | http://openspending.org |
@openspending <http://twitter.com/openspending>

School of Data | http://schoolofdata.org |
@schoolofdata <http://twitter.com/schoolofdata>