[openspending-dev] Diff for spending CSVs
Anders Pedersen
anders.pedersen at okfn.org
Thu Jun 27 20:12:20 UTC 2013
Hi David,
Great work, and thanks for sharing! Making it easier to load updated
transactional spending datasets into OpenSpending is a very useful improvement.
Anders
On 27 June 2013 12:55, Friedrich Lindenberg <friedrich at pudo.org> wrote:
> This is really cool, David!
>
> After a quick look, it looks to me like there's nothing really
> spend-specific in there: have you considered pinging @onyxfish about
> pushing this into csvkit? Would make a valuable contribution!
>
> - Friedrich
>
>
>
> On Thu, Jun 27, 2013 at 6:50 PM, David Read <
> david.read at hackneyworkshop.com> wrote:
>
>> I've written a tool that runs in the OpenSpending ETL and discards the
>> parts of a spending-transactions CSV that are already loaded.
>> This is useful for the data.gov.uk work, where the CSV is 4 GB and is
>> updated daily from source data, but only a tiny number of new or
>> changed rows need loading into the OpenSpending database each day.
>>
>> Making this was a suggestion of Pudo's:
>>
>> > find out how to make diff emit only the lines that have been added and
>> > use that to generate incremental spendingsource files.
>>
>> The code is in our ETL here:
>> https://github.com/openspending/dpkg-uk25k/blob/master/spend_diff.py -
>> feel free to put it into the core OpenSpending code if that makes
>> sense.
>>
>> David
>>
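
For readers who want a feel for the approach described above, here is a minimal sketch of a row-level CSV diff in Python. It is not the code in spend_diff.py, and it makes several assumptions: both files share the same header and column order, a row counts as "already loaded" only if every cell is identical, and the names row_digest, new_rows and write_diff are made up for illustration. Rows from the previous CSV are reduced to SHA-1 digests so that the full text of the old file does not have to be held in memory.

```python
import csv
import hashlib


def row_digest(row):
    """Return a compact fingerprint of one parsed CSV row (hypothetical helper)."""
    # Join cells with the ASCII unit separator, assumed not to occur in the data.
    joined = "\x1f".join(row).encode("utf-8")
    return hashlib.sha1(joined).digest()


def new_rows(old_file, new_file):
    """Yield the header, then each row of new_file that is absent from old_file."""
    old_reader = csv.reader(old_file)
    next(old_reader, None)  # skip the old header, if there is one
    seen = {row_digest(row) for row in old_reader}

    new_reader = csv.reader(new_file)
    header = next(new_reader, None)
    if header is None:
        return  # empty new file: nothing to emit
    yield header
    for row in new_reader:
        # A changed row has a different digest, so it is emitted like a new one.
        if row_digest(row) not in seen:
            yield row


def write_diff(old_file, new_file, out_file):
    """Write the incremental CSV and return the number of data rows written."""
    writer = csv.writer(out_file, lineterminator="\n")
    count = -1  # the header line is not counted as a data row
    for row in new_rows(old_file, new_file):
        writer.writerow(row)
        count += 1
    return max(count, 0)


# With real files, open them with newline="" as the csv module recommends:
# with open("old.csv", newline="") as old, open("new.csv", newline="") as new, \
#         open("incremental.csv", "w", newline="") as out:
#     write_diff(old, new, out)
```

This sketch only reports rows that were added or changed. Rows deleted from the source are silently ignored, and the set of digests still grows with the number of rows in the previous file, so a multi-gigabyte input may call for a sort-based or on-disk approach instead.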
--
Anders Pedersen
Community Coordinator | skype: anpehej | @anpe <https://twitter.com/>
The Open Knowledge Foundation <http://okfn.org/>
Empowering through Open Knowledge
http://okfn.org/ | @okfn <http://twitter.com/OKFN> | OKF on Facebook <https://www.facebook.com/OKFNetwork> |
Blog <http://blog.okfn.org/> | Newsletter <http://okfn.org/about/newsletter>
OpenSpending | http://openspending.org | @openspending <http://twitter.com/openspending>
School of Data | http://schoolofdata.org | @schoolofdata <http://twitter.com/schoolofdata>