[wdmmg-dev] Uganda data, and next steps

Carsten Senger senger at rehfisch.de
Tue Jun 7 13:37:40 UTC 2011


Hi Mark,

--On Tuesday, June 07, 2011 12:44:31 +0100 Lucy Chambers 
<lucy.chambers at okfn.org> wrote:

> Hi Mark,
>
> No, I don't think you are doing anything wrong - I'm cc'ing the dev
> list so we can get to the bottom of what is wrong. I think we should
> try and keep our discussions on wdmmg-dev as then I don't become the
> bottleneck ;)
>
> @ Martin - is there any update on this? Where are we at?

I'm working with the Uganda data at the moment to get a complete import
running. I have found some problems in our code, but there are still
problems left. I'll get back to you when the import is working, tomorrow
at the latest.

..Carsten

> Lucy
>
> On Mon, Jun 6, 2011 at 6:04 PM, Mark Brough
> <mark.brough at publishwhatyoufund.org> wrote:
>> Hi Lucy
>>
>> I just tried running the sandbox again and it's still not working. It
>> gets the data from Google Docs, and the mapping validates OK, but then
>> it gives error 500 when you try to Save. Let me know if I'm doing
>> something wrong or if we can try again sometime this week! Would be good
>> to get it all over and done with :)
>>
>> Otherwise perhaps we can just pop the JSON mapping into the existing
>> loader for the Uganda data and load it manually - it might be easier?
>>
>> Thanks
>>
>> Mark
>>
>>
>> -----Original Message-----
>> From: okfn.lucy.chambers at gmail.com [mailto:okfn.lucy.chambers at gmail.com]
>> On Behalf Of Lucy Chambers
>> Sent: 31 May 2011 13:12
>> To: Mark Brough
>> Cc: rufus.pollock at okfn.org; Friedrich Lindenberg; Karin Christiansen;
>> Rachel Rank; info at openspending.org
>> Subject: Re: Uganda data, and next steps
>>
>> Hi Mark,
>>
>> That would be wonderful - 2pm UK time, yes?
>>
>> Speak soon!
>>
>> Lucy
>>
>> On Tue, May 31, 2011 at 12:51 PM, Mark Brough
>> <mark.brough at publishwhatyoufund.org> wrote:
>>> No problem. I've just added you on Skype (I'm mark-brough), and would be
>>> great to talk this afternoon whenever you're free. Maybe around 2pm?
>>>
>>> Thanks
>>> Mark
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: okfn.lucy.chambers at gmail.com [mailto:okfn.lucy.chambers at gmail.com]
>>> On Behalf Of Lucy Chambers
>>> Sent: 31 May 2011 12:40
>>> To: Mark Brough
>>> Cc: rufus.pollock at okfn.org; Friedrich Lindenberg; Karin Christiansen;
>>> Rachel Rank; info at openspending.org
>>> Subject: Re: Uganda data, and next steps
>>>
>>> Hi Mark,
>>>
>>> Sorry for the delay in getting back to you - it is taking slightly
>>> longer to boot the sandbox than we expected.  Would you be around this
>>> afternoon/ tomorrow to have a call about this?
>>>
>>> My Skype ID is (lucyfediachambers) - please let me know when would be
>>> a good time for you to talk!
>>>
>>> Lucy
>>>
>>> On Thu, May 26, 2011 at 1:17 PM, Mark Brough
>>> <mark.brough at publishwhatyoufund.org> wrote:
>>>> I think you were looking at an older file - sorry, I should clean up my
>>>> CKAN mess... The actual files are the two at the bottom (Final cleaned,
>>>> normalised version of data; Metadata for final Uganda data).
>>>>
>>>> This sounds great, and I think it would be extremely useful. Actually, I
>>>> installed OpenSpending on my laptop a few days ago. I got it sort of
>>>> working with the Uganda data (including a loader) after a bit of
>>>> head-scratching (I don't really know Python at all). But, I don't know
>>>> how to attach the visualisation - that would be a really useful thing to
>>>> know if it's not that hard to explain! I wanted to use the wdmmg-treemap
>>>> but I couldn't find instructions in the documentation.
>>>>
>>>> Maybe we could talk tomorrow or this evening?
>>>>
>>>> Mark
>>>>
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: okfn.rufus.pollock at gmail.com [mailto:okfn.rufus.pollock at gmail.com]
>>>> On Behalf Of Rufus Pollock
>>>> Sent: 26 May 2011 12:37
>>>> To: Mark Brough
>>>> Cc: Friedrich Lindenberg; Karin Christiansen; Rachel Rank;
>>>> info at openspending.org
>>>> Subject: Re: Uganda data, and next steps
>>>>
>>>> Hi Mark,
>>>>
>>>> This looks good -- though master sheet seems to be ';' separated rather
>>>> than ',' separated ;-)
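
A minimal sketch of how a loader could cope with a ';' separated sheet instead of assuming ',': it uses `csv.Sniffer` from the Python standard library to guess the delimiter before parsing. The sample data and the column names `project`, `donor` and `amount` are made up for illustration; this is not taken from the OpenSpending loader code.

```python
import csv
import io


def detect_delimiter(sample, candidates=",;\t"):
    """Guess the field delimiter of a CSV text sample."""
    return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter


# Hypothetical sample standing in for the first lines of the master sheet.
sample = "project;donor;amount\nRoads;DFID;100\nHealth;USAID;250\n"

delimiter = detect_delimiter(sample)
rows = list(csv.DictReader(io.StringIO(sample), delimiter=delimiter))
```

Restricting the candidates to a few plausible delimiters keeps the guess from latching onto other repeated characters in the data.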
>>>>
>>>> What we'd like to do, if you were willing, is to arrange say 30m-1h
>>>> phone chat with you to walk through the data upload with you in the
>>>> 'driving seat' (we're trying to get it so people such as yourselves
>>>> can do the upload into openspending system directly). Would this make
>>>> sense?
>>>>
>>>> Regards,
>>>>
>>>> Rufus
>>>>
>>>> On 24 May 2011 19:39, Mark Brough <mark.brough at publishwhatyoufund.org>
>>>> wrote:
>>>>> Hi Friedrich and Rufus
>>>>>
>>>>> I've uploaded the data and metadata (the latter I think in the format
>>>>> you want, but maybe not...) to the CKAN page for the Uganda data:
>>>>> http://ckan.net/package/ugandabudget
>>>>>
>>>>> I also updated the description of the package and explained a bit more
>>>>> what I did to the data.
>>>>>
>>>>> Since we spoke I:
>>>>> * normalised the data finally. Disbursements and commitments are in
>>>>> separate rows (identified by `spending_type`). There are still separate
>>>>> columns for `amount`, `amount_dollars`, and `amount_donor`, which I think
>>>>> is probably the right place for that information to stay.
>>>>> * removed all duplicates (where `Duplicate`=1 previously) ... I've kept
>>>>> these safe in case
>>>>> * added a negative Government of Uganda entry for each Budget Support
>>>>> entry, with all the other details about the spending the same. I also
>>>>> added a column called `bs_offset` to identify those columns that are
>>>>> purely negative values (Budget Support offset) to make the maths work
>>>>> properly.
>>>>> ** I did not change sector objective budget support into sector budget
>>>>> support as previously discussed, as I figured that if the visualisation
>>>>> works at Level 1 (total) and Level 2 (SWG), it should also work at
>>>>> Level 3 (Sector Objective). Let's see what it looks like now, but I
>>>>> think it should work. If not, I can easily change that.
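
The off-set step in the list above can be sketched roughly as follows. This is an illustration only, not the script that was actually used: the `funder` column name and the sample figures are assumptions, while `amount`, `spending_type` and `bs_offset` are the column names mentioned in the message. Only `amount` is negated here; a real version would presumably treat `amount_dollars` and `amount_donor` the same way.

```python
def add_budget_support_offsets(rows, funder_key="funder"):
    """Return the rows plus one negative Government of Uganda row
    for every Budget Support row, flagged with bs_offset=1."""
    result = []
    for row in rows:
        result.append({**row, "bs_offset": 0})
        if row[funder_key] == "Budget Support":
            result.append({
                **row,  # all other details stay the same
                funder_key: "Government of Uganda",
                "amount": -row["amount"],
                "bs_offset": 1,
            })
    return result


# Hypothetical figures: the Government of Uganda row already includes
# the 300 of budget support, so adding it again would double-count.
rows = [
    {"funder": "Government of Uganda", "spending_type": "disbursement", "amount": 1000},
    {"funder": "Budget Support", "spending_type": "disbursement", "amount": 300},
]
with_offsets = add_budget_support_offsets(rows)
total = sum(row["amount"] for row in with_offsets)
```

With the off-set row in place the total stays at the Government of Uganda figure, while Budget Support still appears as its own positive entry.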
>>>>>
>>>>> Let me know if there's anything else you need, but I think that should
>>>>> be it.
>>>>>
>>>>> Thanks again.
>>>>>
>>>>> Mark
>>>>>
>>>>> --
>>>>> Mark Brough
>>>>> Research Officer, Publish What You Fund
>>>>> New Address: Suite 1A, 2nd Floor, Royal London House, 22-25 Finsbury
>>>>> Square, London. EC2A 1DX, UK
>>>>> New Tel: +44 20 7920 6401
>>>>> Mobile: +44 7817 600 835
>>>>> Skype: publish.what.you.fund
>>>>> Mark.Brough at PublishWhatYouFund.org
>>>>>
>>>>> www.PublishWhatYouFund.org
>>>>> twitter at aidtransparency
>>>>>
>>>>> -----Original Message-----
>>>>> From: okfn.rufus.pollock at gmail.com [mailto:okfn.rufus.pollock at gmail.com]
>>>>> On Behalf Of Rufus Pollock
>>>>> Sent: 23 May 2011 19:45
>>>>> To: Mark Brough
>>>>> Cc: Friedrich Lindenberg; karin.christiansen at publishwhatyoufund.org;
>>>>> Rachel Rank; info at openspending.org
>>>>> Subject: Re: Uganda data, and next steps
>>>>>
>>>>> On 23 May 2011 18:59, Mark Brough <mark.brough at publishwhatyoufund.org>
>>>>> wrote:
>>>>>> Hi Rufus, Friedrich
>>>>>>
>>>>>>
>>>>>>
>>>>>> It was good to talk earlier and I think we agreed that:
>>>>>>
>>>>>> a)      PWYF were not happy with the idea of pro-rating budget support
>>>>>> out to individual projects, because it would be complicated, political,
>>>>>> and involve quite significant changes to the data. Even if we avoid
>>>>>> giving Swedish aid to Defence and US aid to abortion, we will still be
>>>>>> making quite big assumptions about where donors would like their budget
>>>>>> support money to be spent, because we don’t have this information;
>>>>>>
>>>>>> b)      PWYF suggested that another way of dealing with budget support
>>>>>> was to create negative values to ‘off-set’ the value of budget support
>>>>>> from Government of Uganda spending;
>>>>>>
>>>>>> c)      PWYF will send data back to OKFN in CSV with these off-sets
>>>>>> included and with metadata based on the Argentina metadata.
>>>>>
>>>>> It shouldn't be directly based on Argentina -- that's an example (but
>>>>> I think that is what you meant!). We can provide an even simpler
>>>>> template if that is useful.
>>>>>
>>>>>> d)      OKFN said that this should visualise, but that:
>>>>>>
>>>>>> a.      The visualisation of the bubbles at SWG level would have to have
>>>>>> something in it to prevent it looking weird, because the total value is
>>>>>> 0, but one segment (Budget Support) would have a positive value and
>>>>>> another segment (Government of Uganda) would have an equal but negative
>>>>>> value. Possible error of divide by zero for the positive budget support
>>>>>> segment. Maybe the visualisation can just say if the overall amount for
>>>>>> the SWG is 0 then don’t show it?
>>>>>>
>>>>>> b.      OKFN were not sure about the idea of creating dummy values to
>>>>>> off-set the Budget Support from GoU spending, and that conceptually the
>>>>>> idea of budget support just disappearing at the lower levels doesn’t
>>>>>> make much sense.
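
The divide-by-zero concern in point a. can be illustrated with a small guard. This is a hedged sketch, not the wdmmg visualisation code: `segment_shares` is a hypothetical helper that returns each segment's share of the group total, or `None` when the total is 0 so the caller can skip drawing that group.

```python
def segment_shares(segments):
    """Return each segment's share of the group total,
    or None when the total is 0 and no share can be computed."""
    total = sum(segments.values())
    if total == 0:
        return None  # caller should not draw this group
    return {name: value / total for name, value in segments.items()}


# A group where Budget Support is exactly off-set (hypothetical figures).
offset_group = segment_shares({"Budget Support": 300, "Government of Uganda": -300})

# An ordinary group with a non-zero total.
normal_group = segment_shares({"Donor A": 1, "Donor B": 3})
```

Returning `None` rather than raising keeps the decision about hiding the group with the visualisation, which matches the "if the overall amount for the SWG is 0 then don't show it" suggestion.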
>>>>>>
>>>>>>
>>>>>>
>>>>>> Is that accurate? Let me know if I have missed something. Also, I’m
>>>>>> sorry my explanation has been a bit unclear and I agree that it is
>>>>>> conceptually strange for budget support to disappear at the lower
>>>>>> levels. I think the reason is that we’re just adding it back in, because
>>>>>> it’s already been included in the Government of Uganda spending, which
>>>>>> is why we have to create negative values higher up to off-set it and
>>>>>> avoid double-counting. I agree it’s weird though.
>>>>>
>>>>> This seems an excellent summary Mark. We all understand this has been
>>>>> a tough dataset to get in a good analytical shape.
>>>>>
>>>>>> I had a few more questions so that hopefully I can send you the final
>>>>>> data...
>>>>>>
>>>>>>
>>>>>>
>>>>>> 1.       Should I normalise / separate out the Outturn and Planned
>>>>>> fields, and then make a column called "Outturn/Planned" (or
>>>>>> commitment/disbursement)?
>>>>>
>>>>> I think that is a good idea.
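
A rough sketch of what separating out the Outturn and Planned fields could look like: one wide row becomes one row per value, with the source column recorded in `spending_type` and the figure in `amount`. The `project` column and the figures are invented for illustration, and the exact column names in the Uganda sheet may differ.

```python
def split_planned_outturn(row, value_columns=("Planned", "Outturn")):
    """Turn one wide row into one row per value column,
    skipping value columns that are missing or empty."""
    shared = {k: v for k, v in row.items() if k not in value_columns}
    return [
        {**shared, "spending_type": column, "amount": row[column]}
        for column in value_columns
        if row.get(column) not in (None, "")
    ]


# Hypothetical wide row with both figures side by side.
wide_row = {"project": "Roads", "Planned": 120, "Outturn": 95}
long_rows = split_planned_outturn(wide_row)
```

Each resulting row then carries a single amount, which is the non-aggregated shape asked for elsewhere in this thread.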
>>>>>
>>>>>> 2.       Does there need to be an ID column?
>>>>>
>>>>> No. Only include an id column if there is a natural one from the
>>>>> source data.
>>>>>
>>>>>> 3.       Should I remove duplicates, or is it easy to exclude them
>>>>>> from the aggregations? (Where duplicate=1)
>>>>>
>>>>> In general you don't want to do aggregations -- we want non-aggregated
>>>>> data. Removing duplicates is definitely good.
>>>>>
>>>>>> 4.       Should I create a common Title field?
>>>>>
>>>>> For what (for a description)? I wouldn't invent fields too much here.
>>>>>
>>>>> rufus
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Co-Founder, Open Knowledge Foundation
>>>> Promoting Open Knowledge in a Digital Age
>>>> http://www.okfn.org/ - http://blog.okfn.org/
>>>>
>>>
>>>
>>>
>>> --
>>> Lucy Chambers
>>> Community Coordinator
>>> Open Knowledge Foundation
>>> http://okfn.org/
>>> Skype: lucyfediachambers
>>>
> _______________________________________________
> wdmmg-dev mailing list
> wdmmg-dev at lists.okfn.org
> http://lists.okfn.org/mailman/listinfo/wdmmg-dev
