[wdmmg-discuss] Detailed spending data for London

Donovan Hide donovanhide at gmail.com
Wed May 19 14:52:34 UTC 2010


Hi Alistair,

Looks like the www.london.gov.uk page is merely a facade over the old
legacy.london.gov.uk page. I think they've basically got two wonky
CMSs. I'll stick with the non-legacy one. The scraper just extracts
the hrefs from the anchors anyway, so it is actually using both
CMSs.
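
For anyone following along, the href extraction can be sketched roughly
like this. It is a minimal illustration using only the Python standard
library, not the actual ScraperWiki code; the class and function names
are made up, and filtering on a ".csv" suffix is an assumption about how
the data files are linked.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AnchorHrefParser(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Relative links are resolved against the index page, so
                # links pointing at either site come out as absolute URLs.
                self.hrefs.append(urljoin(self.base_url, value))


def extract_csv_links(html, base_url):
    """Return absolute URLs of all anchors that appear to point at a CSV.

    Assumption: the data files can be recognised by a ".csv" suffix.
    """
    parser = AnchorHrefParser(base_url)
    parser.feed(html)
    return [h for h in parser.hrefs if h.lower().endswith(".csv")]
```

Because the hrefs are taken as found, it does not matter which of the
two sites actually hosts a given file.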

Rather than manually fix the dates, I think I'll try and get them to
fix the data and then rescrape it if they do. Questions of provenance
arise if you make arbitrary decisions on data that might be used to
form powerful conclusions!!! That's why I've preserved the link and
rowNumber: to prove where the data came from. Additionally, the
guessed date might be wrong - it could be perfectly reasonable that an
invoice from May 12th is paid on June 30th, but appears in the July
CSV. Which is the correct date?
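
To make the provenance idea concrete, here is a rough sketch of tagging
each row with where it came from. It is an illustration only, not the
scraper's real code: the function name is hypothetical, numbering data
rows from 1 is an assumption, and "Date" is simply the column name
mentioned above. Where a file has no Date column the date is left empty
rather than guessed.

```python
import csv
import io


def rows_with_provenance(csv_text, link):
    """Parse CSV text and attach the source link and row number to each row.

    Assumptions: the first line is a header, and rowNumber counts data
    rows starting at 1.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    records = []
    for row_number, row in enumerate(reader, start=1):
        record = dict(row)
        record["link"] = link            # which CSV the row came from
        record["rowNumber"] = row_number  # where in that CSV it sits
        # Keep the date exactly as published; None if absent or blank.
        record["Date"] = row.get("Date") or None
        records.append(record)
    return records
```

With link and rowNumber stored, any row can be traced back to the
published file, and a later rescrape can replace it cleanly if the
source data gets fixed.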

Cheers,
Donny.

On 19 May 2010 15:42, Alistair Turnbull <apt1002 at goose.minworks.co.uk> wrote:
> Great stuff, Donny.
>
> A couple of observations:
>
> 1. There are at least two different index pages for this data set:
>
>        http://legacy.london.gov.uk/gla/expenditure/index.jsp
>
>  http://www.london.gov.uk/who-runs-london/greater-london-authority/expenditure-over-1000
>
> You've used the second one, which appears to have an extra four months'
> data. That's probably correct, but it might be worth establishing which one
> is the index that's going to receive best long-term support.
>
> 2. Especially for the months without a "Date" column, it would be useful to
> record at least the month in which the spending happened. There seems to be
> no good automatic way of doing this. Looks like a manual job. :-(
>
>        Alistair
>
> On Wed, 19 May 2010, Donovan Hide wrote:
>
>> Cleaned version here:
>>
>> http://scraperwiki.com/scrapers/show/greater-london-assembly-expenditure/
>>
>> Seems like the CSVs are prepared manually at the end of the month,
>> and they are very inconsistent in their formatting!
>>
>> Cheers,
>> Donny,
>>
>> On 19 May 2010 14:22, Rufus Pollock <rufus.pollock at okfn.org> wrote:
>>>
>>> The Greater London Authority has just made detailed data on
>>> expenditure over £1000 available (via the Guardian Data Blog [1]):
>>>
>>>
>>> <http://www.london.gov.uk/who-runs-london/greater-london-authority/expenditure-over-1000>
>>>
>>> Have created a CKAN package:
>>>
>>> <http://ckan.net/package/gla-spending>
>>>
>>> And we may have a stab at loading this data into a new slice in the
>>> data store <http://data.wheredoesmymoneygo.org/>
>>>
>>> Rufus
>>>
>>>
>>> [1]:<http://www.guardian.co.uk/news/datablog/2010/may/19/greater-london-authority-spending-analysis>
>>>
>>> _______________________________________________
>>> wdmmg-discuss mailing list
>>> wdmmg-discuss at lists.okfn.org
>>> http://lists.okfn.org/mailman/listinfo/wdmmg-discuss
>>>
>>
>



