[wdmmg-dev] Making sense of the OpenSpending UK Govt 25k data dump

Martin Keegan martin.keegan at okfn.org
Sat Sep 17 12:32:35 UTC 2011

(responding on-list to get the dirty laundry in public)

Chris, thanks for the error reports.

On Fri, Sep 16, 2011 at 10:43 AM, Chris Taggart <countculture at gmail.com> wrote:

I've stuck the JSON/CSV download problem in as issue #211

> 1) It's not clear (to me) which is the OpenSpending primary ID for the
> transaction, and how to go from that to the URL of the transaction on the
> OpenSpending Page

Each transaction (an "entry" in our terminology) has an abstract ID in the
form of a hash. From the main dataset page,
http://www.openspending.org/dataset/ukgov-25k-spending , click on a "Full
entry" link, and the ID is in the URL, e.g.,
http://www.openspending.org/entry/f3ec400ee899db230d1b3a3d22e9134d3c94ff67 .
The datasets also inherently have a candidate key, but this can span
multiple dimensions and so doesn't reduce to a URL well.
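
A minimal sketch of going between an entry ID and its URL, assuming every
entry ID is a 40-character hex hash and every entry page lives under /entry/,
as in the one example above. The helper names are made up for illustration and
are not part of OpenSpending.

```python
import re

# Base of the entry pages, taken from the example URL above.
ENTRY_BASE = "http://www.openspending.org/entry/"

# Assumption: every entry ID is 40 lowercase hex characters, like the example.
ENTRY_ID_RE = re.compile(r"[0-9a-f]{40}")


def entry_url(entry_id):
    """Return the entry page URL for an entry ID (hypothetical helper)."""
    entry_id = entry_id.strip().lower()
    if not ENTRY_ID_RE.fullmatch(entry_id):
        raise ValueError("unexpected entry ID: %r" % entry_id)
    return ENTRY_BASE + entry_id


def entry_id_from_url(url):
    """Recover the entry ID from an entry page URL (hypothetical helper)."""
    # Take the last path segment and check that it looks like a hash.
    candidate = url.strip().rstrip("/").rsplit("/", 1)[-1].lower()
    if not ENTRY_ID_RE.fullmatch(candidate):
        raise ValueError("no entry ID found in: %r" % url)
    return candidate
```

Anything that is not an entry URL (the dataset page, say) raises ValueError
rather than returning a bogus ID.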

> 2) The department names are a bit of a mess. I'd like there to be some
> unique ID for the department, which ties in to the entity page for that
> department, but this doesn't seem to be there. More problematic is that
> there are many different representations of departments, e.g. 'DCMS',
> "Department for Culture Media and Sport", "Department for Culture, Media and
> Sport", "Department for culture, media and sport", "Department for Culture,
> Media & Sport", "The Department For Culture, Media, and Sport".
> I'm guessing these have been normalised for the actual site, and it's
> possible for me to do that, but again, a bit useless without a way of
> linking to the OpenSpending entity page

Ah, do we have multiple pages per entity?
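
For reference, the six spellings quoted above do collapse to a single key
under fairly crude normalisation. This is only a sketch: the function name
and the acronym table are assumptions for illustration, not how the site
actually does it, and it does nothing about linking the key to an entity page.

```python
import re

# Hypothetical acronym table; the only entry is the DCMS expansion
# implied by the examples quoted above.
ACRONYMS = {"dcms": "department for culture media and sport"}


def normalise_department(name):
    """Reduce a department name to a comparison key (hypothetical helper)."""
    key = name.strip().lower()
    key = key.replace("&", " and ")          # "Media & Sport" -> "media and sport"
    key = re.sub(r"[^a-z0-9 ]+", " ", key)   # drop commas and other punctuation
    key = re.sub(r"\s+", " ", key).strip()   # collapse runs of whitespace
    if key.startswith("the "):
        key = key[len("the "):]              # drop a leading article
    return ACRONYMS.get(key, key)            # expand known acronyms


variants = [
    "DCMS",
    "Department for Culture Media and Sport",
    "Department for Culture, Media and Sport",
    "Department for culture, media and sport",
    "Department for Culture, Media & Sport",
    "The Department For Culture, Media, and Sport",
]
keys = {normalise_department(v) for v in variants}
```

Here keys ends up holding one string for all six variants. Names that
differ by more than case, punctuation, "&" or a leading "The" would still
need a hand-maintained mapping.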


> 3) There are lots of entries which are not departments, and some entities
> that are local govt (which means we'd duplicate with the OpenlyLocal data
> we're importing).
> For example, these are some of the non-departments in the departments
> column: "OVERHEAD DIVISION", "Corporate Planning","Assets Under
> Construction", "Departmental Family", "Supplies And Procurement","Service
> Improvement", etc. (There are also null entries, which I read as meaning
> there was nobody paying the money, and a few bits of cruft, e.g. "s & Local
> Government").
> And these are the local govt ones: "BASSETLAW DISTRICT
> COUNCIL", "Gloucester City Council","Wiltshire Council", "PRESTON CITY
> It might be that this is not the best place to download the data, or that
> I'm missing something, but seems crazy for OpenCorporates to duplicate the
> OpenSpending stuff, or spend time cleaning it up and in the process making
> it impossible to link back to the correct OpenSpending entry.
> Can we speedily fix any of these probs?
> Chris
> --
> -------------------------------------------------------
> OpenCorporates :: The Open Database of the Corporate World
> http://opencorporates.com
> OpenlyLocal :: Making Local Government More Transparent
> http://openlylocal.com
> Blog: http://countculture.wordpress.com
> Twitter: http://twitter.com/CountCulture