[okfn-labs] Help clean up the UK spending data!

Thomas Kluyver takowl at gmail.com
Tue Jul 31 15:11:43 BST 2012


On 31 July 2012 14:30, Friedrich Lindenberg
<friedrich.lindenberg at okfn.org> wrote:
> e) Expense Type Code != Expense Type - try to keep code fields and
> text fields separate.

I don't know if it's easy with the platform, but showing a sample of
the data in the column would make it easier to distinguish this kind
of thing.

> I appreciate that this is an awkward process, but have come to the
> conclusion that using more automation will just give us bad data. At
> the moment, we've got around 4.3 million records extracted - with your
> help we can bring this up to 6 million.

I've gone through 20 or so, and I can't help thinking that a bit more
automation wouldn't go amiss - most of them differed from the target
only by having a space between words - "Expense type" instead of
"ExpenseType".
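
As a rough sketch of what I mean (not how the platform works; the
function names and the target list here are made up for illustration),
comparing headers after stripping case, spaces and punctuation would
catch those cases, and leave anything else for a person to look at:

```python
import re

def normalize_header(name):
    """Lower-case a header and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())

def match_headers(found, targets):
    """Map each found header to the target with the same normalized form.

    Headers with no exact normalized match map to None, so they still
    go to manual review rather than being guessed at.
    """
    by_key = {normalize_header(t): t for t in targets}
    return {h: by_key.get(normalize_header(h)) for h in found}

# Hypothetical target names and column headers, for illustration only.
targets = ["ExpenseType", "ExpenseTypeCode"]
found = ["Expense type", "Expense Type Code", "Supplier"]
mapping = match_headers(found, targets)
```

Because it only accepts exact matches after normalization, "Expense
Type Code" and "Expense Type" still end up in separate fields.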

Good luck,
Thomas


