[okfn-labs] Bad Data: real-world examples of how *not* to do data

Ivan Begtin ibegtin at gmail.com
Sun Nov 24 10:21:06 UTC 2013


Hi Rufus!

One more example of bad data.

The Russian government's tax service has published a dataset of all local
tax rates for Russian municipalities - http://nalog.ru/ru/opendata/p9/

Direct download -
http://nalog.ru/opendata/7707329152-taxrates/data-1-structure-1.zip

The data is published as one huge XML file of about 500 MB, and the whole
file is just one single line. No linefeeds at all.
So DOM parsers can't handle it, and not even every SAX parser helps.
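
For anyone who wants to try, here is a minimal sketch of incremental
parsing with Python's standard xml.etree.ElementTree.iterparse, which reads
the input in chunks rather than by line, so the missing linefeeds should not
matter to it. The element names in the sample (rates, rate, value) are made
up for illustration; the real schema of the tax rates file is not shown here,
and I have not run this against the 500 MB file itself.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def count_tags(stream):
    """Count element tags in an XML stream without keeping the whole tree."""
    counts = Counter()
    depth = 0
    root = None
    for event, elem in ET.iterparse(stream, events=("start", "end")):
        if event == "start":
            if root is None:
                root = elem  # the first "start" event is the root element
            depth += 1
        else:
            counts[elem.tag] += 1
            depth -= 1
            if depth == 1:
                # A direct child of the root is complete: detach it so
                # memory use stays flat however large the file is.
                root.clear()
    return counts


# Hypothetical single-line sample standing in for the real file.
sample = (b'<rates><rate code="1"><value>0.3</value></rate>'
          b'<rate code="2"><value>1.5</value></rate></rates>')
print(count_tags(io.BytesIO(sample)))
```

The same function should accept the file object returned by
zipfile.ZipFile(...).open(...), so the archive would not need to be unpacked
first; the name of the XML member inside the zip is something you would have
to look up.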

Best Regards,
  Ivan



2013/11/22 Rufus Pollock <rufus.pollock at okfn.org>

> Hi All,
>
> I wanted to flag a new mini-project:
>
> http://okfnlabs.org/bad-data/
>
> The idea of "Bad Data" is to provide real-world examples of how *not* to
> publish data. It showcases the poorly structured, the mis-formatted, and
> the just plain ugly.
>
> This is less about being critical and more about educating - by providing
> examples of how not to do something we can help show how to do it right.
>
> Here are a couple of the examples already up there:
>
>    - A poorly structured CSV on tube usage from London Datastore
>    <http://okfnlabs.org/bad-data/ex/tfl-passenger-numbers/>
>    - An ASCII spreadsheet (with merge cells!) from US Bureau of Labor
>    Statistics <http://okfnlabs.org/bad-data/ex/bls-us-employment/>
>
> *New examples are very welcome*, instructions on how to submit them here:
> http://okfnlabs.org/bad-data/add/
>
> Rufus
>
>
> _______________________________________________
> okfn-labs mailing list
> okfn-labs at lists.okfn.org
> http://lists.okfn.org/mailman/listinfo/okfn-labs
> Unsubscribe: http://lists.okfn.org/mailman/options/okfn-labs
>
>


-- 
Best regards,
  Ivan Begtin

Director, NP "Information Culture"
email: ibegtin at infoculture.ru
phone: +7 499 500 96 58, +7 910 426 68 83
website: http://infoculture.ru