[open-government] Examples of open data leading to increase in data quality?

Ivan Begtin ibegtin at gmail.com
Tue Aug 27 17:54:54 BST 2013


Hi Jonathan!

I know a few examples. Most of them are based on Russian domestic open data.

1. The Russian government opened up all of its procurement data, including
all government tender procedures, contracts, and customer and supplier data.
That is about 21 million XML documents: 20 GB compressed and 400 GB
uncompressed.

We processed all of this data and produced reports in 2009, 2011 and this
year, 2013. We found that about 1.3% of contracts had errors in supplier
data, about 5% had errors in contract descriptions, and so on. The most
common problems are errors in organization identification codes. Our 2011
report led to a few meetings with officials from the Russian Federal
Treasury (the state agency responsible for procurement disclosure). As a
result of these meetings, the Treasury launched its own project to monitor
data quality and fixed the most critical errors.
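
To illustrate what a check on identification codes can look like, here is a
minimal sketch. It assumes the code in question is the 10-digit taxpayer
number (INN) of a legal entity, which is an assumption on top of what is
said above, and it is not the code used for the reports. The function name
is_valid_inn10 is made up for this example.

```python
# Weights for the control digit of a 10-digit INN (legal entities).
INN10_WEIGHTS = (2, 4, 10, 3, 5, 9, 4, 6, 8)


def is_valid_inn10(code: str) -> bool:
    """Return True if `code` is 10 ASCII digits with a correct control digit."""
    if len(code) != 10 or not (code.isascii() and code.isdigit()):
        return False
    digits = [int(c) for c in code]
    # Weighted sum of the first nine digits, taken mod 11 and then mod 10,
    # must equal the tenth digit.
    control = sum(w * d for w, d in zip(INN10_WEIGHTS, digits)) % 11 % 10
    return control == digits[9]
```

A check like this only catches mistyped or truncated codes. It cannot tell
whether a well-formed code belongs to the supplier named in the contract.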

Our report is available here -
http://www.slideshare.net/ivbeg/quality-report-2011 (in Russian). If you
need it for translation, I could send you the MS Word document too.


2. This year, together with the Russian Higher School of Economics and the
Association of procurement institutions, we carried out an in-depth data
analysis and research on the issue of "blind procurement".

"Blind procurement" is a uniquely Russian issue, and it is all about the
connection between corruption and data quality.

You probably know that some spammers use "word distortion techniques" to
bypass spam filters. They replace letters with similar-looking digits, make
deliberate spelling mistakes and so on.

Some Russian _government officials_ used similar techniques to hide tender
announcements from search engines. Since Russian procurement data and
procedures are concentrated on a single website, zakupki.gov.ru, they
exploited the limitations of its search capabilities by using "word
distortion techniques".

We found about 10 tricks used to hide the data:
- replacing Cyrillic characters with Latin characters that look the same
- replacing Latin characters with Cyrillic characters in product names
written in Latin, like "Hyundai" or "Microsoft"
- unreasonable use of the dash character "-" inside words, like "The ten-der
to bu-y mi-lk" instead of "The tender to buy milk"
- a large number of spelling mistakes, up to 10 per sentence
- and so on
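
The first two tricks in the list can be detected mechanically. Below is a
minimal sketch, not the method used in the research: it flags any word that
mixes Cyrillic and Latin letters, based on Unicode character names. The
function names scripts_in and mixed_script_words are made up for this
example.

```python
import re
import unicodedata


def scripts_in(word: str) -> set:
    """Return the set of scripts (cyrillic, latin) used by letters in `word`."""
    scripts = set()
    for ch in word:
        if not ch.isalpha():
            continue
        name = unicodedata.name(ch, "")
        if name.startswith("CYRILLIC"):
            scripts.add("cyrillic")
        elif name.startswith("LATIN"):
            scripts.add("latin")
    return scripts


def mixed_script_words(text: str) -> list:
    """Return words that contain both Cyrillic and Latin letters."""
    words = re.findall(r"[^\W\d_]+", text)  # runs of letters only
    return [w for w in words if len(scripts_in(w)) > 1]


# "\u006f" is the Latin letter "o", hidden inside a Cyrillic word.
suspicious = mixed_script_words("Закупка м\u006fлока")
```

Here suspicious contains the single distorted word, while the same phrase
written entirely in Cyrillic gives an empty list. A word rewritten entirely
in the other script would not be caught this way, and the dash and
misspelling tricks need separate checks.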

After our research and report were published, Russian prosecution agencies
and procurement control agencies launched an audit of all government bodies
involved in such activities.

Actually this report is similar to the previous research that we did from
2009 to 2012, but this time we found about 12,000 cases of "word distortion
techniques", which led to this result.

More info (in Russian) here -
http://naiz.org/upload/iblock/faa/faac377dabf5757d79a599715b888f81.pdf and
here - http://xn--80abeamcuufxbhgound0h9cl.xn--p1ai/events/5508098/

3. The Moscow city government published about 170 datasets with geo
coordinates on the Moscow open data portal - http://data.mos.ru

After a quick examination, our colleagues from the Russian OpenStreetMap
community found that the data has many errors. They compiled a list of about
22 issues with the data, including errors in CSV files, wrong geo
coordinates and so on.

They published their research here (in Russian) -
http://gis-lab.info/qa/data-mos.html and quite soon most of the issues were
fixed by Moscow government officials.
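
As an illustration of the kind of check that finds wrong geo coordinates in
a CSV file, here is a minimal sketch. It is not the OpenStreetMap
community's actual method. The bounding box values, the column names "lat"
and "lon", and the function names are all assumptions made for this
example.

```python
import csv
import io

# Rough, assumed bounding box around Moscow (illustrative values only).
LAT_MIN, LAT_MAX = 55.0, 56.1
LON_MIN, LON_MAX = 36.8, 38.0


def check_point(lat: float, lon: float) -> str:
    """Classify a point as ok, swapped (lat/lon exchanged) or out_of_area."""
    if LAT_MIN <= lat <= LAT_MAX and LON_MIN <= lon <= LON_MAX:
        return "ok"
    if LAT_MIN <= lon <= LAT_MAX and LON_MIN <= lat <= LON_MAX:
        return "swapped"
    return "out_of_area"


def check_csv(text: str, lat_col: str = "lat", lon_col: str = "lon") -> list:
    """Return (line_number, problem) pairs for rows with bad coordinates."""
    problems = []
    reader = csv.DictReader(io.StringIO(text))
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        try:
            lat = float(row[lat_col])
            lon = float(row[lon_col])
        except (KeyError, TypeError, ValueError):
            problems.append((line_no, "unparseable"))
            continue
        status = check_point(lat, lon)
        if status != "ok":
            problems.append((line_no, status))
    return problems
```

Swapped latitude and longitude is checked separately because it is easy to
fix automatically, while a point far outside the city needs a manual look.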

Best Regards,
   Ivan Begtin







2013/8/27 Jonathan Gray <jonathan.gray at okfn.org>

> Hi all,
>
> I wonder if anyone has any good examples or evidence of how open data (or
> - more generally - publicly released machine readable data) has led to an
> increase in data quality?
>
> All the best,
>
> Jonathan
>
> --
>
> Jonathan Gray
>
> Director of Policy and Ideas  | *@jwyg <https://twitter.com/jwyg>*
>
> The Open Knowledge Foundation <http://okfn.org/>
> *
>
> Empowering through Open Knowledge
>
> okfn.org  |  @okfn <http://twitter.com/OKFN>  |  OKF on Facebook<https://www.facebook.com/OKFNetwork> |
> Blog <http://blog.okfn.org/>  |  Newsletter<http://okfn.org/about/newsletter>
> *
>
> _______________________________________________
> open-government mailing list
> open-government at lists.okfn.org
> http://lists.okfn.org/mailman/listinfo/open-government
> Unsubscribe: http://lists.okfn.org/mailman/options/open-government
>
>


-- 
Best regards,
  Ivan Begtin

Director, Nonprofit Partnership "Information Culture"
email: ibegtin at infoculture.ru
phone: +7 499 500 96 58, +7 910 426 68 83
website: http://infoculture.ru