[Okfn-ca] Fwd: [open-government] Examples of open data leading to increase in data quality?

Peder Jakobsen pjakobsen at gmail.com
Fri Aug 30 19:11:30 UTC 2013



On 2013-08-30, at 10:39 AM, Diane Mercier <diane.mercier at gmail.com> wrote:

> I would like to bring your attention to this thread on the quality of data as published on the list [open-government].
> 
> Ted Strauss focuses there on the importance of "cleaning" datasets to improve their quality. This is certainly a major challenge for public organizations: for forty years, a multitude of information systems has proliferated without broad standardization or openness rules. In my opinion, this is a direct and serious consequence of the use of proprietary software that we inherit today.


The important task is not cleaning the data, but extracting enough meaningful fields from or associated with those records so you can automate the creation of standard metadata to be indexed by a search engine and then delivered via an API. This is the core task of 99% of all open data work on the planet at the moment. As long as the metadata is good, those with an incentive to use the data will figure out a way to make sense of it. If they don't, they probably don't need the data all that badly (Economics 101).
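To make that concrete, here is a minimal sketch of the kind of metadata extraction I mean, in Python. The file name "raw_records.csv" and the metadata fields are illustrative only, not any particular catalogue's schema; the point is that you describe the dataset (title, fields, count, format) rather than scrub its contents.

    # Sketch only: pull a few meaningful fields out of a raw record dump and
    # emit standard, search-indexable metadata. Schema is hypothetical.
    import csv
    import json

    def extract_metadata(csv_path):
        with open(csv_path, newline="", encoding="utf-8") as f:
            rows = list(csv.DictReader(f))
        return {
            "title": csv_path.rsplit("/", 1)[-1],
            "record_count": len(rows),
            "fields": list(rows[0].keys()) if rows else [],
            "format": "text/csv",
        }

    if __name__ == "__main__":
        # e.g. hand this to a search index or serve it from an API endpoint
        print(json.dumps(extract_metadata("raw_records.csv"), indent=2))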

The vast majority of source code generated by the OKFN serves the purpose of metadata creation. Projects for cleaning data usually wither and die on the vine, because you can't spin straw into gold unless your name is Rumpelstiltskin.

Cleaning actual source data may be a task as massive as curing cancer or putting an end to global warming, but the marginal benefit of a dollar spent on such an effort is suspect, and probably unnecessary.

Peder Jakobsen
Ottawa




