[open-government] [openhouseproject] The Four "A"s of Open Government Data

David Robinson dgrobinson at gmail.com
Sun Feb 12 01:58:30 UTC 2012


This is a great point -- and I think there's a perfect A word for it:


Adaptability.


That captures the spirit of innovation that infuses so much of this work.
And if data is adaptable, it is also capable of being analyzed -- or so I
would think?

--

David Robinson
Knight Law and Media Scholar
Information Society Project
Yale Law School

JD Class of 2012
David.Robinson at Yale.edu
(202) 657-9892



On Sat, Feb 11, 2012 at 8:43 PM, Josh Tauberer <tauberer at govtrack.us> wrote:

> Last week the House Committee on House Administration (here in the U.S.)
> held a conference on legislative data and transparency. Reynold
> Schweickhardt, the committee’s director of technology policy, made an
> interesting observation at the start of the day that policy for public
> information often is framed in terms of 3 A's:
>
>    accessibility,
>    authenticity, and
>    accuracy.
>
> I thought about that over the next few hours. They are good principles.
> And yet we data geeks so often find ourselves having to start from
> scratch explaining why clean data is so important. It seems
> contradictory: if accuracy is a concept practitioners in government get,
> and if 'clean' is a type of accuracy, then there must be some
> communications failure here if we're having a hard time explaining open
> data to government agencies. (To be clear, Reynold totally gets it.)
>
>    ----------------------------------------------
>    TLDR version: Read chapter 5 of my book at:
>    http://opengovdata.io/2012-02/page/5/principles-open-government-data
>    ----------------------------------------------
>
> So I was thinking that morning, what other word do we need to add to
> those 3 As to work open data in there? At first I thought about adding
> "precision". Precision is one thing we're usually asking for when we ask
> for open data. Precision is basically granularity. Compared to, say, a
> PDF, XHTML is more granular because it is explicit about section
> boundaries and paragraphs, and it identifies where the important things
> in the document are, like names and dollar amounts. (It is more granular
> with respect to the meaning of the document, though not its pagination.)
>
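
A minimal sketch of the granularity point above, assuming a hypothetical
XHTML fragment in which payee names and dollar amounts are marked up with
class attributes. The markup, the class names, and extract_payments are
invented for illustration; they are not taken from any real House document.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML fragment: names and amounts are explicitly labeled.
XHTML = """<div xmlns="http://www.w3.org/1999/xhtml">
  <p><span class="payee">Acme Office Supply</span> received
     <span class="amount">$1,250.00</span>.</p>
  <p><span class="payee">Capitol Printing</span> received
     <span class="amount">$310.50</span>.</p>
</div>"""

NS = {"x": "http://www.w3.org/1999/xhtml"}


def extract_payments(xhtml_text):
    """Return (payee, amount) pairs from the labeled spans in each paragraph."""
    root = ET.fromstring(xhtml_text)
    payments = []
    for paragraph in root.findall("x:p", NS):
        payee = paragraph.find("x:span[@class='payee']", NS).text
        amount = paragraph.find("x:span[@class='amount']", NS).text
        # Strip the currency formatting before converting to a number.
        payments.append((payee, float(amount.replace("$", "").replace(",", ""))))
    return payments


print(extract_payments(XHTML))
```

With a PDF there is no labeled span to ask for, so the same extraction
would have to guess at positions on the page.
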
> But precision is too narrow. When Congress releases its institutional
> spending records, it does so in a PDF. That PDF has high precision ---
> it gets down practically to line items. The problem with the PDF is that
> it has low accuracy because getting it into a spreadsheet format and
> de-duping names introduces errors.
>
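
To illustrate how de-duping names can introduce errors, here is a rough
sketch with made-up name strings of the kind a PDF-to-spreadsheet
conversion might produce. The normalize and dedupe helpers are
hypothetical, not a description of how anyone actually processed the
disbursement records.

```python
def normalize(name):
    """Lowercase, drop punctuation, and collapse runs of whitespace."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(cleaned.split())


def dedupe(names):
    """Group the raw name strings by their normalized form."""
    groups = {}
    for name in names:
        groups.setdefault(normalize(name), []).append(name)
    return groups


# Made-up variants of what may or may not be the same person.
raw_names = ["SMITH, JOHN", "Smith, John", "Smith,  John ", "Smith, Jon", "Smith John A"]

for key, variants in dedupe(raw_names).items():
    print(key, variants)
```

The first three variants merge, but "Smith, Jon" and "Smith John A" stay
separate, and nothing in the text says whether they are typos or
different people. Any rule loose enough to merge them could also merge
people who really are distinct.
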
> But accuracy is already one of the three As. So what's missing here?
>
> The Association for Computing Machinery’s Recommendation on Open
> Government (February 2009) figured this out:
>
>    "Data published by the government should be in formats and approaches
>    that promote analysis and reuse of that data."
>
>    http://www.acm.org/public-policy/open-government
>
> Not only is it right, but "analysis" starts with the letter A. Plus, in
> order to do any useful analysis on large amounts of information, we need
> automation --- another A word. That is fate if I ever saw it.
>
> Proposing a whole 17 distinct principles of open government data (read the
> chapter!) might be, let's say, overwhelming in any practical situation. If
> we had to make do with just four words, maybe these will do:
>
>    accessible,
>    authentic,
>    accurate, and
>    analyzable (using automation, because data is big these days).
>
> Analyzable gives deeper meaning to the other three words. Accuracy is too
> vague alone. You can't measure accuracy in the absence of some process. In
> the computer science world, accuracy is how often something comes out
> right. I think government documents people have considered that 'something'
> to be whether a Xerox machine copies enough pixels correctly. That's not
> sufficient for analysis anymore. We can't go hiring thousands of interns to
> read all of the documents governments produce. We didn't build computers
> for nothing.
>
> With analyzable added, the meaning of accuracy is that an *automated
> computer process* will get it right. If someone says a document is accurate
> because it is a scan, I'll say that's what accurate meant in the 1960s. If
> the fourth "A" of government information is analyzable, we can redefine
> accuracy for 2012.
>
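
One way to make "an automated computer process will get it right"
measurable, sketched under assumptions: compare the values a program
extracted from a document against hand-checked values and report the
fraction that match. The accuracy function and the sample numbers are
illustrative only.

```python
def accuracy(extracted, truth):
    """Fraction of extracted values that equal the hand-checked values."""
    if not truth:
        raise ValueError("need at least one hand-checked value")
    if len(extracted) != len(truth):
        raise ValueError("extracted and truth must have the same length")
    correct = sum(1 for e, t in zip(extracted, truth) if e == t)
    return correct / len(truth)


# Illustrative data: the third amount lost its decimal point in extraction.
truth = [1250.00, 310.50, 98.20, 4000.00]
extracted = [1250.00, 310.50, 9820.0, 4000.00]

print(accuracy(extracted, truth))  # 0.75
```

By this measure a scan can be a pixel-perfect copy and still score
poorly, because what is being scored is the automated reading of the
document, not the copy itself.
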
> But if you want the full 17 principles, read the rest of the chapter,
> which tackles data quality (accuracy & precision), machine processability,
> and other concepts in more detail. There's also a case study on the House
> disbursements documents, looking at whether and how they met the 17
> principles:
>
>    http://opengovdata.io/2012-02/page/5/principles-open-government-data
>
> Thanks,
>
> - Josh Tauberer (@JoshData)
> - GovTrack.us | POPVOX.com
>
> http://razor.occams.info | www.govtrack.us | www.popvox.com
>