[open-government] Defining Open Government Data?
Jonathan Gray
jonathan.gray at okfn.org
Wed Oct 20 09:24:37 UTC 2010
On Wed, Oct 20, 2010 at 8:39 AM, Tim Davies <tim at timdavies.org.uk> wrote:
> Hello all,
>
> This is a really useful discussion. Some thoughts below...
> On the question of a definition
> I'm sceptical about the value of a solid-line definition which says some
> things are in - some things are out - when it comes to open government data.
That would be exactly what the definition would be. ;-)
Like F/OSS definitions. About baseline compliance. I think that part
of the value here would be in being able to say you must do X and Y
*at a bare minimum* to make sure your stuff (government data) is open.
> The Open Definition already provides a solid definition of a particular
> notion of openness - and at most a short FAQ on how this applies to
> government data should cover providing a sense of a gold-standard for
> something being 'formally' open.
Exactly, but it isn't yet widely adopted as a standard. I hope in
this discussion we can draw out any domain specific
points/assumptions/etc. A bit like the Panton Principles build on
opendefinition.org for science:
http://pantonprinciples.org/
> Models like the 'Five Stars of Open Linked Data'
> (http://inkdroid.org/journal/2010/06/04/the-5-stars-of-open-linked-data/)
> are far more useful in both helping people assess their current openness,
> and providing a motivational structure for making data available.
> An adapted version of the 5-stars, talking about licenses in place of
> linked-data etc., and adding a 'social openness' step at the end may be one
> route to a definition.
Hmm.. I would almost be tempted to say we should separate this into:
(i) are we talking about 'open government data' or not? (definition/standard)
(ii) are we doing this *well* -- e.g. is it connected to other
resources, are we doing 'social stuff', is there good documentation,
etc. (principles/guidelines/star ratings)
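To make that split concrete, here is a rough Python sketch. The class,
the field names and the three 'quality' checks are all made up for
illustration (the legal/technical fields follow the two components from
my first mail); none of this is an agreed standard.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    # Hypothetical fields for part (i): the pass/fail definition.
    open_license: bool          # e.g. conforms to opendefinition.org
    machine_readable: bool
    bulk_download: bool
    # Hypothetical fields for part (ii): how *well* it is published.
    documented: bool = False
    linked_to_other_data: bool = False
    feedback_channel: bool = False


def is_open_government_data(d: Dataset) -> bool:
    """Part (i): a bare-minimum, yes/no test."""
    return d.open_license and d.machine_readable and d.bulk_download


def quality_score(d: Dataset) -> int:
    """Part (ii): a separate rating that never changes the yes/no answer."""
    return sum([d.documented, d.linked_to_other_data, d.feedback_channel])
```

On this sketch a bare spreadsheet under an open license, downloadable in
bulk, passes (i) with a score of zero on (ii), while a well documented
dataset under a restrictive license scores on (ii) but still fails (i).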
> A good definition for the end-user should be able to be re-formulated into a
> set of questions, such as:
>
> (License) Is your data published under a license that allows it to be
> re-used by anyone, or placed into the public domain so there are no
> restrictions on re-use?
>
> (Format) Is your data accessible to humans and machines in a structured way?
> (As a good rule of thumb, if it's possible for a re-user to take a copy of
> your data, load it into standard software, and edit that copy easily, it's
> machine-readable).
>
> (Social) Have you worked to ensure that citizens and other potential
> re-users of your data have access to the additional information, tools and
> resources that they would need to make effective use of your data?
>
> Social openness
> The last point there is my attempt at some sort of social openness clause.
> Clearly it isn't unambiguous (what's 'effective', or enough effort in
> 'working to ensure'?) but it tries to capture what might be the steps
> governments (and wider communities) are encouraged to take to ensure the
> data is usable and used in practice.
> The practical openness of any dataset is not a property only of that
> dataset, but also of the tools-chains available; access to knowledge and
> skills; access to meta-data; etc. - and government clearly has a role to
> play in promoting access to and development of those resources - but the
> responsibility is shared with civil society / citizens / communities /
> business.
Fully agree that social stuff is important -- this is mostly what I work on. ;-)
At the same time I can't help thinking it might be useful to pick out
properties of the data for certain practical purposes. If government
doesn't do certain things (adding correct metadata, connecting with other
data sources, allowing commenting, ...) then others (like mySociety,
OpenlyLocal, Sunlight Foundation) can possibly do it. If it's not under
an open license, then legally speaking, no-one can do it. If a dataset
is in a weird format but under an open license, then someone can
convert it to something more useful (like numerous people have done
with COINS or Eurostat).
I guess the main danger of having a definition that is too basic
and sparse is that we lower expectations -- and people think once they
comply with the definition (e.g. once they have done the legal
and technical stuff) that is enough, and they don't need to do
anything else. I guess my feeling is that this is a question of
strategy, and we should work to build a culture (in advocacy, in
policy making) where it's clear that this often isn't enough (e.g. with
talks, manuals, guidance, websites, ...). This also applies to things
like funding prototypes, to having data registries and so on. There
isn't a universally applicable recipe to getting things right, but we
can certainly provide guidance and instruction based on
evidence/experience from around the world.
I think the main danger of not having a definition is that people will
start applying the term 'open government data' to material with
restrictive terms of use, or to services where material is only
available via an API (e.g. with a limited number of queries, or onerous
contractual obligations or registration procedures).
Finding a path (or at least plotting several possible paths) between
Scylla and Charybdis is exactly what I'd like to try to achieve in
this discussion. ;-)
> The one point in here which might be slightly separate, around providing
> 'additional information' (in practice, meta-data and guides/handbooks
> etc.).
> Would a separate meta-data term of the 'definition' be useful?
> Different sorts of openness: commercial and civic?
> I'm sure it's a debate that's been over many times, and one it seems OKF
> have a fairly settled position on - but I do think it's worth the
> distinction between: 'civic openness' and 'commercial openness' being made -
> particularly for the broadest possible use of a definition.
> If a government does not wish to make data available for commercial re-use,
> but accepts free access to machine-readable data for citizens to use in
> non-commercial ways - that has significant potential benefits for democracy
> - and should be recognised as an open data policy; albeit only providing
> 'civic/democratic openness' and clearly shown to fail on 'commercial
> openness'.
Hmm, I'm *really* not convinced that we should call material released
under non-commercial licenses 'open government data'. That doesn't
mean it isn't valuable or important!
What do we gain by counting NC stuff as open government data?
Recognition for civil servants who have run up against
'insurmountable' internal barriers? What do we lose? It's much harder
to convince people to move to non-NC licenses from NC licenses if they
are convinced that both are fully open. This is really important for
an interoperable data commons (e.g. combining things with Wikipedia,
OpenStreetMap, etc). Otherwise we have a two-tier ecosystem: one
tier for commercial operators and one for everyone else. Also, if
companies have to pay for data licenses for 'open government data', are
they going to be inclined to share back their modifications with
everyone else? I suspect this could have some pretty undesirable
consequences for the open data 'ecosystem' down the line.
> Of course - this moves from single unified definition more towards
> 'framework' - but, as above, my sense is that definitional frameworks,
> rather than exclusionary definitions, are a better route to go...
> Other points
> One other point that might have a place in a framework would be around
> 'Making Connections'. Perhaps (connected) is the top of a five-stars of open
> government data?
>
> (Connecting Data to Information) When you publish information (charts /
> tables / reports) based on your data, do you provide a clear link back to
> the original data, and any other information re-users would need to
> understand how the information was generated?
>
> (Drawing
> on http://practicalparticipation.co.uk/odi/report/2010/2-3-data-and-information/)
>
> (Connecting Data) Do you use linked-data approaches to make connections
> between your data and other datasets?
>
> Hope these are useful inputs...
Yes -- finding this discussion really useful. In particular we should
articulate why we want a definition in the first place. Added notes
for a preamble:
http://opengovernmentdata.okfnpad.org/definition
> All the best
> Tim
> -
>
> +44 (0)7834 856 303
> @timdavies
> http://www.timdavies.org.uk
>
>
> On Tue, Oct 19, 2010 at 9:32 PM, Jonathan Gray <jonathan.gray at okfn.org>
> wrote:
>>
>> On Tue, Oct 19, 2010 at 6:49 PM, Ton Zijlstra <ton.zijlstra at gmail.com>
>> wrote:
>> > I agree with keeping things simple.
>> > However, a minimalistic way of adding some 'social open' notions could
>> > be
>> > enough for now:
>> >
>> > findability (such as datasets described in a way that my average self
>> > can
>> > find it, without learning Dept X particular lingo)
>> > such as having a contactperson and e-mail address mentioned with a
>> > dataset
>> > e.g.,
>> > a way of giving feedback on data sets etc,
>> > showing contextual provenance other than 'Dept X published this' and
>> > more
>> > along the lines: this was collected for task x by body y, and used in z
>> > way,
>> > and things like when it will be next updated.
>>
>> Okay -- I agree that this is useful. Let's try and formulate it. Say a
>> local government body puts a spreadsheet (tick: machine readable,
>> technically open) online on their website at <some.gov.xa/data> under
>> CC0 (tick: public domain, legally open). It is *nearly* there -- but
>> how do we know whether the material they've uploaded is open
>> government data? Do they need to do one of a short list of things to
>> make sure it's socially open? All of a short list of things? Is their
>> URL enough? Say they put a news item on their press section? Or Tweet
>> it? Do they need to have an event? Should they solicit for feedback?
>> Bear in mind we are focusing on open government *data* rather than
>> open *government*, per se. How can we capture the social openness in a
>> sentence or two, and ensure that it is clear enough that a non-expert
>> could apply the rule in a majority of cases (like a dataset being
>> machine readable or not, or a license being open or not)?
>>
>> > none of those are tech-aspects or legal aspects, but important
>> > nonetheless
>> > to render a data set useful.
>> > the whole 'stay in touch with all your stakeholders' 'community
>> > building'
>> > 'being a platform for re-users' can be part of the natural growth path
>> > on
>> > top of the minimalistic definitions.
>> >>Also are we saying that governments should do social stuff on PSB
>> >>websites
>> >
>> > My answer would be yes. It's called interacting with citizens, and a
>> > primary
>> > ingredient of having a public sphere at all. I'd say 'doing social
>> > stuff' is
>> > a core task of gov :)
>>
>> Yes indeed! Sorry should have said: is it essential for *open
>> government data* that government does social stuff. I.e. should PSIH's
>> be required to 'do social stuff' *in order* for their data to be
>> considered open? A very different question from should governments 'do
>> social stuff' full stop. ;-)
>>
>> > Also indications are pretty strong that it's the 'socially open' aspects
>> > that ultimately drive the adoption of re-use.
>>
>> I think it can really depend on the context. E.g. in the UK there was
>> a flourishing civic hacker community *before* the Cabinet Office
>> started funding hackdays, or, indeed, before it launched data.gov.uk.
>> Folks had to look a lot harder for the data in those days -- but the
>> absence of social openness wasn't necessarily the main blocker. I know
>> that in several countries where open government data isn't on the
>> agenda at all, there are communities who are keen to get hold of
>> certain datasets. I'd be interested in hearing more anecdotes about
>> this but I get the impression that in many cases prospective reusers
>> know what they are looking for, and the key thing is getting it under
>> an open license in a form which isn't unusable (e.g. PDF, weird legacy
>> database, ...).
>>
>> Of course if governments who don't have a flourishing (prospective)
>> re-user community already want to see results fast, they may do stuff
>> to catalyse uptake, or increase impact of opening up. The question is
>> do we want to *require* this in a definition of open government data?
>> Could this not be setting the bar quite high for, e.g. some countries
>> where governments may have very limited budget?
>>
>> > As well as it seems the way to
>> > take away unarticulated fears of data holders.
>> > These data sets become objects of sociality, creating and sustaining
>> > conversations with and around gov. To not make sure there's a conduit
>> > for
>> > that interaction is setting it up to fail. As the example of opening
>> > landownership data in Bangladesh shows us.
>>
>> Indeed -- but my impression is that this is not necessarily something
>> that an email address, feedback form or data catalogue would fix. But
>> point taken. ;-)
>>
>> > All in all, I think 'social stuff' is key.
>> > It may very well be that part of the resulting interaction need not be
>> > connected to a singular dataset but rather to a corpus of datasets, such
>> > as
>> > a data catalogue.
>> > Maybe my point is that if you posit this as a technology or legal driven
>> > thing only, gov's will miss why it's important and that will make the
>> > open
>> > definition become self-defeating to a certain extent.
>>
>> *Absolutely* agree that social stuff is important. My question is
>> whether this should be dealt with in our minimalist bare-bones
>> definition, or in ancillary material. E.g. on opengovernmentdata.org,
>> the open data manual, etc. I feel I could be persuaded either way and
>> would love to hear what other folks think!
>>
>> As an analogy: what of the free/open source software approach
>> (collaborative development, methodology, etc) can you find in the
>> free/open source definition?
>>
>> All edits/comments most welcome! ;-)
>>
>> http://opengovernmentdata.okfnpad.org/definition
>>
>> All the best,
>>
>> Jonathan
>>
>> > best,
>> > Ton
>> > On Tue, Oct 19, 2010 at 7:29 PM, Jonathan Gray <jonathan.gray at okfn.org>
>> > wrote:
>> >>
>> >> Yes agree this is very important, and we wrote about aspects of this
>> >> in several recent reports [1].
>> >>
>> >> However, I strongly feel that for present purposes the definition
>> >> should be (i) *very very* simple (as easy as possible to determine
>> >> compliance) and (ii) unambiguous to evaluate. How would one determine
>> >> if something is socially open? Would it be clear cut in every case?
>> >> Also thinking of free/open source software definitions do we perhaps
>> >> want to separate between subject matter (data) and surrounding
>> >> processes (how it is published, social openness) for purposes of a
>> >> definition, even though both are important?
>> >>
>> >> Also are we saying that governments should do social stuff on PSB
>> >> websites, or do we also want to enable and encourage innovation from
>> >> outside government? A major point in Tom Steinberg/Ed Mayo's excellent
>> >> Power of Information report [2].
>> >>
>> >> Jonathan
>> >>
>> >> [1] cf. e.g. http://writetoreply.org/beyondaccess/4-1-discoverability/
>> >> and http://www.unlockingaid.info/3/
>> >> [2] http://www.opsi.gov.uk/advice/poi/power-of-information-review.pdf
>> >>
>> >> On Tue, Oct 19, 2010 at 6:16 PM, Ton Zijlstra <ton.zijlstra at gmail.com>
>> >> wrote:
>> >> > Hi Jonathan,
>> >> > Maybe we can add a component 'socially open' as well? Just this week
>> >> > I
>> >> > saw
>> >> > the results of a study about municipal websites in the Netherlands,
>> >> > that
>> >> > had
>> >> > as a result that while information and service were nominally
>> >> > available
>> >> > as
>> >> > the law dictates, it was all very well hidden deep in websites to the
>> >> > point
>> >> > of uselessness. No 'social openness' in short, as in findable,
>> >> > connected
>> >> > to
>> >> > contexts etc., and absence of dialogue with re-users, feedback
>> >> > possibilities
>> >> > for re-users towards PSB's etc.
>> >> > Those three components, legally open, technically open, socially open
>> >> > were
>> >> > also the components that floated to the foreground while we were
>> >> > writing
>> >> > on
>> >> > the Open Data Manual in Berlin earlier this month.
>> >> > best,
>> >> > Ton
>> >> > -------------------------------------------
>> >> > Interdependent Thoughts
>> >> > Ton Zijlstra
>> >> >
>> >> > ton at tonzijlstra.eu
>> >> > +31-6-34489360
>> >> >
>> >> > http://zylstra.org/blog
>> >> > -------------------------------------------
>> >> >
>> >> >
>> >> > On Tue, Oct 19, 2010 at 7:00 PM, Jonathan Gray
>> >> > <jonathan.gray at okfn.org>
>> >> > wrote:
>> >> >>
>> >> >> We'd like to start a process to encourage key stakeholders in the
>> >> >> (rapidly growing!) world of open government data to have some
>> >> >> consensus on what 'open government data' means. This would be a
>> >> >> 'bare
>> >> >> minimum' that would need to be complied with in order to be called
>> >> >> OGD, not a wish list in an ideal world in perfect conditions.
>> >> >>
>> >> >> We already have several sets of principles [1], but many of these
>> >> >> are
>> >> >> quite jurisdiction specific -- e.g. according to 8 principles the
>> >> >> Australian, New Zealand and UK governments don't have any open
>> >> >> government data as it isn't 'license free', and the UK principles
>> >> >> are
>> >> >> clearly only intended for the UK (and it would be good not to have a
>> >> >> different set of standards for each country!).
>> >> >>
>> >> >> We'd like something *really* simple that we can start to try to
>> >> >> build
>> >> >> consensus around. Hence I'd like to start discussion around a basic
>> >> >> definition/standard that we can all start to encourage the adoption
>> >> >> of, to distinguish open government data from e.g. a bunch of PDFs
>> >> >> published on a website with no information about reuse, or an API
>> >> >> with
>> >> >> restrictive terms of use.
>> >> >>
>> >> >> I envisage this as having two key components:
>> >> >>
>> >> >> (i) legally open (as in opendefinition.org)
>> >> >> (ii) technically open (i.e. machine readable, available to download
>> >> >> in
>> >> >> bulk)
>> >> >>
>> >> >> (i) would be to make sure that we don't start calling stuff 'open
>> >> >> government data' which:
>> >> >>
>> >> >> * doesn't explicitly let the public reuse it for any purpose
>> >> >> (whether as a result of national copyright law, or departmental
>> >> >> policy)
>> >> >> * doesn't permit derivative works
>> >> >> * doesn't permit commercial reuse
>> >> >>
>> >> >> (ii) would be to make sure that material is not *only*:
>> >> >>
>> >> >> * available via an API
>> >> >> * available in non-machine readable formats, where machine readable
>> >> >> copies exist
>> >> >>
>> >> >> I've started a draft along these lines at:
>> >> >>
>> >> >> http://opengovernmentdata.okfnpad.org/definition
>> >> >>
>> >> >> Any input/comments would be very much appreciated! We'd ideally like
>> >> >> something ready at or just before Open Government Data Camp in
>> >> >> London!
>> >> >>
>> >> >> http://opengovernmentdata.org/camp2010/
>> >> >>
>> >> >> All the best,
>> >> >>
>> >> >> Jonathan
>> >> >>
>> >> >> [1]
>> >> >> http://sunlightfoundation.com/policy/documents/ten-open-data-principles/
>> >> >> http://resource.org/8_principles.html
>> >> >> http://razor.occams.info/pubdocs/opendataciviccapital.html
>> >> >> http://blog.okfn.org/2010/06/28/new-uk-transparency-board-and-public-data-principles/
>> >> >>
>> >> >> --
>> >> >> Jonathan Gray
>> >> >>
>> >> >> Community Coordinator
>> >> >> The Open Knowledge Foundation
>> >> >> http://blog.okfn.org
>> >> >>
>> >> >> http://twitter.com/jwyg
>> >> >> http://identi.ca/jwyg
>> >> >>
>> >> >> _______________________________________________
>> >> >> open-government mailing list
>> >> >> open-government at lists.okfn.org
>> >> >> http://lists.okfn.org/mailman/listinfo/open-government
--
Jonathan Gray
Community Coordinator
The Open Knowledge Foundation
http://blog.okfn.org
http://twitter.com/jwyg
http://identi.ca/jwyg