[open-science] open-science Digest, Vol 5, Issue 1

Mags McGeever mags.mcgeever at ed.ac.uk
Wed Feb 4 12:37:26 GMT 2009


Hiya Rufus,

Thanks a lot for posting this.  I don't have time to look at it this 
week but I will make sure I do so shortly.  I for one would be 
interested in seeing the 'Facts and Databases'  and  'Comments on the 
Science Commons Protocol' appendices you mentioned.

Warm regards,
Mags

-- 
Mags McGeever, Legal Services Associate
Digital Curation Centre/SCRIPT Centre
University of Edinburgh
Old College, South Bridge
Edinburgh, EH8 9YL
Tel: 00 44 (0)131 651 3836  Fax: 00 44 (0)131 650 6317

My working days are Monday, Tuesday and Wednesday.  Should your email be 
sent on a Thursday or Friday I will receive it on the following Monday.

Interested in the legal aspects of digital curation?  Visit the DCC 
Blawg at http://dccblawg.blogspot.com/

From time to time this email may contain legal information but this
should not be construed as legal advice (the term has a legally 
significant meaning).

open-science-request at lists.okfn.org wrote:
> Send open-science mailing list submissions to
> 	open-science at lists.okfn.org
> 
> To subscribe or unsubscribe via the World Wide Web, visit
> 	http://lists.okfn.org/cgi-bin/mailman/listinfo/open-science
> or, via email, send a message with subject or body 'help' to
> 	open-science-request at lists.okfn.org
> 
> You can reach the person managing the list at
> 	open-science-owner at lists.okfn.org
> 
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of open-science digest..."
> 
> 
> Today's Topics:
> 
>    1. Openness and Licensing of (Open) Data (Rufus Pollock)
> 
> 
> ----------------------------------------------------------------------
> 
> Message: 1
> Date: Wed, 4 Feb 2009 12:01:49 +0000
> From: Rufus Pollock <rufus.pollock at okfn.org>
> Subject: [open-science] Openness and Licensing of (Open) Data
> To: Neylon <cameron.neylon at stfc.ac.uk>, Peter Murray-Rust
> 	<pm286 at cam.ac.uk>
> Cc: open-science at lists.okfn.org
> Message-ID:
> 	<b68e90820902040401o4cadc9d7m18a52d197e597edd at mail.gmail.com>
> Content-Type: text/plain; charset=UTF-8
> 
> Dear Cameron, Peter and others,
> 
> I've been planning to write a full piece about 'openness' and
> licensing in relation to data for a while. Prompted by recent
> discussions with John Wilbanks and then the post of Michael Nielsen on
> "The role of open licensing in open science" [1] I've put together a
> brief  'analysis' that I think might be useful.
> 
> The full analysis can be found below or, for those who prefer reading
> websites rather than lists, the text is also posted up at:
> 
> <http://blog.okfn.org/2009/02/02/open-data-openness-and-licensing/>
> 
> This is obviously an important debate for 'open science' and I'd be
> really interested in the views of others. I know that some of the
> points made in this piece aren't shared by everyone -- for example, I
> know John Wilbanks is not a fan of the license approach (or of
> share-alike conditions) :)
> 
> Lastly there are two appendices for this essay which I've left out in
> the interests of length. One is on 'Facts and Databases' the other is
> 'Comments on the Science Commons Protocol'. I'd be happy to send them
> along if anyone wants them.
> 
> Regards,
> 
> Rufus
> 
> [1]:http://michaelnielsen.org/blog/?p=540
> 
> # Openness and Licensing of (Open) Data
> 
> ## Why does this matter?
> 
> Why bother about openness and licensing for data? After all they don't
> matter in themselves: what we really care about are things like the
> progress of human knowledge or the freedom to understand and share.
> 
> However, open data is crucial to progress on these more fundamental
> items. It's crucial because open data is so much easier to break up
> and recombine, to use and reuse. We therefore want people to have
> incentives to make their data open and for open data to be easily
> usable and reusable -- i.e. for open data to form a 'commons'.
> 
> A good definition of openness acts as a standard that ensures
> different open datasets are 'interoperable' and therefore do form a
> commons. Licensing is important because it reduces uncertainty.
> Without a license you don't know where you, as a user, stand: when are
> you allowed to use this data? Are you allowed to give it to others? To
> distribute your own changes, etc?
> 
> Together, a definition of openness, plus a set of conformant licenses
> deliver clarity and simplicity. Not only is interoperability ensured
> but people can know at a glance, and without having to go through a
> whole lot of legalese, what they are free to do. (For more see [this
> article][why] and [this post][explicit]).
> 
> **Thus, licensing and definitions are important even though they are
> only a small part of the overall picture. If we get them wrong they
> will keep on getting in the way of everything else. If we get them
> right we can stop worrying about them and focus our full energies on
> other things.**
> 
> [explicit]:http://blog.okfn.org/2006/08/08/dead-knowledge-why-being-explicit-about-openness-matters/
> 
> [why]:http://sciencecommons.org/weblog/archives/2008/08/18/voices-from-the-future-of-science-rufus-pollock-of-the-open-knowledge-foundation/
> 
> 
> ## Background
> 
> Over the last couple of years there has been substantial discussion
> about the licensing (or not) of (open) data and what 'open' should
> mean. In this debate there are two distinct, but related, strands:
> 
>   1. Some people have argued that licensing is inappropriate (or
> unnecessary) for data.
>   2. Disagreement about what 'open' should mean. Specifically: does
> openness allow for attribution and share-alike 'requirements' or
> should 'open' data mean 'public domain' data?
> 
> These points are related because arguments for the inappropriateness
> of licensing data usually go along the lines: data equates to facts
> over which no monopoly IP rights can or should be granted; as such all
> data is automatically in the public domain and hence there is nothing
> to license (and, worse, 'licensing' amounts to an attempt to 'enclose'
> the public domain).
> 
> However, even those who think that open data can/should only be public
> domain data still agree that it is reasonable and/or necessary to have
> some set of community 'rules' or 'norms' governing usage of data.
> Therefore, the question of what requirements should be allowed for
> 'open' data is a common one, whatever one's stance on the PD question.
> 
> Of course, even with agreement on requirements, there is still the
> question of whether these should be 'enforced' through a license or
> via community norms. To summarize, the three main questions are:
> 
> **Qu 1: Is it important to license?**
> 
> **Qu 2: What 'restrictive' requirements are compatible with openness?
> In particular does 'open' equate to PD only or are attribution and
> share-alike 'requirements' permitted?**
> 
> **Qu 3: Community norms or licenses? Should 'community norms' or
> license terms be used in order to encode requirements such as
> attribution and share-alike?**
> 
> Below I look at each of these in turn, laying out, as I see it, the
> current consensus and expressing my own view.
> 
> 
> ## Question 1: Is it Important to License?
> 
> The simple answer here is yes. Whether one likes it or not there are a
> whole bunch of jurisdictions where there are IP rights in data(bases).
> Note that this does **not** imply any monopoly rights in any facts
> that data represents.
> 
> Thus, even if you just want your data to be in the 'public domain',
> you need to apply a license -- or something very closely resembling a
> license. (A suitable example is the Open Data Commons [Public Domain
> Dedication and License][pddl]).
> 
> [pddl]:http://www.opendatacommons.org/odc-public-domain-dedication-and-licence/
> 
> ## Question 2: What Should Openness Allow?
> 
> Despite the sometimes heated discussion, there is, in fact, broad
> agreement: openness means freedom to use and reuse data in any way you
> wish. The only debate is over what, if any, conditions can be imposed
> when allowing use and reuse. In particular, following the example of
> the software and content domains, the following two items have been
> proposed as permissible exceptions to the basic rule of 'allow
> everything':
> 
>   1. Requirement of attribution (in a non-burdensome manner)
>   2. Requirement to share-alike (a reuser of share-alike material
> must, when making publicly available their own material, make it
> openly available under a similar share-alike license)
> 
> ### Attribution
> 
> Everyone agrees that requiring attribution is OK. Furthermore, it is
> also now generally accepted that having this requirement in a license
> is not a problem.
> 
> (In the original [Protocol for Implementing Open Access
> Data][protocol] attribution was alleged to be problematic due to a
> potential for 'attribution stacking'. However, these concerns appear
> to have been allayed. To my mind, it was never clear why data needed
> to be different: code and content both have plenty of examples of
> projects with many contributors, much reuse *and* an attribution
> requirement).
> 
> ### Share-Alike
> 
> Share-alike provisions are more controversial. It has been argued that
> share-alike conditions are problematic because of the potential for
> incompatibility between two share-alike licenses (or community norms).
> At the same time share-alike may provide an important incentive for
> individuals and communities to make their data openly available since
> it provides some assurance that this data will remain open. Thus, any
> evaluation comes down to the balance between:
> 
> 1. The costs, if any, of allowing share-alike in terms of e.g.
> complexity and compatibility.
> 
> 2. The benefits, if any, that share-alike provides by encouraging the
> creation of open data in the first place and in ensuring subsequent
> 'sharing back' by those who build upon that data.
> 
> In my view the benefits are substantial while the costs are not.
> Incompatibility can largely be avoided by only 'approving' share-alike
> licenses that are compatible. At the same time, share-alike enshrines
> a principle that is important to many communities in the code and
> content spheres and the same seems true of data (consider e.g. Open
> Street Map).
> 
> (Aside: it is important to emphasize that permitting share-alike does
> not mean it must be used. In fact, a particular community could
> recommend against using share-alike as, for example, the Python
> community does for code hoping to make it into its standard library.)
> 
> ## Question 3: Licenses versus Community Norms
> 
> Even if a basic license is used it can be argued that any
> 'requirements' for attribution or share-alike should not be in a
> license but in 'community norms'. So which is best?
> 
> In my view, when making available data, licenses are much better than
> community norms. Why?
> 
>   1. A license is always needed even if you are taking a PD approach.
> So 'norms' don't obviate the need to license.
>   2. A license is able to encode 'norms' both formally and informally
> (for example, in a preamble -- cf. the GPL).
>   3. A license is likely to elicit at least as much, and almost
> certainly more, conformity with its provisions than community norms.
> This is especially true outside of the community. The future is likely
> to see a much more mixed data landscape whether in science or
> elsewhere with many 'non-community' (non-academic) users among
> businesses and ordinary citizens. (Note also that for these groups the
> simplicity and formality of a license makes it superior to 'norms' in
> almost every respect -- transparency, certainty etc.)
>     * If there are concerns that, in some jurisdictions, the absence
> of 'data' rights make e.g. share-alike provisions unenforceable
> nothing is lost by using a license: the license de facto reverts to
> the status of a community norm and any concerns regarding "false
> expectations" can easily be dealt with by a simple warning.
> 
> **Flexibility:** some have argued that 'norms' are more 'flexible'
> than licenses. I'm not clear what this really means:
> 
>   * Flexible = not enforceable. Perhaps true but I am unclear why this
> is an advantage (even to a user it is easy to comply with the open
> license)
>   * Flexible = leeway around the edges. For example I won't get in
> trouble if I don't attribute quite right. But this is true of licenses
> too: it is very unlikely anyone gets sued for a minor error in
> attribution and even with share-alike no court is likely to award
> damages for a mistake made in good faith -- especially if it can be
> easily corrected.
>   * Flexible = fuzzy. Fuzziness does not seem an attractive property
> when sharing data -- both sharer and sharee want clarity.
>   * Flexible = easily changed. Allowing major changes is a serious
> problem both for licensors and licensees (certainty and clarity would
> disappear). For minor changes licenses are just as good.
> 
> Thus, in every respect I can think of, licenses are superior to
> community norms when making available open data.
> 
> 
> ## Conclusion
> 
> Summarizing the conclusions from the above discussion we have:
> 
> Qu 0: Does this matter?
> 
> **Yes.** A good definition of openness and the use of some form of
> licensing is crucial to a healthy future for the open data community
> (and that will be pretty much everyone ...).
> 
> Qu 1: Is it important to license?
> 
> Ans: **A 'license' is always necessary** -- even if you advocate a
> PD-only approach. There is too much variation (and uncertainty) about
> what the IP situation is across the world to just go with the default.
> All providers of data should apply some kind of license or PD
> dedication.
> 
> Qu 2: What 'restrictive' requirements are compatible with openness? In
> particular does 'open' equate to PD only or are attribution and
> share-alike 'requirements' permitted?
> 
> Ans: **Both attribution and share-alike should be permitted.**
> Attribution is widely agreed to be acceptable. The second,
> 'share-alike', is more controversial, but in my view should be allowed:
> there is no reason to break with the precedent set in code and content
> domains and its benefits seem substantial while costs are minimal if
> licenses are correctly managed.
> 
> Qu 3: Community norms or licenses?
> 
> Ans: **Use licenses when making available data.** Licenses provide all
> the benefits of community norms in terms of explicitly encoding the
> preferences of a community. At the same time they deliver greater
> clarity and transparency, and, in many jurisdictions, provide a legal
> enforceability which norms do not with regard to requirements of
> attribution or share-alike.
> 
> ## Colophon
> 
> This essay comes out of ongoing discussions over the last few years
> with a large assortment of communities and individuals. The primary
> motivation for sitting down and pulling the threads together came out
> of reading [Michael Nielsen's post on The role of open licensing in
> open science][nielsen] (+ [thread][nielsen-thread]) and recent emails
> with [John Wilbanks of Science Commons][sc] on the [Open
> Definition][od] coord list.
> 
> [nielsen]:http://michaelnielsen.org/blog/?p=540
> [nielsen-thread]:http://lists.okfn.org/pipermail/okfn-discuss/2009-January/001206.html
> [sc]:http://sciencecommons.org/
> 
> Related work and earlier discussion on this matter include:
> 
>   * The [Open Definition][od]
>   * The [Protocol for Implementing Open Access Data][protocol]
>   * The [Guide to Open Data Licensing][guide]
>   * [Open Data Discussion on SPARC Open Data List (2006)][2007]
>   * [Copyright Not Applicable to Geodata Post (2007)][geodata] (+
> [associated][geodata-l] [threads][geodata-l2])
>   * The [Open Data Commons][odc]
>   * [CCZero license][cczero]
> 
> [2007-01]:https://mx2.arl.org/Lists/SPARC-OpenData/Message/100.html
> [2007-02]:https://mx2.arl.org/Lists/SPARC-OpenData/Message/101.html
> [2007]:http://blog.okfn.org/2007/01/14/open-data-discussion-on-sparc-list/
> [od]:http://www.opendefinition.org/
> [geodata]:http://blog.okfn.org/2007/04/01/copyright-not-applicable-to-geodata/
> [geodata-l]:http://lists.okfn.org/pipermail/okfn-discuss/2007-April/000389.html
> [geodata-l2]:http://lists.okfn.org/pipermail/okfn-discuss/2007-April/000392.html
> [guide]:http://www.okfn.org/wiki/OpenDataLicensing
> [protocol]:http://sciencecommons.org/projects/publishing/open-access-data-protocol/
> [odc]:http://www.opendatacommons.org/
> [cczero]:http://creativecommons.org/projects/cczero
> 
> 
> 
> ------------------------------
> 
> _______________________________________________
> open-science mailing list
> open-science at lists.okfn.org
> http://lists.okfn.org/cgi-bin/mailman/listinfo/open-science
> 
> 
> End of open-science Digest, Vol 5, Issue 1
> ******************************************
> 


The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



