[open-science] Openness and Licensing of (Open) Data

Rufus Pollock rufus.pollock at okfn.org
Mon Feb 9 15:47:34 UTC 2009


2009/2/6 Neylon <cameron.neylon at stfc.ac.uk>:
> Yishay, I am afraid it isn't short but I've tried to summarise:
> http://blog.openwetware.org/scienceintheopen/2009/02/06/licenses-and-protocols-for-open-science-the-debate-continues/

Really nice summary, Cameron (and I probably should have responded
directly here rather than to your earlier emails, but my gmail threading
seems to have gone a little awry -- my apologies for overburdening
the list).

[snip]

> Where there is disagreement is over what form this should take. Rufus
> Pollock started by giving the reasons why this should be a formal license.
> Rufus believes that a license provides certainty and clarity in a way that a
> protocol, statement of principles, or expression of community standards can
> not.  I, along with Bill Hooker and John Wilbanks [links are to posts on

Let's be really clear: everyone will need to apply a license (or
something very close) even if they are just making stuff PD (because
they need to formally waive whatever rights they have).

The remaining question is: will attribution and share-alike (if
included at all) be in licenses or in norms?

[snip]

> Scientific data has a history of being assumed to be in public domain (see
> the lack of any license at PDB or Genbank or most other databases) so there
> isn't the same sense of pushing back from an existing strong IP or licensing
> regime. However I think there is broad agreement that this protocol or

Is this really the case? Talking with e.g. Peter Murray-Rust about
what goes on in Chemistry, it seems to me there is a fair amount of
keeping data proprietary (and charging for it).

I'd also ask: isn't it likely that the future is going to see data
combined from a wider variety of areas? (Isn't that one of the reasons
we're having this debate?) If so, then maybe scientists need to think
not just about 'scientific' data but also about geodata, weather
data, etc. Many of these areas are already heavily commercial (and
usually not open).

It's true that Bioinformatics seems pretty advanced on the openness
front, but we should note that this didn't happen by default: it took
some pretty serious, and explicit, effort (I've seen Tim Hubbard present
a few times here on the race between Celera and the Human Genome
Project).

> statement would look a lot like a license and would aim to have the legal
> effect of at least providing clarity over the rights of users to copy,
> re-purpose, and fork the objects in question.

So what happens when you get incompatibility of 'norms'? Does
'flexibility' mean I can just ignore the norms as I see fit?

[snip]

> This is a real area of contention I think because some of us (including me)
> would see this in quite a positive light (data being used effectively in a
> commercial setting is better than it not being used at all) as long as the
> data is still both legally and technically in the public domain. Indeed this

Let's be clear: 'open' data must permit commercial use. Non-commercial
restrictions, whether for content, code or data, make the associated
material *not open*. See item 8 of:

<http://www.opendefinition.org/1.0>

I, for one, think it is really important to the long term viability of
the open data community, whether in science or other domains, that
business can get involved.

> is at the core of the power of a public domain declaration. The issue of
> finding the resources that support the preservation of research objects in
> the (accessible) public domain is a separate one but in my view if we don't
> embrace the idea that money can and should be made off data placed in the
> public domain then we are going to be in big trouble sooner or later because
> the money will simply run out.

I couldn't agree more, though I am not sure how this relates to the PD
or licensing issue. Business has got involved with share-alike (GPL'd)
software (cf. Linux), and it seems to me that the most crucial thing
for business will be clarity and certainty, which may, in fact, be
easier to supply with licenses than with norms.

> On the flip side of the argument is a strong tradition of arguing that viral
> licensing and share alike provisions protect the rights and personal
> investment of individuals and small players against larger commercial
> entities. Many of the people who support open data belong to this tradition,
> often for very good historical reasons. I personally don't disagree with the
> argument on a logical level, but I think for scientific data we need to
> provide clear paths for commercial exploitation because using science to do
> useful things costs a lot of money. If you want people to invest in

[snip]

Let's be really clear then: share-alike and attribution are both
compatible with commercial use (look at software!).

Also, what's the license/norms distinction here? Surely if the norms
include attribution or share-alike terms, you expect them to be
observed by businesses? If so, this question seems to be more about
whether attribution or share-alike are allowed *at all*, not whether
they are encoded in licenses or norms.

Regards,

Rufus



