[open-science] Openness and Licensing of (Open) Data

Neylon cameron.neylon at stfc.ac.uk
Mon Feb 9 20:33:57 UTC 2009


Dear All

I was working on a half baked blog post which I think is superseded by a lot
of what others have written this afternoon. But I thought the following
might still be useful:

[wanting to define terms]

By "public domain" I mean placed legally into the domain of works for which
any copying, re-use or re-purposing is allowed without any restrictions.
While some objects are naturally in the public domain, placing a "work" in
the public domain usually requires a dedication of some sort or a disavowal
of any or all rights that the producer of the work may have. The Public
Domain Dedication protocol is an example of such. So placing data in the
public domain does not simply mean making it public, it means placing it in
a legal state where anyone is free to do whatever they like with it.

By "license" I mean some form of contractual arrangement between the
provider of data and the user of data, such as a copyright notice,
declaration of database rights, or "click wrap" agreement. As Yishay pointed
out in comments to my last post, a Public Domain Dedication looks quite a
lot like a license. However I don't think of it as such because a PDD
removes the need to give or seek permissions at all, removing the need for
any contractual arrangement between users and provider.

By "data" I mean collections of data. As Jonathon Rochkind pointed out and
as we should all remember, a specific datum itself is in the public domain
by its nature. A fact of nature is not subject to copyright or intellectual
property law. Collections of data, and presentations of data, however, can
acquire potential database or "sweat of the brow" rights in certain
jurisdictions. It might be advisable for us to use the singular and plural
to draw the distinction. The line between a datum and data that defines when
these rights might be acquired is a difficult one, even for lawyers to
define, and this confusion forms part of the argument for public domain
dedication.

> Let's be really clear: everyone will need to apply a license (or
> something very close) even if they are just making stuff PD (because
> they need to formally waive whatever rights they have).

Agreed - either way there is a formal statement. I've not been thinking of
this as a license per se for the reasons above.
 
> The remaining question is: will attribution and share-alike (if
> included at all) be in licenses or in norms?

For me, share-alike only makes sense in the context of a license because it
only has any value if you can enforce it. As Michael has said, community
enforcement of norms can only be through exclusion from the community. The
point of share-alike is to force people who would otherwise not do so to use
the same protocol/licence. Without a contractual arrangement there isn't any
real means of forcing people who choose to act outside the community.

But I also wonder whether we're actually worrying about something that isn't
very important. I think we are all agreed that non-commercial clauses are
not acceptable for legal, practical, and philosophical reasons. But this is
where the real battle is going to lie, not share-alike, which I don't think
most scientists would even have heard of. Looking at e.g. the licenses on
databases in the Nucleic Acids Research database issue you will see a
multitude of (probably unenforceable and certainly incompatible) licenses.
But virtually all of them have non-commercial clauses.

I wonder if our energies are better spent making that case and just
shuffling share-alike under the carpet. If we generally don't think it's all
that helpful, and if I am right that most scientists won't care that much
(which remains to be seen), is it worth getting that stressed about?
  
[snipped - will aim to deal with this separately]

> I, for one, think it is really important to the long term viability of
> the open data community, whether in science or other domains, that
> business can get involved.

Absolutely agree but there is a strong sense of "anti-commercial" sentiment
within some parts of the open content movement and I think the current crop
of licences being applied to scientific databases also shows an
anti-commercial attitude (or at least an "I want to make that money"
attitude). I want to come back to Heather's example of a commercial interest
exploiting the results of third world scientists at some point, but I think
there is something interesting there about the way we use "exploit" as a
positive or negative term. This is a crucial argument to get right and
watertight. Particularly as making useful stuff from scientific results is
generally very expensive so profits have to be available to encourage it.

> Also what's the license/norms distinction here: surely if the norms
> include attribution or share-alike terms you expect them to be
> observed by businesses? If so this question seems more about whether
> attribution or share-alike are allowed *at all* not whether they are
> encoded in licenses or norms.

I agree that they are separate questions with the caveat that I think that
if you dedicate to the public domain then a "share-alike norm" is
meaningless. Citation norms will work because people placing work in the
community view will want to be seen as members of the community. Which
obviously I don't see as much of a loss ;-)

Cheers

Cameron
