[open-science] Ownership of data (Charles Oppenheim)

Peter Murray-Rust pm286 at cam.ac.uk
Mon May 19 15:33:48 UTC 2014

In a Twitter conversation, Charles Oppenheim (Professor Emeritus, UCL, I
think?) has given his opinion on ownership of data and also on our move into
content mining. I reproduce his mail verbatim.

This is my take on UK/EU law.  US law is different. And remember, I'm not a
lawyer.

The question needs to be split up into two parts, i.e., what rights are
associated with data, and then separately, who owns the rights?


A single datum never enjoys any rights.  To attract protection, we must be
dealing with a collection of data.  There is no clear legal guidance on how
big a collection has to be before it can attract rights, but a good working
estimate in my view would be that 10 or more items of data mean the
collection potentially enjoys rights.

There are two types of right involved - database right and copyright. A
collection of data enjoys COPYRIGHT if the selection and arrangement of the
data has involved skill and judgement.  Thus, if I synthesise 20 compounds
and record the melting point of each one, there is no copyright in the
listing because I have used no skill or judgement in selecting which I
record and which I don't record.  But if I synthesise 100, but only record
the melting points of 20 of them, selected on some basis (potentially
pharmaceutically active, those with the highest melting points, etc.) then
that collection of 20 melting points can enjoy copyright. (If this sounds a
bit crazy to you, you'll find quite a lot of the law on this is
counter-intuitive, so bear with me).  Copyright will also protect the
layout, design and typography of a particular table or other way of
presenting some data.  This "publisher's copyright" lasts for 25 years from
date of publication and is quite separate from the right in the data
collection itself.

A collection of data enjoys DATABASE RIGHTS if I have expended significant
time and effort in obtaining, verifying and presenting the data.  (Note
here, a data collection can enjoy copyright + database right, database
right alone, copyright alone, or have no rights).  Now "obtaining, verifying
and presenting" sounds straightforward, but it isn't.  An important
European Court case concluded that if the data just happens to naturally
fall out of what you were doing, then it does NOT enjoy database right.
So, if I synthesised a load of new compounds and routinely recorded the
melting point of each, I do not get database rights in the data. I only get
database rights if I take the data from somewhere else and then expend
effort in verifying and presenting the data.  It's counter-intuitive, but
that's the law now. (The case was British Horseracing Board v William
Hill.)  To put it in a nutshell:  database right protects work carried out
on PRE-EXISTING data, and does not protect NEWLY CREATED information.
Interestingly, this makes data obtained using TDM of pre-existing data
eligible for database right protection;  you might want to ponder the
implications of that fact, bearing in mind the change to UK law just passed.
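
The four possible outcomes described above (copyright plus database right,
either one alone, or no rights) can be sketched as a small decision
function. This is only an illustration in Python of the reasoning in this
message: the class, field and function names are hypothetical, and the
threshold of 10 items is the working estimate given above, not a figure
taken from any statute.

```python
from dataclasses import dataclass

# Working estimate from the message above; not a legal threshold.
MIN_ITEMS = 10


@dataclass
class Collection:
    n_items: int
    # Selection/arrangement of the data involved skill and judgement.
    selected_with_judgement: bool
    # Significant effort spent obtaining, verifying and presenting
    # PRE-EXISTING data (data that merely "falls out" of one's own
    # work does not count).
    effort_on_preexisting_data: bool


def rights_in(c: Collection) -> set:
    """Return the rights that may attach to a data collection."""
    if c.n_items < MIN_ITEMS:
        return set()  # too small a collection to attract any rights
    rights = set()
    if c.selected_with_judgement:
        rights.add("copyright")
    if c.effort_on_preexisting_data:
        rights.add("database right")
    return rights


# 20 compounds synthesised, every melting point recorded routinely.
all_recorded = Collection(20, False, False)
# 100 compounds synthesised, 20 melting points chosen on some basis.
chosen_subset = Collection(20, True, False)

print(rights_in(all_recorded))   # set()
print(rights_in(chosen_subset))  # {'copyright'}
```

The two examples are the melting-point cases from the copyright paragraph;
neither attracts database right, because the data were newly created rather
than taken from somewhere else.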


Assuming there are some rights associated with a collection of data, then
we have to work out who owns it.  Let us assume that the data has been
created/collected by one of the following types: a student; a research
associate/assistant; an academic; an employee in the private sector.  Let's
look at each in turn:

If a STUDENT has created the data as part of a project and they are
self-funded, or receive a grant, then the student owns the rights to the
data.  If anyone else wants to use the data, they must get permission from
the student.  The student cannot be forced to agree. (Any attempt to
REQUIRE the student to assign or license rights would be invalid in law;
get the student to voluntarily agree). If the student has been following
the guidance of a supervisor, there is an arguable case that the supervisor
is joint owner (see below for details of academics' rights), but still the
student's permission is required before any exploitation can take place. If
the student was being paid a salary to do the research, say as a vacation
job, then their position is as for research assistants.

If a RESEARCH ASSISTANT/ASSOCIATE has created the data, his or her employer
automatically owns the rights unless there is some agreement to the
contrary.  The employer might be private sector, a University, etc., etc.

If it is an ACADEMIC, then, depending on the precise contract of
employment, the employer, e.g., University, in theory owns the rights but
if custom and practice has left such rights to the academic, then that fact
over-rides the formal legal position. In practice, then, I would argue that
in most cases the academic owns the rights to the data. So if the data is
the result of an academic supervising a student, the two individuals
jointly own the rights to the data.  One cannot do anything with the data
without the permission of the other. So let's hope they get on with each
other.

PRIVATE SECTOR EMPLOYEES are straightforward.  Their employer owns the
rights.

What about the grant funder?  The formal legal position is that they own NO
rights unless the terms of the grant funding say otherwise and custom and
practice does not over-ride the grant terms. The grant funder of a student
never has any rights unless there is an explicit contractual term to the
contrary.
Of course, if there are no rights at all associated with some collection of
data, a third party can do what they like with it, including selling it. By
now I trust you are totally confused and wish you hadn't raised the
subject.  The whole subject is made more difficult by people making
assumptions (a) about what rights exist and (b) about who owns those
rights, without exploring the detail.

I rest my case, m'lud


Professor Charles Oppenheim

Copied by...

Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dept. of Chemistry
University of Cambridge
