[open-science] open-science Digest, Vol 17, Issue 7

Fri Mar 26 16:16:08 UTC 2010

Dear Chris,

Thanks for your reply. I think my comment may have been taken out of
context.

I don't mean to suggest we should be using consent as a reason to go
ahead and share identifiable information at will. When we can perturb
data so that its statistical value is retained but individual anonymity
protected, I agree that this is the best approach. But this is not
always possible or appropriate. As anonymity is so hard to achieve with
certainty, when there is a non-negligible risk that individuals might be
identified from published information, we must obtain consent for
publication of those data. Beyond the requirements of data protection
legislation, journal publishers and editors subscribe to codes of ethics
that require consent for publication of identifiable information, and
that require that only scientifically relevant data are shared
(published). So editors should still make every effort to preserve
anonymity (by removing details not relevant to the science of the
article in question) *and* obtain consent, unless data are anonymous
with certainty (very hard to achieve). The classic example is the
medical case report, but the issue has become increasingly recognised in
clinical research in general. The BMJ
now has a policy (informed by the guidelines I referred to earlier in
this thread) whereby any article including 3 or more indirect
identifiers (from a list of 28 potential items), such as age or sex,
must document whether consent for publication/data sharing was obtained.
Data tables in clinical research articles often include this level of
detail.
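
To make the "3 or more indirect identifiers" rule concrete, here is a
rough sketch of how a data table's column names might be screened. It is
only an illustration: the identifier names other than age and sex are
hypothetical placeholders (the real list has 28 items and is not
reproduced here), and the function names are my own, not part of any BMJ
tooling.

```python
# Hypothetical subset of indirect identifiers. Only "age" and "sex" are
# named in this message; the other entries are placeholders for the
# remaining items on the 28-item list.
INDIRECT_IDENTIFIERS = {
    "age",
    "sex",
    "occupation",
    "place_of_treatment",
    "rare_diagnosis",
}

# Threshold stated in the policy: 3 or more indirect identifiers.
THRESHOLD = 3


def indirect_identifiers_in(columns):
    """Return the column names that appear in the indirect identifier list."""
    normalised = {name.strip().lower() for name in columns}
    return sorted(normalised & INDIRECT_IDENTIFIERS)


def needs_consent_statement(columns, threshold=THRESHOLD):
    """True if the table holds enough indirect identifiers to trigger the rule."""
    return len(indirect_identifiers_in(columns)) >= threshold


if __name__ == "__main__":
    table_columns = ["Age", "Sex", "rare_diagnosis", "bmi"]
    print(indirect_identifiers_in(table_columns))
    print(needs_consent_statement(table_columns))
```

A check like this can only flag tables for an editor to look at; whether
consent was actually obtained still has to be documented by the authors.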

http://resources.bmj.com/bmj/bmj/authors/types-of-article/research
"We also strongly support the view that researchers should seek informed
consent to data sharing from research participants upfront, at the
recruitment stage. There are good ethical and practical reasons for
doing so. Even if the investigators have no current plans to share raw
data, at some future time data sharing may become the norm. If so,
sharing will be much easier if no one has to try to seek consent
retrospectively.

Consent is particularly important because participants may be
identifiable in a dataset - even an "anonymised" one that does not
contain names or addresses. The combination of three or more indirect
identifiers such as age, sex, and an unusual clinical detail may be
enough for at least the participant, or another interested party, to
recognise themselves.

Therefore, please provide at the end of your manuscript a data sharing
statement such as:

"Data sharing: technical appendix, statistical code, and dataset
available from the corresponding author at <email address or url>.
Participants gave informed consent for data sharing [or ...consent was
not obtained but the presented data are anonymised and risk of
identification is low... or consent was not obtained but the potential
benefits of sharing these data outweigh the potential harms because...]"

Best wishes,

Iain

Iain Hrynaszkiewicz
Managing Editor
BioMed Central
Floor 6, 236 Gray's Inn Road
London, WC1X 8HL
T: +44 (0)20 3192 2175
F: +44 (0)20 3192 2011
W: www.biomedcentral.com/

-----Original Message-----
From: Chris Rusbridge [mailto:c.rusbridge at ed.ac.uk] 
Sent: 25 March 2010 10:03
To: Iain Hrynaszkiewicz
Cc: Tom Moritz; John Wilbanks; open-science at lists.okfn.org
Subject: Re: [open-science] open-science Digest, Vol 17, Issue 7

Iain, I felt rather worried about possible implications of your
statement "consent for sharing from human subjects, as part of study
recruitment, is key to overcoming privacy barriers". Informed consent is
critical to doing any research at all on data from human subjects, but
AFAIK ethics committees would rarely allow open release of identifiable
data on those human subjects. The sole exception I am aware of would be
quotation of qualitative/interview responses, and the appropriate course
seems to be to go back to the subject showing the precise quote in
context and getting a release at that point.

The default position should surely be that prior informed consent is
never sufficient for open publication of identifiable data on human
subjects, so the data cannot be open in a BBB (Budapest, Bethesda,
Berlin) sense. It should be open in the sense of being subject to
scrutiny from a qualified person who takes on the obligations of
privacy. Properly de-identified or
aggregated data could possibly be made open provided steps are taken to
make it non-disclosive, even in combination with arbitrary other data,
although most data custodians want a step involving identification and
agreement to some kind of licence (again incompatible with BBB). I
believe the techniques for de-identification can involve adding noise to
some variables.
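
As a very rough sketch of what "adding noise to some variables" could
look like, the snippet below adds zero-mean Gaussian noise to a numeric
column. The function name, the noise scale and the example values are
all assumptions made for illustration; this is not a description of any
particular custodian's method, and noise alone is not by itself a
guarantee that data are non-disclosive.

```python
import random
import statistics


def perturb(values, scale, seed=None):
    """Return a copy of `values` with zero-mean Gaussian noise added.

    `scale` is the standard deviation of the noise (an assumed tuning
    parameter); `seed` makes the perturbation reproducible.
    """
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, scale) for v in values]


if __name__ == "__main__":
    # Made-up ages, purely for illustration.
    source = random.Random(0)
    ages = [source.randint(20, 80) for _ in range(1000)]
    noisy_ages = perturb(ages, scale=2.0, seed=1)

    # Individual values change, but the mean stays close to the original.
    print(statistics.mean(ages), statistics.mean(noisy_ages))
```

The trade-off is the one discussed above: a larger scale hides
individual values better but erodes the statistical value of the data.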

--
Chris Rusbridge
Director, Digital Curation Centre
Email: c.rusbridge at ed.ac.uk    Phone 0131 6513823
University of Edinburgh
Appleton Tower, Crichton St, Edinburgh EH8 9LE

On 24 Mar 2010, at 14:46, Iain Hrynaszkiewicz wrote:

> Dear John, and Tom,
>
> I was excited to read your discussions of data standards and look
> forward to the PLoS article (tomorrow?).
>
> BMC Research Notes (http://www.biomedcentral.com/bmcresnotes/) is also
> very interested in this area and is trying to encourage data-driven
> publications, data harvesting and re-use; a corollary to which is
> guidance and best practice for sharing in different science disciplines.
>
> Best wishes,
>
> Iain
>
> PS. I absolutely agree consent for sharing from human subjects, as
> part of study recruitment, is key to overcoming privacy barriers.