[odc-discuss] Precluding re-identification?
andrewrens at gmail.com
Fri Nov 3 20:41:55 UTC 2017
It seems as if the Linux Foundation has also recently considered this
issue. They just released two new data licenses as part of a bigger data
ecosystem (see https://cdla.io/).
Members of the list may be inclined to critique these new licenses as
unnecessary given the existing data licenses such as those at
https://opendatacommons.org/licenses/ and CC0. I am simply drawing
attention to this new development, not advocating for it; rather, I would
like to see some discussion.
One difference is that "Each Data Provider represents that the Data
Provider has exercised reasonable care, to assure that: (a) the Data it
Publishes was created or generated by it or was obtained from others with
the right to Publish the Data under this Agreement; and (b) Publication of
such Data does not violate any privacy or confidentiality obligation
undertaken by the Data Provider."
According to the context document (https://cdla.io/context/), it seems that
the Linux Foundation imagines those licenses as inbound licenses, and that
there may in future be outbound licenses that attempt to preserve privacy
or prevent re-identification. As an alternative, it suggests a database with
technical restrictions that allow only some kinds of research.
Open Humans is also grappling with these issues (https://www.openhumans.org/).
I don't know if they have found solutions that you would find compelling.
On 2 November 2017 at 14:38, Howard Look <howard at tidepool.org> wrote:
> I'm new to odc-discuss. My apologies if this has been covered. I could not
> find a discussion with my searching.
> Our open source, non-profit company makes software that is used by people
> living with diabetes. We ask people to donate their diabetes device data
> via the Tidepool Big Data Donation <http://tidepool.org/bigdata> project,
> which many have done. We would like to now make some of these anonymized
> datasets available publicly.
> I would prefer to not invent a new license, but it appears that the
> current ODC-By would allow for downstream re-identification. What do people
> do that would like to preclude re-identification of open, anonymized
> medical data?
> For example, people using our software sometimes tweet their continuous
> blood glucose graphs. It would not be hard to analyze the image and
> look for the same pattern of CBG readings in the anonymized database to
> identify a person.
> This restriction is intended to preclude Protected Health Information from
> being re-identified. However I understand why the license does not allow
> for further restrictions.
> What have other folks done who want to make individual medical records
> available but preclude identification?
> *Howard Look*
> President, CEO and Founder
> *Tidepool* is an open source, not-for-profit effort to make diabetes data
> more accessible, actionable, and meaningful by liberating data from
> diabetes devices, supporting researchers, and providing great, free
> software to the diabetes community.
> *Email: *howard at tidepool.org
> *Web: *Tidepool.org <http://tidepool.org/>