[MyData & Open Data] distinctions between personal and open
javier at openrightsgroup.org
Thu Jul 25 11:15:35 UTC 2013
No one from government is properly acknowledging the risks of anonymisation. Following the EE example, ministers like Vaizey have said that because the data is anonymised they have nothing else to say. Cameron has not mentioned the risks of anonymised health data: http://news.techeye.net/security/uks-anonymous-health-records-are-wide-open
Some types of data handling activities are generally acknowledged as inherently risky. If you claimed to have a very secure system for carrying CDs with unencrypted sensitive data on public transport, ministers would be demanding to know more. Anonymisation is not there yet.
I will keep you in the loop about mobile companies. Our first report does not deal with anonymisation processes, as there are other more basic problems. But we are looking at proposing better practices in this area as well. For example, we found out that EE was only assessing the risks of individual queries, not the cumulative/tracker effects. Their line is that it was a pilot and they would do it properly when fully implemented. But you need to start with these considerations, not bolt them on afterwards.
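To illustrate the cumulative/tracker point, here is a minimal sketch of how two queries that each pass a per-query check can still isolate one person when combined. The records, field names, and the threshold of 3 are all hypothetical; this is not a description of EE's actual system.

```python
# Toy illustration of a tracker (differencing) attack.
# Every record, field name, and the threshold below is made up.

MIN_COUNT = 3  # assumed per-query disclosure threshold

records = [
    {"id": "A", "at_clinic": True},
    {"id": "B", "at_clinic": False},
    {"id": "C", "at_clinic": True},
    {"id": "D", "at_clinic": False},
    {"id": "E", "at_clinic": True},
    {"id": "F", "at_clinic": True},
]


def guarded_count(rows, predicate):
    """Return the count only if it passes the per-query threshold, else None."""
    n = sum(1 for r in rows if predicate(r))
    return n if n >= MIN_COUNT else None


# Asking about the target alone is refused by the per-query check.
direct = guarded_count(records, lambda r: r["id"] == "E" and r["at_clinic"])

# Two broader queries are each allowed...
q1 = guarded_count(records, lambda r: r["at_clinic"])
q2 = guarded_count(records, lambda r: r["at_clinic"] and r["id"] != "E")

# ...but their difference reveals exactly what the refused query would have.
leaked = q1 - q2

print("direct query:", direct)
print("q1:", q1, "q2:", q2, "difference:", leaked)
```

Each query looks harmless on its own; only an assessment of the set of queries a user can make catches the leak.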
On 24 Jul 2013 18:40, "Mark Elliot" <mark.elliot at manchester.ac.uk> wrote:
> Hi Javier,
> The notion that “we can publish anything online if it's anonymised” is indeed wrong-headed and anyone touting such ideas needs to be corrected immediately. I would be grateful if you would let us know of specific instances of such language and we will intervene.
> The sensitivity issue is very important and I agree that location data is particularly pernicious. I would happily participate in any attempts to haul mobile data companies into shared good practice; let me know if I can be of any assistance.
> Mark Elliot
> Centre for Census and Survey Research
> School of Social Sciences
> University of Manchester
> M13 9PL
> t: 0161-275-4257
> f: 0161-275-4722
> From: javierruizorg at gmail.com On Behalf Of Javier Ruiz
> Sent: 24 July 2013 17:16
> To: Mark Elliot
> Cc: stef; mydata-open-data at lists.okfn.org; Elaine Mackey
> Subject: Re: [MyData & Open Data] distinctions between personal and open
> I think the problem is that again and again we are told that you can publish anything online if it's "anonymised".
> While people on this list may be aware of Paul Ohm (who is a lawyer and only presented other people's research), most policy makers are covering their ears and singing loudly.
> Besides the context, there are also specific data types that are more sensitive. Location is one such case: it is almost impossible to anonymise if you keep individual records. This has huge implications for mobile data reuse.
> ORG is chasing UK mobile companies to agree some best practice for their reuse of personal data to produce aggregated data products and services. Besides voluntary industry agreement, there are some important legal aspects that are typically overlooked. We will publish a report on this very soon.
> On 24 Jul 2013 12:54, "Mark Elliot" <mark.elliot at manchester.ac.uk> wrote:
> On 24 Jul 2013, at 11:13, "stef" <s at ctrlc.hu> wrote:
> > On Wed, Jul 24, 2013 at 10:06:49AM +0000, Mark Elliot wrote:
> >> So you really do not make a distinction between publishing full tax records on the internet with full identifiers
> >> and a set of univariate aggregate tables held inside a safe data lab that only one person sees?
> > that is a bit more than anonymization, that is also "not publishing" no?
> >> Ohm, in short, is wrong because he fails to model the context.
> > can you elaborate that? i think the reason is exactly, that he considers any
> > data in a context, and not a vacuum.
> Firstly, context is rather more than simply the publishing/not publishing distinction: you have to consider what data environment the data exists in. Something could be anonymised in one data environment but not in another. Clearly full publication is the most liberal environment and therefore requires the highest level of caution.
> Secondly, Ohm's scenario is unrealistic. He uses differential privacy, which assumes that the intruder has all but one of the pieces of information in the data set and that all the information that he has is 100% convergent with that in the data.
> Put this another way: Ohm's paper was rather like the guy who rushes into a room full of car designers and screams "everyone... listen up... we have just discovered that if you drive cars into a brick wall at 100 miles an hour then people get hurt, we have to stop building them!!!!"
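For readers unfamiliar with the assumption being criticised above: differential privacy is usually framed by comparing two data sets that differ in a single record, which is where the "intruder knows everything except one piece" picture comes from. The sketch below shows the standard Laplace mechanism for a counting query under that framing. The data, the epsilon value, and the function names are illustrative assumptions, not anything taken from Ohm's paper or from this thread.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(rows, predicate, epsilon, rng):
    """Counting query with Laplace noise; a count has sensitivity 1."""
    true_count = sum(1 for r in rows if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


rng = random.Random(0)   # seeded only to make the sketch repeatable
EPSILON = 0.5            # hypothetical privacy parameter

# Two "neighbouring" data sets: identical except for one person's record.
without_target = [{"flag": True}] * 10 + [{"flag": False}] * 10
with_target = without_target + [{"flag": True}]

# A single noisy answer does little to separate the two data sets...
one_with = noisy_count(with_target, lambda r: r["flag"], EPSILON, rng)
one_without = noisy_count(without_target, lambda r: r["flag"], EPSILON, rng)

# ...although the noise averages out over many repeated answers.
N = 20000
mean_with = sum(
    noisy_count(with_target, lambda r: r["flag"], EPSILON, rng) for _ in range(N)
) / N
mean_without = sum(
    noisy_count(without_target, lambda r: r["flag"], EPSILON, rng) for _ in range(N)
) / N

print("single answers:", one_with, one_without)
print("averages over", N, "answers:", mean_with, mean_without)
```

The guarantee is stated against the strongest possible intruder, which is exactly the modelling choice being questioned above; whether that worst case is the right one for a given data environment is a separate judgement.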
> > --
> > pgp: https://www.ctrlc.hu/~stef/stef.gpg
> > pgp fp: FD52 DABD 5224 7F9C 63C6 3C12 FC97 D29F CA05 57EF
> > otr fp: https://www.ctrlc.hu/~stef/otr.txt