[MyData & Open Data] distinctions between personal and open

Mark Elliot mark.elliot at manchester.ac.uk
Wed Jul 24 17:09:38 UTC 2013


The issue about whether we should be collecting such data in the first place is, as you indicate, very important. Undoubtedly we - the human race - should be making informed collective decisions about what sort of (information) society we want to live in, and undoubtedly the power to make those decisions is even more (over-)concentrated than the power to make other decisions. As a consequence we are like the children of Hamelin, collectively walking towards the singularity without much of a thought.

To push the analogy (possibly, I confess, to breaking point): we can individually choose not to drive a car and so forgo those particular fruits of industrialisation; however, we could still be mown down by a careless driver (somebody mentions me on Facebook) and have to breathe in the background pollution (some data is collected about me regardless).

Since this discussion is taking place on a public email list, both of us have - de facto - opted into the information system (we are in the car) and are therefore presumably aiming towards reformism rather than opting out and doing less participatory things....

However, none of this is the issue that I was drawn to comment on, which was whether it is possible to anonymise data and still have useful data. I was simply making the point that Ohm's position is at best a theoretical nicety and at worst completely misleading. Trying to instantiate the scenario - as you do below - with a Big Brother-like organisation learning the nth piece of information about citizen Y is just as misleading. Organisations with that sort of informational power do not have to spend resources deanonymising a third-party anonymised data set; there are plenty of more efficient and reliable ways of getting that information if they want it. So the question of whether they could deanonymise the data or not is moot.


Mark Elliot
Centre for Census and Survey Research
School of Social Sciences
University of Manchester
M13 9PL
t: 0161-275-4257
f: 0161-275-4722

-----Original Message-----
From: stef [mailto:s at ctrlc.hu] 
Sent: 24 July 2013 17:35
To: Mark Elliot
Cc: Elaine Mackey; mydata-open-data at lists.okfn.org
Subject: Re: Re: [MyData & Open Data] distinctions between personal and open

On Wed, Jul 24, 2013 at 11:54:42AM +0000, Mark Elliot wrote:
> Firstly, context is rather more than simply the publishing/not publishing distinction; you have to consider what data environment the data exists in.

i disagree, the question is whether to collect data in the first place.
deanonymization is only one threat; the other is the uncontrolled leaking of data. look at the Manning case: 3 million people having access to classified information, and only one public leak? how many criminal leaks do you think there were? consider the vacuuming tactics of the NSA: we collect and store everything.

data-minimization is key: as long as you do not collect it, it cannot leak, and it cannot be deanonymized. of course this only applies to data on persons who hold no power over society.

> Something could be anonymised in one data environment but not in another. Clearly full publication is the most liberal environment and therefore requires the highest level of caution.
> Secondly, Ohm's scenario is unrealistic. He uses differential privacy, which assumes that the intruder has all but one of the pieces of information in the data set and that all the information that he has is 100% convergent with that in the data.

why is this unrealistic for a determined adversary? as far as i see the NSA allows many of its allies (NATO, GCHQ, BND) almost uncontrolled access to its data, i guess it's a bit more than 3 million users that have access to this.

> Put this another way: Ohm's paper was rather like the guy who rushes into a room full of car designers and screams "everyone... listen up... we have just discovered that if you drive cars into a brick wall at 100 miles an hour then people get hurt, we have to stop building them!!!!"

the analogy is not quite fitting i believe. in your example the victim is acting on its own when it's driving against the wall. in the data collection case the victims are driven against the wall.

pgp: https://www.ctrlc.hu/~stef/stef.gpg
pgp fp: FD52 DABD 5224 7F9C 63C6  3C12 FC97 D29F CA05 57EF
otr fp: https://www.ctrlc.hu/~stef/otr.txt
