[MyData & Open Data] distinctions between personal and open

Wed Jul 24 11:54:42 UTC 2013

On 24 Jul 2013, at 11:13, "stef" <s at ctrlc.hu> wrote:

> On Wed, Jul 24, 2013 at 10:06:49AM +0000, Mark Elliot wrote:
>> So you really do not make a distinction between publishing full tax records on the internet with full identifiers
>> and a set of univariate aggregate tables held inside a safe data lab that only one person sees?
> 
> that is a bit more than anonymization, that is also "not publishing" no?
> 
>> Ohm, in short, is wrong because he fails to model the context.
> 
> can you elaborate that? i think the reason is exactly, that he considers any
> data in a context, and not a vacuum.

Firstly, context is rather more than simply the publishing/not publishing distinction: you have to consider what data environment the data exists in. Something could be anonymised in one data environment but not in another. Clearly full publication is the most liberal environment and therefore requires the highest level of caution.

Secondly, Ohm's scenario is unrealistic. He uses differential privacy, which assumes that the intruder has all but one of the pieces of information in the data set and that all the information he has is 100% convergent with that in the data.

Put this another way: Ohm's paper was rather like the guy who rushes into a room full of car designers and screams "everyone... listen up... we have just discovered that if you drive cars into a brick wall at 100 miles an hour then people get hurt, we have to stop building them!!!!"

> -- 
> pgp: https://www.ctrlc.hu/~stef/stef.gpg
> pgp fp: FD52 DABD 5224 7F9C 63C6  3C12 FC97 D29F CA05 57EF
> otr fp: https://www.ctrlc.hu/~stef/otr.txt