[MyData & Open Data] distinctions between personal and open

Tue Jul 23 19:22:10 UTC 2013

From all the conversations, there's a clear split in interests. In both parts, we need better and clearer examples. You should be able to download your phone records in electronic form, but I should be able to stop you paying money to get data containing mine.

The two aspects:

1. mydata -- a single individual obtaining data about themselves from companies that hold it.
	- this is the midata programme in the UK run by the UK Government, and similar elsewhere
	- there are a bunch of questions about it, but those are mostly implementation.
It may be that this list wants to put together a set of principles and requirements for this.
Points made so far include:
	- machine readable
	- reuse must be at the discretion of the individual (whether apps or upload to a service)
	- copyright issues should be clear

Is anyone involved in the midata pilots? Is it worth OKF putting together a cohort of volunteers who are going to build interesting reuse projects? (Ideally ones that won't accidentally torpedo the whole programme, so tread a little carefully in public.)

Fundamentally, data given to an individual about that individual is subject to decisions by that individual. 

myData/miData is solely about you getting your data; there is a wider version, which is getting some access to other people's data. This is an emerging area which got conflated with the above, but which is much more important from both an open data and a privacy perspective. It got named "aggrodata" in a conversation I've had, and I've not seen a better name for it (got one?)

2. Aggrodata
	- publication of customer/transaction data by companies in a way which is "deidentified".
		- this is what Everything Everywhere got caught doing to the police
		- this is what Telefonica do -- http://dynamicinsights.telefonica.com
		- Barclays too.
	- questions of proper anonymisation vs research
	- consent
	- access by individuals within the dataset

The commercial aspects of this are evolving much faster, and there should be a set of principles developed to discuss what it looks like. Some will be open data, some will not be, as it's data sold commercially -- although there is a strong case, on consent grounds, for data to which an individual contributes to be accessible to them. There is a large question over what anonymous means, in a way which isn't followed by "oops". Tom Steinberg talked about this at Open Government Data Camp in London in 2010, in a way which is still relevant (video snarfled here: http://www.youtube.com/watch?v=eN0beOAvlGM - text: http://steiny.typepad.com/premise/2010/11/open-data-how-not-to-cock-it-up.html). Privacy International (my day job) has been trying to have some of that conversation for a while, and some thoughts have gone up (https://t.co/6M626HVxWR). This should be a public conversation, some of which will take place here, other parts in places of your choosing. What else should it include?

It's also not that new an area in other ways. Governments have been doing Statistical Disclosure Control on administrative and transactional data they release for decades. Some of this includes differential release: different detail to different audiences. There is a clear case for individuals to have their own control of this data. Gov "travel to work" data gets fiddled around Cheltenham, so that they don't accidentally reveal where all the spooks live. Do O2 do the same for people who leave their phone in their car as they can't take it into the office?
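
To make "different detail to different audiences" a bit more concrete, here is a rough Python sketch of one common disclosure control step, small-cell suppression, with a different cut-off per audience. Everything in it is an assumption for illustration: the audience names, the thresholds, the area codes and the journey data are all made up, and it is not a description of what any statistics office or company actually does.

```python
from collections import Counter

# Hypothetical minimum cell sizes per audience (illustrative values only):
# a lower threshold means that audience gets to see more detail.
THRESHOLDS = {"public": 10, "accredited_researcher": 3}


def release_counts(records, audience):
    """Count records per area; suppress (None) any cell below the audience's threshold."""
    threshold = THRESHOLDS[audience]
    counts = Counter(records)
    return {area: (n if n >= threshold else None) for area, n in counts.items()}


# Made-up "travel to work" destinations, one entry per commuter.
journeys = ["A"] * 12 + ["B"] * 4 + ["C"] * 1

public = release_counts(journeys, "public")
researcher = release_counts(journeys, "accredited_researcher")

print(public)      # area A is published; B and C are suppressed
print(researcher)  # A and B are published; C is still suppressed
```

Suppression on its own is not enough once totals are also published, since a hidden cell can be recovered by subtraction; that is one of the places where the "oops" tends to come from.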

But fundamentally, for both, we need better examples. Have OKF approached any of the new ODI sponsors to look at what they can do in terms of Open? The most likely companies to do this well are the ODI early adopters.

Regards
Sam

--
@smithsam