[School-of-data] Is data open if you can not create derivatives?

Simon Cropper simoncropper at fossworkflowguides.com
Fri Jul 11 02:44:18 UTC 2014

Hi Everyone,

I am currently collating some public datasets on nutrition as fodder for 
some community data wrangling projects and events I am planning.


Interestingly, I have discovered that most datasets are released under a 
limited license -- put simply, the custodian agencies either allow 
redistribution of the dataset with attribution but prohibit derivatives 
of any kind, or let you look at their data but not even download it. 
Most nutritional datasets released worldwide fall into this 
category[1].

To me this data is not open. Open data, in my mind, should allow for 
derivatives to be created and redistributed. I understand that agencies 
wish to be attributed, and in some cases want disclaimers included with 
any derivatives as a means of indemnifying the source agency, but 
placing constraints on 'working'[2] on and with the data leaves the 
dataset with no real value.

For the record -- the only datasets I have confirmed allow derivatives 
are the Australian, USA and Swiss datasets, and maybe the UK dataset 
(I am still waiting for official confirmation, as the license 
documentation on the UK website is ambiguous).

What is your opinion regarding 'openness'?

When looking at licenses or terms of use statements, what attributes are 
you looking for? Do you have a preferred license type?

Open Data in my mind has the following attributes:
- freely downloadable/accessible in a common data format
- the data is clearly described so other people can understand
   what they are seeing (e.g. no undefined acronyms)
- the methodology and sources of the information presented,
   and any inherent problems, are clearly described and this
   documentation is freely available
- there are no restrictions on working with the data and redistributing
   your results (attributing the source and including disclaimers are
   not considered to be restrictions in this definition)
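
For anyone who wants to apply the attributes above to a batch of 
datasets, here is a minimal sketch of the checklist in Python. All the 
names (DatasetTerms, failed_criteria, is_open) and the example datasets 
are hypothetical, made up for illustration; the yes/no values would 
still have to come from reading each license or terms of use by hand. 
Attribution and disclaimer requirements are recorded but, per the 
definition above, never count against openness.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class DatasetTerms:
    """Hypothetical record of what a dataset's terms of use allow."""
    name: str
    downloadable_common_format: bool
    clearly_described: bool
    methodology_documented: bool
    derivatives_allowed: bool
    redistribution_allowed: bool
    # Recorded for completeness; not treated as restrictions.
    attribution_required: bool = False
    disclaimer_required: bool = False


def failed_criteria(terms: DatasetTerms) -> list[str]:
    """Return the attributes from the list above that the dataset fails."""
    checks = {
        "freely downloadable in a common format": terms.downloadable_common_format,
        "data clearly described": terms.clearly_described,
        "methodology and sources documented": terms.methodology_documented,
        "no restrictions on working with the data": terms.derivatives_allowed,
        "results can be redistributed": terms.redistribution_allowed,
    }
    return [label for label, ok in checks.items() if not ok]


def is_open(terms: DatasetTerms) -> bool:
    """A dataset counts as open only if it fails none of the attributes."""
    return not failed_criteria(terms)


if __name__ == "__main__":
    # Made-up examples, not real assessments of any agency's license.
    examples = [
        DatasetTerms("Example A", True, True, True, True, True,
                     attribution_required=True),
        DatasetTerms("Example B", True, True, True, False, True,
                     attribution_required=True),
    ]
    for ds in examples:
        problems = failed_criteria(ds)
        status = "open" if not problems else "not open: " + "; ".join(problems)
        print(f"{ds.name}: {status}")
```

A dataset that permits redistribution with attribution but forbids 
derivatives (like "Example B") is reported as not open, which is the 
situation described for most of the nutrition datasets.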

[1] This statement is based on inspection of the terms of use for all 
the databases identified from general Internet searches using Google 
and all the databases specified in the list of Food Composition 
Databases managed by EuroFIR (http://www.eurofir.org/?page_id=96)
[2] Definition of 'working' -- cleansing, standardizing, wrangling, 
munging, coding, geocoding, summarizing, graphing, analyzing, etcetera.

Cheers Simon

    Simon Cropper - Open Content Creator

    Free and Open Source Software Workflow Guides
    Introduction               http://www.fossworkflowguides.com
    GIS Packages           http://www.fossworkflowguides.com/gis
    bash / Python    http://www.fossworkflowguides.com/scripting
