[School-of-data] Is data open if you can not create derivatives?
Simon Cropper
simoncropper at fossworkflowguides.com
Fri Jul 11 02:44:18 UTC 2014
Hi Everyone,
I am currently collating some public datasets on nutrition as fodder for
some community data wrangling projects and events I am planning.
https://github.com/SimonChristopherCropper/Food_Data_Analysis
Interestingly, I have discovered that most datasets are released under a
limited license -- put simply, the custodian agencies either allow
redistribution of the dataset with attribution but prohibit derivatives
of any kind, or allow you to look at their data but not even download
it. Most nutritional datasets released worldwide fall into this
category[1].
To me this data is not open. Open data, in my mind, should allow for
derivatives to be created and redistributed. I understand that agencies
wish to be attributed and, in some cases, want disclaimers included with
any derivatives as a means of indemnifying the source agency, but
constraints on 'working' [2] on and with the data leave the dataset
with no real value.
For the record -- the only datasets I have confirmed allow derivatives
are the Australian, USA and Swiss datasets, and maybe the UK dataset
(I am still waiting for official confirmation, as the license
documentation on the UK website is ambiguous).
What is your opinion regarding 'openness'?
When looking at licenses or terms of use statements, what attributes are
you looking for? Do you have a preferred license type?
Open Data in my mind has the following attributes:
- freely downloadable/accessible in a common data format
- the data is clearly described so other people can understand
what they are seeing (e.g. no undefined acronyms)
- the methodology and sources of the information presented,
and any inherent problems, are clearly described, and this
documentation is freely available
- there are no restrictions on working with the data and redistributing
your results (attributing the source and including disclaimers are
not considered to be restrictions in this definition)
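
For anyone collating licenses for several datasets, the attributes above can be written down as a simple checklist. The following is only a rough Python sketch of how I read the list: the class name, field names and the example record are made up for illustration and do not describe any real dataset or license.

```python
from dataclasses import dataclass


@dataclass
class DatasetTerms:
    """Hypothetical summary of one dataset's terms of use."""
    name: str
    downloadable_common_format: bool
    clearly_described: bool
    methodology_documented: bool
    derivatives_allowed: bool
    redistribution_allowed: bool
    # Recorded for completeness, but not treated as restrictions
    # under the definition in the list above.
    attribution_required: bool = False
    disclaimer_required: bool = False


def failed_criteria(terms):
    """Return labels of the openness attributes the dataset does not meet."""
    checks = {
        "downloadable in a common format": terms.downloadable_common_format,
        "data clearly described": terms.clearly_described,
        "methodology and sources documented": terms.methodology_documented,
        "derivatives allowed": terms.derivatives_allowed,
        "redistribution allowed": terms.redistribution_allowed,
    }
    return [label for label, ok in checks.items() if not ok]


def is_open(terms):
    """A dataset counts as open only if no attribute fails."""
    return not failed_criteria(terms)


# Made-up example: redistribution with attribution, but no derivatives.
example = DatasetTerms(
    name="Example nutrient table",
    downloadable_common_format=True,
    clearly_described=True,
    methodology_documented=True,
    derivatives_allowed=False,
    redistribution_allowed=True,
    attribution_required=True,
)
print(example.name, "open:", is_open(example), failed_criteria(example))
```

Keeping attribution and disclaimers as separate fields means they can still be recorded per dataset without affecting the result.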
[1] This statement is based on inspection of the terms of use for all the
databases identified from general Internet searches using Google and all
the databases specified in the list of Food Composition Databases
managed by EuroFIR (http://www.eurofir.org/?page_id=96)
[2] Definition of 'working' -- cleansing, standardizing, wrangling,
munging, coding, geocoding, summarizing, graphing, analyzing, etcetera.
--
Cheers Simon
Simon Cropper - Open Content Creator
Free and Open Source Software Workflow Guides
------------------------------------------------------------
Introduction http://www.fossworkflowguides.com
GIS Packages http://www.fossworkflowguides.com/gis
bash / Python http://www.fossworkflowguides.com/scripting