[open-science] "open science" definition?

Mon Oct 13 03:32:15 UTC 2014

Jenny Molloy Sun, 12 Oct 2014 19:57:06 +0100 [1]
>> my feeling is that ensuring that the knowledge and tools created through your research are available as per the Budapest Open Access Initiative (or Berlin Declaration), Open Knowledge Definition, Free Software Definition plus similar implementations for hardware and more specialised tools like seeds, cell lines, reagents, other materials (not all of which exist but most are being worked on) just about covers open scientific knowledge from a legal/technical perspective.

Tom Roche Sun, 12 Oct 2014 19:38:26 -0400 [2]
> That sounds about right[.] I suspect we can *quickly* get 95% of a useful Open-Science Definition 1.0 just by composing definitions of the main parts: open input data/assimilation (e.g., the Open Definition[3]), open processing (e.g. (computationally), open-source code in public repositories), open output data/analysis (e.g., open-access publishing--the OD seems applicable here as well).

Just to be blindingly obvious (unless the following exposes flaws to which I am currently blind):

I assert that we can, to a first approximation, model a scientific study as a classic *nix pipeline[4]:

* input > transform > output

* open output (data or analysis) from any one study may become an input for any number of subsequent studies
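A minimal sketch of that pipeline model, for illustration only. The names `Study` and `chain` are hypothetical (not from any existing tool), and each "transform" is reduced to a plain function on a list of values:

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Study:
    """One study, modeled as input > transform > output (hypothetical name)."""
    name: str
    transform: Callable[[List[float]], List[float]]

    def run(self, inputs: List[float]) -> List[float]:
        # All operations on the data are lumped into the single transform step.
        return self.transform(inputs)


def chain(studies: List[Study], initial: List[float]) -> List[float]:
    """Feed the output of each study into the next one as its input."""
    data = initial
    for study in studies:
        data = study.run(data)
    return data


# Illustrative values only: one study rescales the data, a later study sums it.
rescale = Study("rescale", lambda xs: [2 * x for x in xs])
summarize = Study("summarize", lambda xs: [sum(xs)])
result = chain([rescale, summarize], [1.0, 2.0, 3.0])
```

Here the output of `rescale` is the input of `summarize`, which is all the second bullet claims: nothing in the model cares which study produced the data.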

That scientific studies seek to rigorously create and test empirical hypotheses is extraneous to our purpose, which is to characterize the openness of studies. Similarly, that input data is often subject to previous assimilation[5], and that output data is usually subject to subsequent analysis (e.g., figure creation), is immaterial: for purposes of this model, all operations on data can be lumped into a single step, the transform.

Furthermore, for characterization of openness, inputs and outputs are equivalently data. Unless I'm missing something, the Open Definition (or similar) should apply to both--no?

That leaves characterization of the openness of a study's transform(s). To a first approximation, we can classify each transform as either computational or non-computational. For computational science, I'm assuming we could leverage prior art on the openness of

* source codes and their repositories

* source platforms (e.g., the hardware, OS, or other software required to run the sources)

I know much less about the openness of non-computational protocols (I'm just a coder who works on environmental models), but I assume (absent contradiction from someone who actually *knows* about this space :-) that it has been defined by one or more domain experts. In the worst case, their openness could be defined, as with computational transforms, in terms of their reproducibility.
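To make the composition idea concrete, here is a rough sketch under loud assumptions: `Component`, `StudyRecord`, and the boolean `is_open` flag are all hypothetical, and a real assessment would check each part against the relevant definition (e.g., the Open Definition[3] for data) rather than a hand-set flag. The only rule encoded is that a study counts as open when every input, transform, and output is open:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Component:
    """One part of a study (hypothetical structure, for illustration)."""
    name: str
    role: str      # "input", "transform", or "output"
    is_open: bool  # assumed to be judged elsewhere against a suitable definition


@dataclass
class StudyRecord:
    name: str
    components: List[Component] = field(default_factory=list)

    def closed_components(self) -> List[str]:
        # Names of the parts that prevent the study from being fully open.
        return [c.name for c in self.components if not c.is_open]

    def is_open(self) -> bool:
        # Open only if no input, transform, or output is closed.
        return not self.closed_components()


# Made-up example: open data and code, but a closed platform dependency.
study = StudyRecord(
    "example model run",
    [
        Component("input observations", "input", True),
        Component("model source code", "transform", True),
        Component("proprietary compiler", "transform", False),
        Component("output fields", "output", True),
    ],
)
blockers = study.closed_components()
```

In this made-up case `blockers` holds only the platform dependency, which is why source platforms get their own bullet above: open sources that need a closed platform to run are not fully open.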

Am I missing something?

FWIW, Tom Roche <Tom_Roche at pobox.com>

[1] https://lists.okfn.org/pipermail/open-science/20141012/003550.html
[2] https://lists.okfn.org/pipermail/open-science/20141012/003551.html
[3] http://opendefinition.org/od/
[4] https://en.wikipedia.org/wiki/Pipeline_%28Unix%29
[5] https://en.wikipedia.org/wiki/Data_assimilation