[open-science] Let us denounce the pseudo-open Public Library of Science

Thomas Kluyver takowl at gmail.com
Tue Feb 14 15:34:01 UTC 2017


On 14 February 2017 at 14:58, Paola Di Maio <paola.dimaio at gmail.com> wrote:

>> 1. Does open data facilitate replicability? I argue that it does not. At
>> most, open data permits repeat analysis of the same data. This is a good
>> thing, but it is not replication. To replicate a study, one must repeat the
>> study, sometimes with variations to eliminate limitations of prior studies,
>> and gather new data.
>>
>
> To replicate a study, one must repeat the study,
> assuming that by 'study' you mean the application of a methodology.
>
> But to replicate the result of a study, one needs the exact data that
> the study used. What if I get different results from the same
> study (method)? What would that imply?
>

To me, the key here is that a lot of modern science hinges on how you
analyse the data. A classic experiment like the candle in a jar pulling up
water has a clear result which doesn't require much analysis. But modern
research often involves trying to determine whether a pattern or difference
in some numbers represents a real phenomenon or just random chance. Things
like confounding by correlated factors, multiple testing and so forth can make a
big difference. When more of the steps involve slicing and dicing numbers
after the experiment itself, replication of the method to get from raw data
to conclusions becomes more important.
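
As a rough illustration of the multiple-testing point, here is a minimal
Python sketch. It is not taken from any real study: the function name, the
sample sizes and the seed are all made up for the example, and it assumes a
simple two-sided z-test with a known standard deviation of 1. It runs many
comparisons on pure noise and counts how many look 'significant'.

```python
import random
from statistics import NormalDist, mean


def false_positive_count(n_tests=1000, n_per_group=50, alpha=0.05, seed=0):
    """Count 'significant' differences between two groups of pure noise.

    Hypothetical helper for illustration: both groups are drawn from the
    same standard normal distribution, so every hit is a false positive.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    # Standard error of the difference in means when sigma is known to be 1.
    se = (2.0 / n_per_group) ** 0.5
    hits = 0
    for _ in range(n_tests):
        a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        z = (mean(a) - mean(b)) / se
        # Two-sided p-value under the null hypothesis of no difference.
        p = 2.0 * (1.0 - std_normal.cdf(abs(z)))
        if p < alpha:
            hits += 1
    return hits


if __name__ == "__main__":
    uncorrected = false_positive_count()
    # Bonferroni-style threshold: divide alpha by the number of tests.
    corrected = false_positive_count(alpha=0.05 / 1000)
    print("uncorrected hits:", uncorrected)
    print("corrected hits:", corrected)
```

With the defaults, roughly 5% of the 1000 null comparisons should be flagged
even though there is no real effect anywhere. Tightening the threshold to
alpha / n_tests typically removes almost all of them, which is why the choice
of analysis matters as much as the raw numbers.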

The polling for the 2016 US election is a good example of this. Almost all
pollsters predicted a Clinton win, with varying degrees of confidence. We
know how that turned out. I don't believe they were fabricating the raw
poll results, but their segmentation and 'likely voter' adjustments weren't
quite right. I've no idea if they release the raw data from that, and there
may be issues with personal identifiability, but it would be interesting to
reproduce their headline results and do some sensitivity analysis to see
what assumptions might have been incorrect.
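
A sensitivity analysis of that kind could look something like the sketch
below. Everything in it is hypothetical: the two groups, their sample sizes,
the support shares and the turnout scenarios are invented for illustration
and do not come from any real poll or pollster's method. The idea is only to
hold the raw responses fixed and vary the 'likely voter' assumption.

```python
# Hypothetical raw poll results: respondents per group and the share of each
# group backing candidate A. These numbers are invented for illustration.
RAW_POLL = {
    "group_x": {"n": 400, "support_a": 0.60},
    "group_y": {"n": 600, "support_a": 0.42},
}

# Hypothetical turnout assumptions (probability that a respondent votes).
SCENARIOS = {
    "equal turnout": {"group_x": 1.0, "group_y": 1.0},
    "group_y votes more": {"group_x": 0.5, "group_y": 0.7},
    "group_x votes more": {"group_x": 0.7, "group_y": 0.5},
}


def headline_share(raw_poll, turnout):
    """Share for candidate A after weighting each group by assumed turnout."""
    voters = 0.0
    voters_for_a = 0.0
    for group, row in raw_poll.items():
        expected_voters = row["n"] * turnout[group]
        voters += expected_voters
        voters_for_a += expected_voters * row["support_a"]
    return voters_for_a / voters


def sensitivity(raw_poll, scenarios):
    """Recompute the headline figure under each turnout scenario."""
    return {name: headline_share(raw_poll, t) for name, t in scenarios.items()}


if __name__ == "__main__":
    for name, share in sensitivity(RAW_POLL, SCENARIOS).items():
        print(f"{name}: candidate A at {share:.1%}")
```

In this toy setup the same raw responses put candidate A either below or
above 50% depending only on which group is assumed to turn out more.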

We can also reproduce the analysis steps much more easily than steps that
involve physical experiments, polling, etc. So sharing raw data is a useful
part of replication, though clearly not the whole story.

Thomas