[open-science] Big and open data – who should bear data transfer costs?
l.bolikowski at icm.edu.pl
Sat May 17 14:51:53 UTC 2014
Thanks for your input. Let me address some issues raised.
I share Emanuil's view that charging for data transfer is a bit like
charging for shipment of DVDs or hard disks – it is not unreasonable to
ask the receiver of a package to bear the shipment costs. Actually,
above a certain size it is much faster and probably more cost-effective
to ship hard disks than to transfer data over the internet.
Raniere is obviously right that no data transfer arrangement can
limit the right to republish, or any other rights stated in an open
license. Nobody is questioning that.
Intuitively, I agree with Peter and Paweł that there should always be a
free-of-charge way of accessing a data set (a freemium model), but for
resources *of certain magnitude*, the free plan in a freemium model
would be so inconvenient that it would amount to hypocrisy to still call
As an example, let's take the 1000 Genomes data set
(http://aws.amazon.com/datasets/4383) with over 200 TB of data available
on Amazon Web Services. With the transfer rate capped at 1 MB/s (not
unreasonable for the free plan in a freemium model) it would take over 6
years to download it. Using BitTorrent could *somewhat* help subsequent
downloaders, but the first one would still have to wait over 6 years for
their download to complete!
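
For anyone who wants to check the arithmetic, here is a minimal sketch of the estimate. It assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes), a constant transfer rate and no protocol overhead; the function and constant names are made up for illustration.

```python
# Back-of-the-envelope download time for a rate-capped transfer.
# Assumptions: decimal units, constant rate, no protocol overhead.

SECONDS_PER_YEAR = 365.25 * 24 * 3600
TB = 10**12  # bytes in a (decimal) terabyte
MB = 10**6   # bytes in a (decimal) megabyte


def download_years(size_bytes: float, rate_bytes_per_s: float) -> float:
    """Return the time in years needed to transfer size_bytes at a fixed rate."""
    if rate_bytes_per_s <= 0:
        raise ValueError("transfer rate must be positive")
    return size_bytes / rate_bytes_per_s / SECONDS_PER_YEAR


# 200 TB at 1 MB/s, the figures used in the example above.
years = download_years(200 * TB, 1 * MB)
print(f"{years:.1f} years")  # prints "6.3 years"
```

With binary units (TiB and MiB) the result is slightly higher, so "over 6 years" holds either way.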
(Note: for the sake of simplicity, I'm ignoring the actual license of
the 1000 Genomes data set and Amazon's pricing; let's imagine the
license is CC-0 and Amazon allows some free-of-charge, capped data
On 05/17/2014 03:52 PM, Peter Murray-Rust wrote:
> Sorry - I phrased my answer poorly. I did not mean that everyone must
> pay. There always should be a basic free-of-charge service. However if
> people cannot or do not want to use that then it is reasonable to charge
> for making it available in easier-to-use forms.
> We have recently found a comparable example in Free/Open software where
> the code is distributed as GPL and anyone can download and compile it.
> However this is technically difficult for people who don't know how to
> compile code. The authors have made compiled versions available for a fee.
> On Sat, May 17, 2014 at 1:33 PM, Paweł Szczęsny <ps at pawelszczesny.org
> <mailto:ps at pawelszczesny.org>> wrote:
> 2014-05-17 13:41 GMT+02:00 Peter Murray-Rust <pm286 at cam.ac.uk
> <mailto:pm286 at cam.ac.uk>>:
> > This point has been raised over the years and personally I think it's
> > reasonable to pay for technical costs involved, but they should be
> > **transparent** and acceptable.
> Indeed this point has been raised many times over the years, but each
> time the idea of a payment for technical infrastructure during
> download was trashed as _unreasonable_, at least when discussion
> concerned research institutions in Europe or the US. This is the role of
> funding for research infrastructure (and in some places certain taxes)
> to cover such a cost.
> The other thing is that such a payment is essentially a paywall.
> Technical issues aside (all Emanuil wrote is a very valid point),
> putting open stuff behind an obligatory paywall is a bad move from the
> PR point of view. Think Elsevier requiring payment for OA articles.
> That said, some institutions experiment with a freemium model in this
> area, which looks a bit better. You get the data for free, but if you
> need it fast (fast lane, special protocol, etc.) you need to pay a
> Best wishes
> Peter Murray-Rust
> Reader in Molecular Informatics
> Unilever Centre, Dep. Of Chemistry
> University of Cambridge
> CB2 1EW, UK
Dr. Łukasz Bolikowski, Assistant Professor
Centre for Open Science, ICM, University of Warsaw
Contact details: http://www.icm.edu.pl/~bolo/