[open-science] Big and open data – who should bear data transfer costs?

Peter Murray-Rust pm286 at cam.ac.uk
Sat May 17 11:41:55 UTC 2014

Very important point.

This point has been raised over the years, and personally I think it's
reasonable to pay for the technical costs involved, but they should be
**transparent** and acceptable. "Open Foo" does not require the provider to
make positive efforts to make it "user-friendly" to re-users.

We faced this in the past and the solution then was to create an RSS
iterator over the data, allowing someone to download the whole lot. But we
don't have to positively help people to understand what RSS is and how to
use it.

On Sat, May 17, 2014 at 9:06 AM, Lukasz Bolikowski
<l.bolikowski at icm.edu.pl> wrote:

> Dear all,
> when compiling a list of big, open, publicly available, data sets for my
> students to use in their projects, I recently stumbled upon an interesting
> problem: as the cost of transferring a large data set from A to B is not
> negligible and someone has to bear that cost, what does "open" mean in the
> case of "big data"?
> For example, Amazon Web Services offer a treasure trove of data sets, some
> under CC-BY or CC-BY-SA licenses:
>   http://aws.amazon.com/publicdatasets/
> Understandably, Amazon charges for data transfers out of its
> infrastructure.  When you rent Amazon's infrastructure in the same region
> in which the interesting data set is located, you're not charged for the
> transfer (but you are charged for the machines you use).
> In the recent rewrite of the Panton Principles website, initiated by
> Michelle Brook (http://goo.gl/cq1SuD) open research data is currently
> defined as "data [...] made available on the internet under licenses that
> permit anyone to download [...] without financial, legal, or technical
> barriers".
> The quoted sentence is careful to require lack of financial barriers only
> in the license, so charging for data transfers seems to be compatible with
> openness.
> A practical question: If, as a researcher or a research organization, I
> want to publish a large data set and keep the "open" label, can I charge
> for data transfers (plus amortization costs of data storage), or do I have
> to cover them myself?
> What are your thoughts?
> Best regards,
> Lukasz
> --
> Dr. Łukasz Bolikowski, Assistant Professor
> Centre for Open Science, ICM, University of Warsaw
> Contact details: http://www.icm.edu.pl/~bolo/
> _______________________________________________
> open-science mailing list
> open-science at lists.okfn.org
> https://lists.okfn.org/mailman/listinfo/open-science
> Unsubscribe: https://lists.okfn.org/mailman/options/open-science

Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dep. Of Chemistry
University of Cambridge