[open-science] Big and open data – who should bear data transfer costs?
l.bolikowski at icm.edu.pl
Sat May 17 15:29:23 UTC 2014
On 05/17/2014 05:03 PM, P Kishor wrote:
> Talking of impractical, there is little reason one should be downloading
> 200 TB of raw data, and even less reason for many others wanting to do
> the same, if for no other reason than that not many have 200 TB of space
> lying around [...]
One reason I can think of is to create mirrors of popular resources in
research data centres all over the world and provide data analysis
services to local research communities (my organization is currently
building a research data centre capable of storing and
analysing data sets of that volume).
> Nevertheless, recouping for cost above and beyond what may be budgeted
> as part of the mission of the offering org sounds justified. The bottom
> line is, nothing is free even if it is open, and seems free. The only
> thing we don't want is double-charging.
That's a very reasonable and practical approach.
I'm still not sure, though, how to classify data sets on AWS (open or
not?). If I were a for-profit company like Amazon, I would probably
provide financial incentives to use my own infrastructure and discourage
transfers out of it. Peter mentioned earlier "transparent and acceptable"
costs as requirements for openness. It's unrealistic to expect from
Amazon the level of financial transparency that would allow us to judge
whether data made available via AWS is "open". After all, IMHO there is
no social or legal contract that would bind Amazon to disclose the
financial details of its policy on data transfer charges.
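To make the incentive concrete, here is a back-of-envelope sketch of what moving the 200 TB data set out of a cloud provider might cost. The per-GB rate is an assumed illustrative figure, not Amazon's actual published pricing (real egress tiers vary by volume and region):

```python
# Back-of-envelope estimate of cloud egress charges for a large data set.
# The default rate below is ASSUMED for illustration, not a quoted price.

def egress_cost_usd(terabytes: float, usd_per_gb: float = 0.09) -> float:
    """Estimated cost of transferring `terabytes` out of a cloud provider."""
    gigabytes = terabytes * 1024
    return gigabytes * usd_per_gb

# Transferring the 200 TB mentioned above, once, at the assumed rate:
print(f"${egress_cost_usd(200):,.2f}")  # → $18,432.00
```

Even at a rough rate like this, a single full copy runs into five figures, which is exactly the kind of cost asymmetry that keeps analysis inside the provider's infrastructure.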
Dr. Łukasz Bolikowski, Assistant Professor
Centre for Open Science, ICM, University of Warsaw
Contact details: http://www.icm.edu.pl/~bolo/