[open-science] Planning for the cost of free

Jason Priem jason at jasonpriem.com
Sat Dec 17 20:32:15 UTC 2011

I think part of the problem here may be a failure to see this as a
question of marginal accounting. The money spent openly disseminating
data is in most cases a tiny fraction of that spent gathering it in the
first place. The marginal cost of openness is quite low.

However, the payoff of actually letting people see your valuable data 
(both for its creators and for society at large) is quite large, 
compared to the payoff of letting it moulder in your closet. Openness' 
marginal benefit is very high.

Since open dissemination turns you a tidy marginal profit, then, the 
extra scratch is well spent, and the rational actor's choice. Of course, 
as others point out, even more rational is to let a specialist 
repository take care of all this.

On 12/17/2011 12:29 PM, Peter Murray-Rust wrote:
> On Sat, Dec 17, 2011 at 2:56 PM, Puneet Kishor <punkish at eidesis.org
> <mailto:punkish at eidesis.org>> wrote:
>     Like all of us on this list, I too have been thinking about OA for a
>     long time, but the recent question about the link between OA, data
>     mining, sinking servers, and hence, a possible need for an exception
>     in the legal obligations has brought to mind once again the
>     mechanism for enabling OA.
>     The desire to be open has to be supported by the capability to be
>     so. Every time I put out a link to one of my data applications on
>     any one of the various programming lists, I can see the process
>     monitor spike up as my server is hit by a barrage of queries. I have
>     unlimited bandwidth, and a pretty capable, top of the specs
>     computer, but it does start sweating and breaking stride. So, it
>     makes sense that if I were using and creating hundreds of GB of
>     data, and serving processes that were very compute-intensive, I
>     wouldn't be hiding all the data behind a pokey dial-up line and a
>     5-year-old computer with an incapable operating system.
> I think it's worth distinguishing OA for documents and document-sized
> data from OpenData with hundreds of gigabytes per project. Also
> distinguish between "final snapshots" and ongoing dynamic data.
> I doubt that publishing manuscripts, diagrams, excel spreadsheets etc.
> is a major hit. And I suspect that most institutional repositories would
> be delighted to have something to put in their pot. An *institution*
> that cannot manage shouldn't be in the game. I agree there is a potential
> problem for people without institutions but I doubt this is a problem.
> It's more difficult when there are zillions of files or they are huge.
> Again, if these are in any way institution-related you should try them.
> Having said that I don't think data should generally be at institutions
> and this is a problem that the domain or the country should solve.
> Dryad, Tranche, etc. are doing a good job of looking after data. I think
> there has been talk of a modest charge. Data costs money, but given the
> emphasis by research funders, solutions should be emerging.
>     But, what if I didn't build into my project the cost of making OA
>     possible? As I like to say, free is very expensive. If all of the
>     project funds are devoted to science, there will be no money to make
>     OA possible, and conversely, if all the funds are devoted to OA,
>     there will be no money for research. There is a sweet spot somewhere
>     that balances the funds between doing research and making its data
>     and results available freely to everyone. Some of us call it the
>     Warnick Curve (another story).
> Many funders are asking for data management plans at time of grant
> submission and so this can be costed out as a proportion. I agree it's
> hard at present because there is no clear economy for this but it will
> emerge. (I do not see conventional publishers as being the solution -
> generally they have no expertise in data management).
>     So, the question -- can we genuinely plead inability to make OA
>     possible because we have inadequate capacity?
> You sound as if you *want* to plead incapacity!
>     Or, should we have built the cost of OA into the research
>     proposal in the first place so we couldn't hide behind such an excuse?
> Yes
>     In other words, we have no business doing complicated research with
>     lots of data using public monies unless we also think of the
>     continuing costs of making it all available.
> Yes. It will be hard at first. This is an area where IMO the funders and
> the institutions need to be ahead of the researchers and make it easy
> for them. Personally I think national libraries are a better place to
> develop this than universities.
>     Of course, we have to define "available," but that is another rant.
> If it's institutional or national then it's their responsibility to
> provide uptime.
>     --
>     Puneet Kishor http://punkish.org
>     science http://earth-base.org
>     advocacy http://creativecommons.org
>     _______________________________________________
>     open-science mailing list
>     open-science at lists.okfn.org <mailto:open-science at lists.okfn.org>
>     http://lists.okfn.org/mailman/listinfo/open-science
> --
> Peter Murray-Rust
> Reader in Molecular Informatics
> Unilever Centre, Dep. Of Chemistry
> University of Cambridge
> CB2 1EW, UK
> +44-1223-763069
