[open-science] the early-career guide to doing open science?

Carl Boettiger cboettig at gmail.com
Fri Mar 16 16:32:11 UTC 2012


Hi Tom,

It sounds like you know your way around tools and webservers, and
capacity/cost is your primary challenge?  Not sure where you're based, but
I would contact a computing resources center like the US Dept of Energy's
NERSC center <http://www.nersc.gov/about/contact-us/>.  They have
great support for sharing data
<http://www.nersc.gov/users/data-and-networking/sharing-data/> and
the best capacity/backup security/support
<http://www.nersc.gov/users/data-and-networking/hpss/about/> you'll find.
They have grants that award computing time and access, so I would write to
them to discuss your needs.  As a graduate student and with your goals, I
suspect you'd have no trouble getting a system.  I've found their support
to be responsive and helpful.

I'd also recommend looking at the Merritt Archive
<http://www.cdlib.org/services/uc3/merritt/> provided by UC
digital libraries.

My own system is based largely around github for sharing/collaborating on
code, figures, etc.  I don't tend to put raw data files under version
control, but rather try to follow the recommendations of Software
Carpentry for data <http://software-carpentry.org/4_0/data/mgmt/> (briefly:
version manage the readmes). I could imagine you might want to keep a copy
of all your code on github, and just link the data files to a
larger-capacity server (such as I mention above) from the README.md files,
which are also version managed on github and nicely displayed. This gives
easy access to the data.
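
A minimal sketch of the README-linking idea, assuming the data files sit in a
local directory and are uploaded under the same names to some larger-capacity
server. The function names, the directory, and the `base_url` are all
hypothetical and not part of any particular service; the checksum is an extra
so readers can verify what they download:

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def readme_data_section(data_dir, base_url):
    """Build a Markdown list linking each local data file to its remote copy.

    base_url is a hypothetical location on the larger-capacity server;
    files are assumed to be stored there under the same file names.
    """
    lines = ["## Data", ""]
    for path in sorted(Path(data_dir).iterdir()):
        if path.is_file():
            url = f"{base_url.rstrip('/')}/{path.name}"
            lines.append(f"- [{path.name}]({url}) (sha256: `{sha256_of(path)}`)")
    return "\n".join(lines) + "\n"
```

The returned text could be pasted into (or appended to) the README.md that is
kept under version control, so the repository records which data files exist
and where they live without storing the files themselves.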

If you go the github route, I'd write to github support and discuss what
you envision -- they are very invested in promoting this kind of thing and
have made allowances in the past for more space, managing binaries, etc.

Figshare.org also has excellent support for archiving data and figures; you
could link to them from the version-managed github readmes.

While we could always improve on the technical end, I believe there's
reasonably good infrastructure to support what you propose to do at minimal
cost -- the primary challenges are finding people with a will to do it, and
to a lesser extent the knowhow.

Thanks for your question.  It sounds like you already bring a lot of
expertise from your previous experiences, so I'd be curious to hear what
solution you settle on.

-Carl

On Fri, Mar 16, 2012 at 11:40 AM, Peter Murray-Rust <pm286 at cam.ac.uk> wrote:
> Greetings Tom,
>
> I understand and empathize with your problem. Although I am in an
> institution this isn't a huge amount of help. My own "solution" so far has
> been to run my own server (financed by grant money, for sure). But the main
> problem is that maintaining this oneself is costly in time and expertise. I
> am very lucky in the people I have had in my group. They have implemented,
> for example, a Jenkins (Hudson) Continuous integration system
> (http://hudson.ch.cam.ac.uk). But, since I have closed down my group, it will
> inevitably decay.
>
> Universities aren't the best places for this as they are increasingly
> predicated on competition. For example, I can get most of my infrastructure
> from the OKF - wikis, etherpads, etc. And there is a group of
> volunteers who will help with the technology.
>
> Data is a real, objective problem. I just heard that Tranche (U Mich) is
> finding difficulty staying alive. Bioinformatics has huge amounts of public
> money and uses it very well. Outside that there is Dryad (but coupled to
> publications) and Figshare.
>
> I am not a supporter of Institutional repositories for data. They are
> library-oriented and extremely scattered. I favour national libraries (e.g.
> the British Library in the UK).
>
> This triggers me to ask whether the OKF might not seek public funding for a
> data repository for science, maybe in conjunction with a national library?
>
>
>
> --
> Peter Murray-Rust
> Reader in Molecular Informatics
> Unilever Centre, Dep. Of Chemistry
> University of Cambridge
> CB2 1EW, UK
> +44-1223-763069
>



-- 
Carl Boettiger
UC Davis
http://www.carlboettiger.info/