[open-science] the early-career guide to doing open science?

Tom Roche Tom_Roche at pobox.com
Fri Mar 16 16:34:15 UTC 2012


Tom Roche Fri, 16 Mar 2012 11:02:57 -0400
>>>>> are there guides to, e.g., archiving and enabling access to
>>>>> science inputs and outputs? esp for the under-resourced,
>>>>> early-career scientist-in-training.
...
>>>>> I'm a former software engineer, now a graduate student in
>>>>> atmospheric [modeling].

Stacy Konkiel Fri, 16 Mar 2012 11:11:51 -0400 (rearranged)
>>>> Are you strictly computer science,

Not since undergrad.

>>>> does your research overlap with a subject area that has its own
>>>> subject repositor(y/ies) (such as astrophysics)?

You mean arXiv? I'm not aware of anything like that in "environmental
science" or modeling. I would dearly like to know of one: please pass
pointers to any candidates of which you are aware! 

BTW: arXiv only hosts preprints, not data, no? I am vastly ignorant
regarding open science, unfortunately :-(

>>>> Have you approached your university's institutional repository?

Frankly I've never heard of one, certainly not one ...

>>>> as feature rich as we've become accustomed to

... with anything like the functionality available from the big free
DVCS hosts (e.g., Bitbucket, GitHub) or devsites (e.g., *forge,
code.google).

>>>> in some cases, [they] will provide large-scale file storage for free
>>>> (no matter the file type).

<bitter chuckle/> Currently I'm keeping my data where I contract, since
my school starts charging for backup tapes at 100 GB, if memory serves.

>>>> E-Science Librarian
>>>> Indiana University

I'll definitely start looking for local e-science workers. A quick
google is finding only presentations and statements of intent, but
there is a Library Science program, which I'll ping.

Peter Murray-Rust Fri, 16 Mar 2012 15:40:15 +0000
>>> Data is a real, objective problem.

I'd say, "the" problem in this domain, since the other functionality
you describe (e.g., CI, wikis) can be gotten from the free providers
(examples above) ... *except* "the big disk" for archiving "big data."

Mark Hahnel Fri, 16 Mar 2012 15:43:05 +0000
>> [figshare.com currently offers] unlimited public space for your
>> research data, with version history. We will start testing an api
>> this week for pulling and pushing data in to the system.

I'll definitely be checking out figshare!

>>> Institutional repositories for data [tend to be] library-oriented
>>> and extremely scattered. I favour national libraries (e.g. the
>>> British Library in the UK).

Unfortunately, at my point in (what might be) my career (if I'm
lucky), I'm a beggar, not a chooser.

Stacy Konkiel Fri, 16 Mar 2012 12:10:05 -0400
> would [resources] be better spent partnering with various libraries
> and IRs to improve existing repositories, to better serve the needs of
> scientists? There's a large gap, as you pointed out, between what
> purpose library IRs serve and those built by scientists.

Given my admitted ignorance, there may be problems with the following
proposal. Nonetheless, I'd suggest partnering with one or more of the
free DVCS hosts/devsites above. AFAICS they provide most of the
required services (what am I missing? probably a lot :-) except "big
data" archiving, and they provably scale. So why not provide the
latter service(s) and tie them to an existing frontend and set of
services?

Your assistance is appreciated! Tom Roche <Tom_Roche at pobox.com>



