[open-science] BioTorrents

Tom Moritz tom.moritz at gmail.com
Thu Apr 22 18:20:34 UTC 2010

Hi Rufus --
I have not tested iRods firsthand, though when I was at Getty we were
involved in some trials with them. Given its origins at SDSC, I'm sure
you're right that it is probably "non-trivial" in terms of rapid and easy
deployment (!), but perhaps their trial approaches may be instructive in
informing future developments...?

In the same spirit, I mentioned Tera-Grid and the Open Science Grid as
projects with substantial NSF support. Again, connections to those projects,
and how they are addressing the problems you've posed, may prove fruitful
in analysis and development?

I think the notion of developing a full "specification/requirements"
document is a very useful step -- perhaps we should all consider your list
and see if we can review it for adequacy to our common purposes?

(I think, as John suggests, that bio-med and genomics -- not areas of primary
focus for me personally -- have made important progress toward standards for
transparency, interoperability and meta-analysis. After the recent
"Climate-gate"/IPCC events, I am hopeful that we can better prototype best
practices in applied sciences and consider *the complete data cycle*: from
data creation not just to deposit/curation (often the end point in
descriptions of the "data cycle") but on to "data as evidence" for policy
formation and decision support. It is specifically the requirement of
delivering data with full transparency and accountability
(lineage/provenance), in ways that are quickly understandable by policy
makers and decision makers and more easily subject to review and testing,
that I believe challenges us...?)


Tom Moritz
1968 1/2 South Shenandoah Street,
Los Angeles, California 90034-1208  USA
+1 310 963 0199 (cell)
tommoritz (Skype)

“Πάντα ῥεῖ καὶ οὐδὲν μένει” (Everything flows, nothing stands still.)

Please consider the environment before printing this email.

On Thu, Apr 22, 2010 at 4:07 AM, Rufus Pollock <rufus.pollock at okfn.org> wrote:

> On 17 April 2010 18:19, Tom Moritz <tom.moritz at gmail.com> wrote:
> > Sorry to be just jumping in without tracking regularly -- but there are
> > some Stateside projects seeking to solve the problem that Rufus
> > describes -- if I'm not mistaken this is precisely what iRods (formerly
> > at San Diego Supercomputer Center now migrated to Univ of North
> > Carolina) has set out to address? [SEE: https://www.irods.org/ ]  and
> > the NSF supported Tera-Grid has
> Thanks a lot for these links Tom. I'd seen iRods before (I've also
> just added it to [1]). When I last tried it out 1y+ ago the install
> was non-trivial and I couldn't imagine getting a "volunteer" grid
> going on this basis (it was unlikely the average dedicated server
> owner was going to get through that setup!).
> [1]: <http://wiki.okfn.org/p/Distributed_Storage/Research>
> Perhaps it is worth explicitly listing the requirements we put
> together (listed on [1]):
>  1. Robustness via replication across nodes
>  2. Easy addition of nodes -- in particular we wish people to be able
> to easily "donate" nodes
>  3. Require good share/shard-rebalancing as nodes enter and leave network
>  4. The system should be able to handle small and very large files
> (so files should be automatically sharded)
>  5. Concurrency/Consistency is not a big issue
>  6. Availability is a big issue
>  7. Versioning would be nice (though what exactly would this mean?)
>  8. Data stored would be open so encryption/privacy is not a priority
> > been grappling with this as well: [SEE: https://www.teragrid.org/ ]
> > and similarly the Open Science Grid [SEE:
> > http://www.opensciencegrid.org/About/News_Archive/Open_Science_Grid_Receives_30_Million_Dollar_Award
> > ]
> These both seem to be services rather than "software" -- I may have
> missed something but I couldn't see that the software behind e.g. Open
> Science Grid is open source and available for download?
> > I have been in some discussions in past weeks and months with UNFCCC, US
> > EPA, and others about how best to manage at least foundational data sets
> > ("canonical"?) while providing precisely the level of transparency and
> > accountability that was obviously necessary in the recent IPCC
> > dust-up...  I believe that we may be best off picking certain such data
> > and thoroughly modeling best practice...???
> Indeed. I certainly think it would be interesting to find out more
> about what is on offer and *in particular* people's actual experience
> using that software or service -- perhaps as updates to
> <http://wiki.okfn.org/p/Distributed_Storage/Research>.
> Rufus
> --
> Open Knowledge Foundation
> Promoting Open Knowledge in a Digital Age
> http://www.okfn.org/ - http://blog.okfn.org/