[okfn-discuss] [okfn-coord] question from the Twitter gallery about Open Data Grid

Rufus Pollock rufus.pollock at okfn.org
Sun Oct 3 13:16:01 UTC 2010


On 2 October 2010 19:25, Jo Walsh <metazool at gmail.com> wrote:
> http://grid.okfn.org/
>
> << @nicolastorzec: I'm wondering what's the status/future of @okfn Open Data
> Grid project (open distributed storage grid for #opendata) >>

Thanks a lot for noticing this Jo and pinging it over. I've put a
status notice on <http://grid.okfn.org/> to clarify this:

<quote>
June 2010: after a year of experimentation the Open Data Grid has been
disabled due to technical issues (specifically lack of storage
accounting). We are currently working on alternative distributed
storage backends and actively seeking donors of storage nodes. If you
are interested in getting involved please see the Project Wiki Page:
<http://wiki.okfn.org/p/Distributed_Storage>
</quote>

> I wonder this too! The last time i asked, recall hearing,
> "needs patches to Tahoe which needs developer effort".

Yes, the huge issue with Tahoe is the lack of 'usage' accounting
without which it is very hard to control usage of the grid (and hence
prevent overloading -- as happened to us at one point -- or usage of
open data grid for non-open material).
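
To make "usage accounting" concrete, here is a minimal sketch of the kind
of per-user bookkeeping that was missing. It is purely illustrative: the
names (UsageLedger, QuotaExceeded, charge, release) are hypothetical and
are not part of Tahoe or of anything we actually ran.

```python
class QuotaExceeded(Exception):
    """Raised when a write would push a user past their quota."""


class UsageLedger:
    """Hypothetical in-memory ledger tracking bytes stored per user."""

    def __init__(self, quota_bytes):
        self.quota_bytes = quota_bytes  # same quota for every user (assumption)
        self.used = {}                  # user -> bytes currently stored

    def charge(self, user, nbytes):
        """Record an upload of nbytes, refusing it if the quota would be exceeded."""
        new_total = self.used.get(user, 0) + nbytes
        if new_total > self.quota_bytes:
            raise QuotaExceeded("%s would use %d of %d bytes"
                                % (user, new_total, self.quota_bytes))
        self.used[user] = new_total
        return new_total

    def release(self, user, nbytes):
        """Record a deletion of nbytes, never dropping below zero."""
        self.used[user] = max(0, self.used.get(user, 0) - nbytes)
```

A grid front end would call charge() before accepting an upload and
release() after a delete; a real deployment would also need persistent
storage and authenticated user identities.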

> Wondering if Tahoe is still the way to go, or if there are other
> *distributed* storage for open data services that it'd be possible to
> partner with and pursue the Open Data Grid concept that way?

Without storage accounting I don't think Tahoe is usable. I also think
that given our complete lack of need (or desire) for encryption the
fact this is so central to Tahoe is a bit of a minus.

There has been some recent discussion about alternative backends:

* mongodb + gridfs:
<http://lists.okfn.org/pipermail/okfn-help/2010-June/000662.html>
* riak <http://www.basho.com/Riak.html> - could be used similarly to mongodb
* Eucalyptus Walrus: <http://open.eucalyptus.com/wiki/EucalyptusStorage_v1.4>

I'm personally very interested in using riak (like mongodb it is
actually designed as a classic key-value store, so you could also use
it for structured storage). I note in fact that recently Ben O'Steen,
Friedrich and I put together a bucket/object storage library called
OFS that provides a storage API to Riak (and to S3, archive.org etc)
and where we plan to implement chunking:

<http://bitbucket.org/okfn/ofs/src>

To turn this into a running grid right now we need:

* to dedicate some time to getting an experimental grid set up.
* to sort out chunking (i.e. splitting up large files)
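
As a rough idea of what chunking could involve, here is a minimal
sketch that splits a stream into fixed-size pieces, stores each piece
under its own key, and keeps a manifest so the file can be reassembled
and checked. This is not OFS code: the function names, the key scheme,
the 4 MiB default and the use of a plain dict as the "store" are all
assumptions made for illustration.

```python
import hashlib
import io

CHUNK_SIZE = 4 * 1024 * 1024  # hypothetical default: 4 MiB per chunk


def split_into_chunks(stream, chunk_size=CHUNK_SIZE):
    """Yield (index, sha1_hex, data) for each fixed-size chunk of a binary stream."""
    index = 0
    while True:
        data = stream.read(chunk_size)
        if not data:
            break
        yield index, hashlib.sha1(data).hexdigest(), data
        index += 1


def put_chunked(store, label, stream, chunk_size=CHUNK_SIZE):
    """Write each chunk to a dict-like store and return a manifest of the pieces."""
    manifest = []
    for index, digest, data in split_into_chunks(stream, chunk_size):
        key = "%s.%06d" % (label, index)  # hypothetical key scheme
        store[key] = data
        manifest.append({"key": key, "sha1": digest, "size": len(data)})
    return manifest


def get_chunked(store, manifest):
    """Reassemble the original bytes, verifying each chunk's checksum."""
    parts = []
    for entry in manifest:
        data = store[entry["key"]]
        if hashlib.sha1(data).hexdigest() != entry["sha1"]:
            raise ValueError("checksum mismatch for " + entry["key"])
        parts.append(data)
    return b"".join(parts)
```

With a real backend the dict would be replaced by whatever put/get
calls the storage layer offers, and the manifest itself would be
stored as one more object alongside the chunks.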

Anyone interested in contributing on this would be welcomed with open arms!

> Or if there are clear requirements for what needs patched, is this something
> to propose OKF invests a small amount in to buy developer time from someone
> already in the Tahoe community?

I think investigating these other alternatives would be a better use
of people's time right now.

> If i'd be better asking these questions somewhere else, please point me
> there...

I think either okfn-help or okfn-discuss (in cc!) would be better for
this as it is a general interest topic. Also the 'open data grid' does
have a project page ;)

<http://wiki.okfn.org/p/Distributed_Storage/>

which indicates it is still at the incubating stage, suggests
okfn-help as the main mailing list (it did suggest okfn-discuss until
just now, when I edited it!) and notes that the project has the
infrastructure working group as a 'parent'.

Rufus
-- 
Open Knowledge Foundation
Promoting Open Knowledge in a Digital Age
http://www.okfn.org/ - http://blog.okfn.org/



