[okfn-labs] [idea-rfc]: DataPipes - Streaming Online Data Transformation!

Michael Bauer michael.bauer at okfn.org
Tue May 7 08:05:24 UTC 2013


Hi,

Currently datapipes.okfn.org leads me to okfn.org/register

I do like the idea of Unix pipes - however, it's hard for me to imagine them
working at the web level. Wouldn't this just be a transforming version of the
data proxy, where I have a source, several steps and an output?
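
To make the source / steps / output picture concrete, here is a minimal
Python sketch of how I imagine it. It is not how datapipes is implemented;
the names (delete_rows, strip_commas, run) are hypothetical, and the two
steps are just the delete operation and the comma-stripping use case from
Rufus's mail. Each step is a generator, so rows stream through one at a time.

```python
import csv
import io
from typing import Callable, Iterable, Iterator, List, Set

Row = List[str]
Step = Callable[[Iterable[Row]], Iterator[Row]]


def delete_rows(indices: Set[int]) -> Step:
    """Hypothetical step: drop rows whose 0-based position is in `indices`."""
    def step(rows: Iterable[Row]) -> Iterator[Row]:
        for i, row in enumerate(rows):
            if i not in indices:
                yield row
    return step


def strip_commas(column: int) -> Step:
    """Hypothetical step: remove commas from one column ("1,200" -> "1200")."""
    def step(rows: Iterable[Row]) -> Iterator[Row]:
        for row in rows:
            row = list(row)
            # Rows that are too short for this column pass through unchanged.
            if column < len(row):
                row[column] = row[column].replace(",", "")
            yield row
    return step


def run(source, steps, output) -> None:
    """Read CSV from `source`, apply each step in order, write CSV to `output`."""
    rows = csv.reader(source)
    for step in steps:
        rows = step(rows)  # lazily wraps the previous stage
    writer = csv.writer(output, lineterminator="\n")
    for row in rows:
        writer.writerow(row)


if __name__ == "__main__":
    source = io.StringIO('name,amount\n# junk\nalice,"1,200"\nbob,"3,400,000"\n')
    output = io.StringIO()
    run(source, [delete_rows({1}), strip_commas(1)], output)
    print(output.getvalue(), end="")
```

On a web service the source would presumably be a URL and the steps would
come from the request path, but the shape stays the same.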

Will you allow me to script my own pipe elements, e.g. a new
transformation in the pipeline?

At some point we will hit the issue of running untrusted, user-supplied code
on our server (with map, reduce and filter functions) - I'd propose using
a language that supports proper sandboxing underneath. (Not sure node.js
does so.) Also, I don't like JavaScript as a data handling language: it's
simply not designed for it (yes, I'm a hopeless lambdahead).

Right now the pipe is simply one block; combining multiple pipes seems
painful - can we make this easier?
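
What I have in mind is that a pipe built from other pipes is itself a pipe.
A rough Python sketch under that assumption follows; compose, grep, head and
upper are hypothetical helpers named after the shell tools, not anything
datapipes provides.

```python
import itertools
from functools import reduce
from typing import Callable, Iterable, Iterator

Pipe = Callable[[Iterable[str]], Iterator[str]]


def compose(*pipes: Pipe) -> Pipe:
    """Chain pipes left to right into one pipe, like `a | b | c` in a shell."""
    def combined(items: Iterable[str]) -> Iterator[str]:
        return reduce(lambda stream, pipe: pipe(stream), pipes, iter(items))
    return combined


def grep(needle: str) -> Pipe:
    """Keep only the lines that contain `needle`."""
    def pipe(lines: Iterable[str]) -> Iterator[str]:
        return (line for line in lines if needle in line)
    return pipe


def head(n: int) -> Pipe:
    """Keep only the first `n` lines."""
    def pipe(lines: Iterable[str]) -> Iterator[str]:
        return itertools.islice(lines, n)
    return pipe


def upper(lines: Iterable[str]) -> Iterator[str]:
    """Upper-case every line."""
    return (line.upper() for line in lines)


if __name__ == "__main__":
    cleanup = compose(grep("a"), upper)   # a pipe made of two pipes
    top_two = compose(cleanup, head(2))   # reused as one block in a longer pipe
    print(list(top_two(["apple", "kiwi", "banana", "avocado"])))
```

Because the result of compose has the same type as its inputs, a saved pipe
can be dropped into another pipe without any special casing.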

Michael

On Mon, May 06, 2013 at 03:49:21PM +0100, Rufus Pollock wrote:
> At last week's Open Data Maker Night here in London some of us [1] started
> kicking around an idea we called *Data Pipes*. The basic pitch was [2]:
> 
> *Data Pipes would be a service to do streaming online data transformation.
> Heavily inspired by unix shell with its pipes and utilities like cut, grep,
> sed, sort, uniq etc. We want to work with streams so focus (initially) is
> on CSV files.*
> 
> As a demonstration of the idea the barest prototype has been put together:
> 
> http://datapipes.okfnlabs.org/ (source code on
> GitHub <https://github.com/okfn/datapipes>)
> 
> This is barely functional - there's just one working operation (delete) atm
> - but there are plans for many more
> <https://github.com/okfn/datapipes/issues/9> and I already like how natural
> this feels in node.js.
> 
> Is this useful? Do people have tips (e.g. how best to stream post data in
> node.js <https://github.com/okfn/datapipes/issues/5>)? Is anyone up for
> contributing <https://github.com/okfn/datapipes/issues>?
> 
> Regards,
> 
> Rufus
> 
> [1]: specifically Ross Jones, James Smith, David Miller and myself. Plus,
> from comments on IRC, I think Friedrich (Lindenberg) had also been thinking
> along similar lines!
> 
> [2]: the immediate motivation was a relatively non-techie participant at
> the open data maker night who wanted to remove commas from amounts in a CSV
> column before putting the data into OpenSpending. A common enough
> requirement but one which would involve some spreadsheet-fu or scripting to
> sort out. Why, we thought, shouldn't this just be a simple web-service ...



-- 
Data Wrangler with the Open Knowledge Foundation (OKFN.org)
GPG/PGP key: http://tentacleriot.eu/mihi.asc
Twitter: @mihi_tr Skype: mihi_tr



