[okfn-labs] [idea-rfc]: DataPipes - Streaming Online Data Transformation!

Rufus Pollock rufus.pollock at okfn.org
Mon May 6 14:49:21 UTC 2013


At last week's Open Data Maker Night here in London some of us [1] started
kicking around an idea we called *Data Pipes*. The basic pitch was [2]:

*Data Pipes would be a service to do streaming online data transformation.
Heavily inspired by the unix shell, with its pipes and utilities like cut,
grep, sed, sort, uniq etc. We want to work with streams, so the focus
(initially) is on CSV files.*

As a demonstration of the idea the barest prototype has been put together:

http://datapipes.okfnlabs.org/  -  (source code on
github <https://github.com/okfn/datapipes>)

This is barely functional - there's just one working operation (delete) atm
- but there are plans for many more
<https://github.com/okfn/datapipes/issues/9> and I already like how
natural this feels in node.js.

Is this useful? Do people have tips (e.g. how best to stream post data in
node.js <https://github.com/okfn/datapipes/issues/5>)? Is anyone up for
contributing <https://github.com/okfn/datapipes/issues>?

Regards,

Rufus

[1]: specifically Ross Jones, James Smith, David Miller and myself. Plus,
from comments on IRC, I think Friedrich (Lindenberg) had also been thinking
along similar lines!

[2]: the immediate motivation was a relatively non-techie participant at
the Open Data Maker Night who wanted to remove commas from amounts in a CSV
column before putting the data into OpenSpending. A common enough
requirement, but one which would involve some spreadsheet-fu or scripting to
sort out. Why, we thought, shouldn't this just be a simple web-service ...
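
To make the comma-stripping case in [2] concrete, here is a rough sketch of
how it might look as a node.js Transform stream. This is not the datapipes
code: the names stripCommas, stripCommasInLine, parseLine and formatField are
made up for illustration, and it assumes one record per line (no newlines
inside quoted fields) and a zero-based column index.

```javascript
const { Transform } = require('stream');
const { StringDecoder } = require('string_decoder');

// Split one CSV line into fields, honouring double-quoted fields
// ("" inside quotes is an escaped quote). Hypothetical helper.
function parseLine(line) {
  const fields = [];
  let field = '';
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') { field += '"'; i++; }
      else if (ch === '"') { inQuotes = false; }
      else { field += ch; }
    } else if (ch === '"') { inQuotes = true; }
    else if (ch === ',') { fields.push(field); field = ''; }
    else { field += ch; }
  }
  fields.push(field);
  return fields;
}

// Re-quote a field only when it contains a comma, quote or line break.
function formatField(value) {
  return /[",\r\n]/.test(value) ? '"' + value.replace(/"/g, '""') + '"' : value;
}

// Remove commas from one column of a single line, keeping a trailing \r if present.
function stripCommasInLine(line, columnIndex) {
  const eol = line.endsWith('\r') ? '\r' : '';
  const fields = parseLine(eol ? line.slice(0, -1) : line);
  if (columnIndex < fields.length) {
    fields[columnIndex] = fields[columnIndex].replace(/,/g, '');
  }
  return fields.map(formatField).join(',') + eol;
}

// Transform stream: buffers partial lines between chunks and emits
// each complete line with the chosen column cleaned.
function stripCommas(columnIndex) {
  const decoder = new StringDecoder('utf8');
  let buffered = '';
  return new Transform({
    transform(chunk, encoding, callback) {
      buffered += decoder.write(chunk);
      const lines = buffered.split('\n');
      buffered = lines.pop(); // last piece may be an incomplete line
      for (const line of lines) {
        this.push(stripCommasInLine(line, columnIndex) + '\n');
      }
      callback();
    },
    flush(callback) {
      buffered += decoder.end();
      if (buffered.length > 0) {
        this.push(stripCommasInLine(buffered, columnIndex) + '\n');
      }
      callback();
    },
  });
}
```

Under those assumptions it could be wired up shell-style with something like
process.stdin.pipe(stripCommas(1)).pipe(process.stdout). Note that the header
row goes through the same transform, which is harmless as long as the header
of that column contains no commas.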