[okfn-labs] [idea-rfc]: DataPipes - Streaming Online Data Transformation!

Ross Jones ross at servercode.co.uk
Tue May 7 08:20:53 UTC 2013


Hi,

On 7 May 2013, at 09:05, Michael Bauer <michael.bauer at okfn.org> wrote:

> I do like the idea of unix pipes - however it's hard for me to imagine it
> working on a web level. Wouldn't this be just a transforming version of the
> data proxy? Where I have a source, several steps and an output?

There's a precedent for thinking that pipes can work on the web (http://www.webpipes.org/), I guess on the basis that unix pipes and the web both work with streams (throughput and latency issues aside). My first thought on Rufus' original idea was to build a DSL that, even though it pulls data from remote locations, would do all of the processing locally. I think I'm sold on trying this approach first, though (and I think there may already be more than one DSL for working with CSV/XSL etc).
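
To make the "source, several steps, output" shape concrete, here is a rough Python sketch of a lazy, pipe-style chain over CSV rows. It is only an illustration under my own assumptions, not how DataPipes or webpipes actually work: the names read_csv, grep, cut, head and pipe are made up for the example (loosely echoing the unix tools), and the input is an in-memory string standing in for a remote source.

```python
import csv
import io
import itertools
from typing import Callable, Iterable, Iterator, List

Row = List[str]
Step = Callable[[Iterable[Row]], Iterator[Row]]


def read_csv(lines: Iterable[str]) -> Iterator[Row]:
    """Source: parse CSV lazily, yielding one row at a time."""
    yield from csv.reader(lines)


def grep(pattern: str) -> Step:
    """Step (hypothetical name): keep rows where any cell contains `pattern`."""
    def step(rows: Iterable[Row]) -> Iterator[Row]:
        return (row for row in rows if any(pattern in cell for cell in row))
    return step


def cut(*columns: int) -> Step:
    """Step (hypothetical name): keep only the given column indexes."""
    def step(rows: Iterable[Row]) -> Iterator[Row]:
        return ([row[i] for i in columns] for row in rows)
    return step


def head(n: int) -> Step:
    """Step (hypothetical name): pass through only the first `n` rows."""
    def step(rows: Iterable[Row]) -> Iterator[Row]:
        return itertools.islice(rows, n)
    return step


def pipe(source: Iterable[Row], *steps: Step) -> Iterable[Row]:
    """Chain the steps; nothing is read until the output is consumed."""
    for step in steps:
        source = step(source)
    return source


# Stand-in for a remote CSV source (assumption: any iterable of text lines).
data = io.StringIO("name,city\nann,London\nbob,Berlin\ncat,London\n")
result = list(pipe(read_csv(data), grep("London"), cut(0)))
print(result)  # [['ann'], ['cat']]
```

Because every step is a generator, a row flows through the whole chain before the next one is read, which is the property that would let the same chain sit behind an HTTP response without buffering the full dataset.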

> At one point we will reach the issue of running proprietary untrusted code
> on our server (with map, reduce and filter functions) - I'd propose to use
> a language that supports proper sandboxing underneath. (Not sure node.js
> does so). Also I don't like javascript as a data handling language: it's
> simply not designed for it (yes I'm a hopeless lambdahead).

That's definitely an issue, and I was thinking of abusing the fact that ScraperWiki is open source to build a really lightweight version that is *just* about sandboxed code execution (I know the old codebase pretty well) - also it's a good excuse to play with docker.io ;).  

I still haven't 100% figured out how we'd cleanly stream the data through the sandbox (unless the whole app was inside it), but I've got the various parts floating around in my head. I just need more coffee and time to make the ideas more concrete.

Ross.