[okfn-labs] [idea-rfc]: DataPipes - Streaming Online Data Transformation!

Rufus Pollock rufus.pollock at okfn.org
Wed May 8 17:33:44 UTC 2013


On 7 May 2013 09:20, Ross Jones <ross at servercode.co.uk> wrote:
> Hi,
>
> On 7 May 2013, at 09:05, Michael Bauer <michael.bauer at okfn.org> wrote:
>
>> I do like the idea of unix pipes - however it's hard for me to imagine it
>> working on a web level. Wouldn't this be just a transforming version of the
>> data proxy? Where I have a source, several steps and an output?
>
> There's a precedent for thinking that pipes can work on the web
> (http://www.webpipes.org/), I guess on the basis that they both work with

Nice link - had not come across this. I should also shout-out here to
Max Ogden's gut proposal for doing data transformation (that's very
webhook-y).

> streams (throughput and latency issues aside). Part of my thoughts on Rufus'
> original idea was to build a DSL that even though pulling data from remote
> locations, would do all of the processing locally, but I think I'm sold on
> trying this approach first (and I think there may already be more than one
> DSL for working with CSV/XSL etc).

:-) - this is an experiment!

>> At one point we will reach the issue of running proprietary untrusted code
>> on our server (with map, reduce and filter functions) - I'd propose to use
>> a language that supports proper sandboxing underneath. (Not sure node.js
>> does so). Also I don't like javascript as a data handling language: it's
>> simply not designed for it (yes I'm a hopeless lambdahead).
>
> That's definitely an issue, and I was thinking of abusing the fact that
> ScraperWiki is open source to build a really lightweight version that is
> *just* about sandboxed code execution (I know the old codebase pretty well)
> - also it's a good excuse to play with docker.io ;).
>
> I still haven't 100% figured out how we'd cleanly stream the data through
> the sandbox (unless the whole app was inside it) but I've got the various
> parts floating around in my head, I just need more coffee and time to make
> the ideas more concrete.

Would like to hear more once coffee has done its work!

Rufus



