[okfn-labs] [idea-rfc]: DataPipes - Streaming Online Data Transformation!

Lucy Chambers lucy.chambers at okfn.org
Mon May 6 19:05:26 UTC 2013


Hi Rufus,

I like the sentiment, I'm just wondering whether it would be easier to use
for a non-techie (if that is indeed your audience) than e.g. Open Refine to
do the same thing (e.g. removing commas)?

Or is the point that this would be automated so that you could run common
transformations automatically (e.g. without having to know commands in Open
Refine)?

Apologies if I've missed the point - not familiar with pipes :)

Perhaps a concrete example would help, and as I'm currently writing up an
ecosystem of tools for working with spending data, I'd be keen to offer up
spending as one if that would work!
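
As a concrete illustration of the comma-removal case on spending data, a
minimal Python sketch follows. It is not the Data Pipes code (the prototype
is written in node.js); the function names strip_commas and pipe_csv, the
column index and the sample rows are all hypothetical, and it assumes the
amounts are quoted in the CSV so that the commas sit inside one field.

```python
import csv
from typing import Iterable, Iterator, List, TextIO


def strip_commas(rows: Iterable[List[str]], column: int) -> Iterator[List[str]]:
    """Yield each row with commas removed from the given column.

    Rows are handled one at a time, so the whole file never has to be
    held in memory (the "streaming" part of the idea).
    """
    for row in rows:
        if column < len(row):
            cleaned = row[column].replace(",", "")
            row = row[:column] + [cleaned] + row[column + 1:]
        yield row


def pipe_csv(source: TextIO, sink: TextIO, column: int,
             has_header: bool = True) -> None:
    """Read CSV from source, clean one column, write CSV to sink."""
    reader = csv.reader(source)
    writer = csv.writer(sink, lineterminator="\n")
    if has_header:
        header = next(reader, None)
        if header is not None:
            # Pass the header row through untouched.
            writer.writerow(header)
    for row in strip_commas(reader, column):
        writer.writerow(row)
```

Called as pipe_csv(sys.stdin, sys.stdout, column=1), this would behave like
one stage in a shell pipeline: rows come in, the amount column is cleaned,
rows go out.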

Lucy






On 6 May 2013 12:19, Emanuil Tolev <emanuil at cottagelabs.com> wrote:

> Hi Rufus,
>
> Anything like http://pipes.yahoo.com/pipes/ ? (Note: I haven't had time
> to use it yet, so can't vouch for suitability, but it seems like the right
> thing.)
>
> I would be glad to see integratable components as well (but I like the
> streaming data idea).
> They probably exist, but mostly don't seem to match exactly what I'm
> looking for to do a specific job quickly, and then things like
> https://github.com/CottageLabs/metadata-enhancement/blob/master/csv_utils.py
> occur, and clearly many people need to do similar tasks :).
>
>
> Greetings,
> Emanuil
>
>
> On 6 May 2013 15:49, Rufus Pollock <rufus.pollock at okfn.org> wrote:
>
>> At last week's Open Data Maker Night here in London some of us [1]
>> started kicking around an idea we called *Data Pipes*. The basic pitch
>> was [2]:
>>
>> *Data Pipes would be a service to do streaming online data transformation.
>> Heavily inspired by unix shell with its pipes and utilities like cut,
>> grep, sed, sort, uniq etc. We want to work with streams so focus
>> (initially) is on CSV files.*
>>
>> As a demonstration of the idea the barest prototype has been put together:
>>
>> http://datapipes.okfnlabs.org/ (source code on github
>> <https://github.com/okfn/datapipes>)
>>
>> This is barely functional - there's just one working operation (delete)
>> atm - but there are plans for many more
>> <https://github.com/okfn/datapipes/issues/9> and I already like how
>> natural this feels in node.js.
>>
>> Is this useful? Do people have tips (e.g. how best to stream post data
>> in node.js <https://github.com/okfn/datapipes/issues/5>)? Is anyone up
>> for contributing <https://github.com/okfn/datapipes/issues>?
>>
>> Regards,
>>
>> Rufus
>>
>> [1]: specifically Ross Jones, James Smith, David Miller and myself. Plus,
>> from comments on IRC, I think Friedrich (Lindenberg) had also been thinking
>> along similar lines!
>>
>> [2]: the immediate motivation was a relatively non-techie participant at
>> the open data maker night who wanted to remove commas from amounts in a CSV
>> column before putting the data into OpenSpending. A common enough
>> requirement but one which would involve some spreadsheet-fu or scripting to
>> sort out. Why, we thought, shouldn't this just be a simple web-service ...
>>
>> _______________________________________________
>> okfn-labs mailing list
>> okfn-labs at lists.okfn.org
>> http://lists.okfn.org/mailman/listinfo/okfn-labs
>> Unsubscribe: http://lists.okfn.org/mailman/options/okfn-labs
>>
>>
>


-- 
*Project Coordinator*
School of Data <http://schoolofdata.org/> and
OpenSpending <http://openspending.org/>
Projects of the Open Knowledge Foundation <http://okfn.org/>
Support our work <http://okfn.org/support/>.