[okfn-labs] Datapackage + Bubbles Demo

Rufus Pollock rufus.pollock at okfn.org
Fri Feb 21 14:55:33 UTC 2014


On 20 February 2014 00:58, Stefan Urbanek <stefan.urbanek at gmail.com> wrote:

> Hi,
>
> Here is a short demo of Bubbles[1] using Data Package collection store:
>
> https://gist.github.com/Stiivi/9104719
>

This is fantastic! Not only is this directly useful, it is also a great
example of the tooling integration we want more of.

I've just created a wiki page here:
https://github.com/okfn/data.okfn.org/wiki where we can list examples like
this (and then migrate them, as merited, to the main data.okfn.org site -
e.g. at http://data.okfn.org/tools).

> In the Gist you will find the example Python code, a list of required
> datasets and their modifications and also stripped example output.
>
> The example is artificial, but at least shows:
>
> * how datapackage store is used - how to access datapackage resources as
> data objects
> * how Pipeline is constructed
> * simple master-detail join
> * aggregation with composite key
>
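
For anyone reading along without the Gist open, the last two points can be
sketched in plain Python. This is only an illustration of a master-detail
join followed by an aggregation on a composite key, not Bubbles' API; the
sample rows and the function names `join_details` and `aggregate` are made
up for the example.

```python
from collections import defaultdict

# Hypothetical sample data: one master row per country, many detail rows.
countries = [
    {"code": "GB", "region": "Europe"},
    {"code": "FR", "region": "Europe"},
    {"code": "JP", "region": "Asia"},
]
measurements = [
    {"country": "GB", "year": 2012, "value": 10},
    {"country": "FR", "year": 2012, "value": 5},
    {"country": "GB", "year": 2013, "value": 7},
    {"country": "JP", "year": 2013, "value": 3},
]


def join_details(master, detail, master_key, detail_key):
    """Yield each detail row extended with the fields of its master row."""
    lookup = {row[master_key]: row for row in master}
    for row in detail:
        match = lookup.get(row[detail_key])
        if match is not None:  # detail rows without a master are skipped
            yield {**match, **row}


def aggregate(rows, keys, measure):
    """Sum `measure` per group, where a group is the tuple of `keys` values."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[k] for k in keys)] += row[measure]
    return dict(totals)


joined = join_details(countries, measurements, "code", "country")
totals = aggregate(joined, ("region", "year"), "value")
```

Here the composite key is the pair (region, year), so `totals` holds one sum
per region and year. The join is a generator, so rows flow through one at a
time, in the spirit of the Python iterators mentioned further down.
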
> The "Data Package collection store" is a directory with datapackages in
> it. Data objects are named "PACKAGE.RESOURCE", if package has only one
> resource then just "PACKAGE".
>

I note that in the spec for dpm (the data package manager tool), installed
data packages go in a subdirectory named `datapackages` - see
https://github.com/okfn/dpm/issues/3 for more.
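
The naming rule quoted above could look roughly like this. It is a sketch
under assumptions, not the actual Bubbles code: it assumes every package
directory holds a `datapackage.json` whose `resources` entries each have a
`name`, it takes the package name from the directory name, and the function
name `object_names` is hypothetical.

```python
import json
from pathlib import Path


def object_names(store_dir):
    """List data object names for a directory of data packages.

    A package with exactly one resource is exposed as "PACKAGE"; otherwise
    each resource is exposed as "PACKAGE.RESOURCE".
    """
    names = []
    for package_dir in sorted(Path(store_dir).iterdir()):
        descriptor = package_dir / "datapackage.json"
        if not descriptor.is_file():
            continue  # not a data package, ignore it
        metadata = json.loads(descriptor.read_text(encoding="utf-8"))
        resources = metadata.get("resources", [])
        if len(resources) == 1:
            names.append(package_dir.name)
        else:
            # Assumption: every resource carries a "name" field.
            names.extend(f"{package_dir.name}.{res['name']}" for res in resources)
    return names
```

Pointing `store_dir` at a `datapackages` subdirectory would make the same
walk work with the dpm layout.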


> Note that if the same code was run on top of a SQL database source, then
> SQL queries (or maybe just one in this case) would be composed and executed
> instead of Python iterators. Transparently.
>

So you don't actually load the data package into a DB to do this? That's
interesting - it would also be nice to autoload data packages into a
relational DB (see relational databases in http://data.okfn.org/roadmap).


> My observation during the development: The Data Package and Simple Data
> Format is great. It just needs a bit of refinement and confrontation with
> real uses (by tools, not human eyes). It needs to focus more on
> machine-processable and easy-to-use metadata.
>

That's good to hear, and there are definitely improvements to be made. This
kind of usage is exactly what helps improve it!

Rufus


> Comments, questions and suggestions are welcome.
>
> Enjoy,
>
> Stefan Urbanek
>
> [1] https://github.com/stiivi/bubbles
>
> *Twitter:* @Stiivi
> *Personal:* stiivi.com
> *Data Brewery:* databrewery.org


-- 


Rufus Pollock
Founder and Executive Director | skype: rufuspollock | @rufuspollock <https://twitter.com/rufuspollock>
The Open Knowledge Foundation <http://okfn.org/>
Empowering through Open Knowledge
http://okfn.org/ | @okfn <http://twitter.com/OKFN> | OKF on Facebook <https://www.facebook.com/OKFNetwork> | Blog <http://blog.okfn.org/> | Newsletter <http://okfn.org/about/newsletter>