[ddj] A question about Google Refine

Damien Brunon damien.brunon at gmail.com
Thu Feb 7 13:29:41 UTC 2013


Hi everyone!

I'm Damien, I work for @jplusplus_ and I need some help with Google Refine.

I'm working on a database of health structures in France that are
dedicated to autistic people.

I downloaded a database of every structure that works with mentally
handicapped people. I succeeded in removing those that are not linked to
autistic people, but I still have a problem.

The thing is, some structures work with autistic people but also with other
mentally handicapped people. So in the table you get:

structure_activity1 ; structure_clients_type1 ; number_of_beds1 ;
structure_activity2 ; structure_clients_type2 ; number_of_beds2 etc.

For example, that gives:

Line 1) *General education / Intellectual deficient people / 6* ;
Professional education / autists / 24 ; General education / autists / 20
Line 2) Professional education / autists / 12 ; *Professional education /
Intellectual deficient people / 20*
Line 3) General education / autists / 10 ; Professional education /
autists / 24 ; *General education / Intellectual deficient people / 8*

What I want to do with Google Refine is delete every cell that doesn't
concern autism (like the ones marked with asterisks above) and end up with
only the information about autistic people, along with the number of beds.
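
To make the goal concrete, here is a minimal Python sketch of the intended
result, outside Google Refine. It assumes each row is a dict keyed by the
numbered column names listed above, and that a group of cells concerns
autism when its structure_clients_type value contains "autis". The function
name keep_autism_cells and that matching rule are assumptions made for
illustration, not something Google Refine provides.

```python
import re

def keep_autism_cells(row, marker="autis"):
    """Return a copy of row with every non-autism group of cells blanked.

    A group is the three columns sharing a numeric suffix:
    structure_activityN, structure_clients_typeN, number_of_bedsN.
    """
    cleaned = dict(row)
    for column, value in row.items():
        match = re.fullmatch(r"structure_clients_type(\d+)", column)
        if match is None:
            continue  # not a clients-type column
        if marker in (value or "").lower():
            continue  # this group concerns autism: keep it
        n = match.group(1)
        # Blank the three cells of the group, leaving the columns in place.
        for prefix in ("structure_activity", "structure_clients_type",
                       "number_of_beds"):
            key = prefix + n
            if key in cleaned:
                cleaned[key] = ""
    return cleaned

# Line 2 of the example above.
row = {
    "structure_activity1": "Professional education",
    "structure_clients_type1": "autists",
    "number_of_beds1": "12",
    "structure_activity2": "Professional education",
    "structure_clients_type2": "Intellectual deficient people",
    "number_of_beds2": "20",
}
cleaned = keep_autism_cells(row)
```

The point of the sketch is that cells are blanked group by group within each
row, so the columns themselves survive and the autism groups keep their
number of beds.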

So far I haven't succeeded, because every time I try to delete the things
I don't want using a facet, I end up deleting the whole column (which also
deletes things I want to keep).

One solution would be to concatenate all the structure_clients_type columns
into a new column, but how could I then extract only the numbers of beds
that concern autism?
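
A rough Python sketch of how the concatenation idea might work, under one
extra assumption: the activity and number of beds are concatenated together
with the client type, as "activity / clients type / beds" groups separated
by ";". Bed counts can then be recovered by splitting the string again. The
helper name autism_beds, the separators, and the "autis" matching rule are
all hypothetical.

```python
def autism_beds(concatenated, marker="autis"):
    """Return the bed counts of the autism groups in one concatenated cell."""
    beds = []
    for group in concatenated.split(";"):
        parts = [part.strip() for part in group.split("/")]
        if len(parts) != 3:
            continue  # skip empty or malformed groups
        activity, clients_type, number_of_beds = parts
        if marker in clients_type.lower():
            beds.append(int(number_of_beds))
    return beds

# Line 3 of the example above, concatenated into a single string.
line_3 = ("General education / autists / 10 ; "
          "Professional education / autists / 24 ; "
          "General education / Intellectual deficient people / 8")
beds = autism_beds(line_3)
total = sum(beds)
```

Keeping the bed count inside each group is what keeps it tied to its client
type; concatenating the client types alone would lose that link.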

So if you can help me that would be great!

-- 
Damien Brunon
damien.brunon at gmail.com
@silveroux

More information about the data-driven-journalism mailing list