[School-of-data] Introduction -- Simon Cropper

Peter Murray-Rust pm286 at cam.ac.uk
Mon Jun 16 07:32:42 UTC 2014

On Mon, Jun 16, 2014 at 7:00 AM, Simon Cropper <
simoncropper at fossworkflowguides.com> wrote:

>  Hello everyone,
> This is an introduction to myself, my experience, my skills and my
> interests.

> I am a scientist with 20+ years experience in pushing-and-pulling
> biological, financial and retail data (now colloquially called data
> wrangling, data munging and data analysis  :-) ). I have programmed in
> both the Windows and Linux platforms, and used a variety of packages and
> software languages to wrangle the data I need into a usable format for
> further analysis. Currently looking for new opportunities (euphemism for
> out-of-work  :-( ) and looking for ways to keep my brain active.
> ...

> As I am exploring some new tools in Python, I have thought of doing this
> analysis using Pandas or something similar. The code would be integrated
> into iPython Notebooks so others could view the methodology and augment
> where necessary, and managed in a GitHub repository.
I'm not (yet?) an expert Pythonista, but from the description of the problem
it sounds like you will need multivariate statistical methods. There are
lots of libraries. I would probably point you at R, but Pandas points you
at http://statsmodels.sourceforge.net. I would probably start with a
Principal Components method to get an idea of the shape of the data (are
there serious outliers, etc.) and then move to classification methods:
supervised and unsupervised, binary and multiple. You're almost certainly
going to have to deal with missing data.
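
A minimal sketch of that first pass, assuming the data sits in a pandas
DataFrame with numeric columns. The function name `pca_outlier_scan`, the
mean imputation of missing values and the 3-standard-deviation cutoff are
illustrative assumptions, not a recommendation from any particular library;
the principal components are computed with a plain NumPy SVD.

```python
import numpy as np
import pandas as pd


def pca_outlier_scan(df, n_components=2, z_cutoff=3.0):
    """Hypothetical helper: impute, standardise, project onto principal
    components and flag rows that sit far out on any component."""
    # Keep only numeric columns; PCA is not defined for text columns.
    numeric = df.select_dtypes(include="number")

    # Assumption: fill missing values with the column mean. This is the
    # crudest possible treatment and can understate variance.
    filled = numeric.fillna(numeric.mean())

    # Standardise each column so no variable dominates purely by scale.
    std = filled.std(ddof=0).replace(0, 1.0)
    z = (filled - filled.mean()) / std

    # PCA via singular value decomposition of the standardised matrix.
    u, s, _vt = np.linalg.svd(z.to_numpy(), full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)

    cols = [f"PC{i + 1}" for i in range(n_components)]
    scores_df = pd.DataFrame(scores, index=df.index, columns=cols)

    # Flag rows whose score on any retained component lies more than
    # z_cutoff standard deviations from zero (scores are centred).
    score_sd = scores_df.std(ddof=0).replace(0, 1.0)
    outliers = (scores_df.abs() / score_sd > z_cutoff).any(axis=1)

    return scores_df, explained[:n_components], outliers
```

Calling `scores, explained, outliers = pca_outlier_scan(df)` gives the
component scores to plot, the share of variance each component explains,
and a boolean Series marking candidate outliers to inspect before moving
on to classification.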

> By the by, I have been an active exponent of open source software, open
> content and open data. I am very interested in copyright and copyleft
> licensing as a means of ensuring information is readily available to the
> broader public.

Excellent. Have you met up with any OKF people in Australia? Melbourne is an
active centre.

> --
> Cheers Simon
>    Simon Cropper - Open Content Creator
>    W: http://www.simonchristophercropper.com
>    Free and Open Source Software Workflow Guides
>    ------------------------------------------------------------
>    Introduction               http://www.fossworkflowguides.com
>    GIS Packages           http://www.fossworkflowguides.com/gis
>    bash / Python    http://www.fossworkflowguides.com/scripting

Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dep. Of Chemistry
University of Cambridge