[School-of-data] Introduction -- Simon Cropper

Simon Cropper simoncropper at fossworkflowguides.com
Thu Jun 19 04:11:06 UTC 2014


Michael,

Sure, I would be interested. I have the bulk of this information at my 
fingertips at present, having just done a comprehensive review of the 
available resources for my own use. I have also found some documentation on 
how to integrate R with Pandas, and on how to do things in Pandas the 
same way as in R.
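
As a rough illustration of the kind of R-to-Pandas mapping I mean, here is 
a minimal sketch. The data frame and its column names (site, species, 
count) are made up for illustration, and the R calls in the comments are 
only approximate equivalents, not an exhaustive or authoritative list.

```python
import pandas as pd

# Made-up illustrative data; not taken from any real dataset.
df = pd.DataFrame(
    {
        "site": ["A", "A", "B", "B", "C"],
        "species": ["x", "y", "x", "y", "x"],
        "count": [3, 5, 2, None, 7],
    }
)

# R: head(df, 3)
first_rows = df.head(3)

# R: summary(df$count)
count_summary = df["count"].describe()

# R: subset(df, count > 2)
# Rows with a missing count compare as False and are left out.
filtered = df[df["count"] > 2]

# R: aggregate(count ~ site, data = df, FUN = mean)
# mean() skips missing values (NaN) by default.
mean_by_site = df.groupby("site")["count"].mean()

# R: sum(is.na(df$count))
n_missing = df["count"].isna().sum()

print(first_rows)
print(count_summary)
print(filtered)
print(mean_by_site)
print("missing counts:", n_missing)
```

Each result is an ordinary DataFrame or Series, so the steps can be chained 
or inspected one at a time in an iPython Notebook.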

I have also been actively reviewing the books available on the topic. 
Anyone interested can see my latest reviews here -- 
http://www.simonchristophercropper.com/TechnicalReviews.html
I am currently reviewing the soon-to-be-published book "Python for 
Finance", which uses Pandas to analyze financial data.

Where do you expect to publish this information? Who is the target 
audience? What timelines do you have in mind?

On 17/06/14 17:38, Michael Bauer wrote:
> Hi there,
>
> On Mon, Jun 16, 2014 at 08:32:42AM +0100, Peter Murray-Rust wrote:
>>> As I am exploring some new tools in Python, I have thought of doing this
>>> analysis using Pandas or something similar. The code would be integrated
>>> into iPython Notebooks so others could view the methodology and augment
>>> where necessary, and managed in a GitHub repository.
>>>
>>>
>> I'm not (yet?) an expert Pythonista but from the description of the problem
>> it sounds like you will need multivariate statistical methods. There are
>> lots of libraries - I would probably point you at R but Pandas points you
>> at http://statsmodels.sourceforge.net. I would probably start with a
>> Principal Components method to get an idea of the shape of the data - are
>> there serious outliers, etc. and then move to classification methods -
>> supervised and unsupervised, binary and multiple. You're almost certainly
>> going to have to deal with missing data.
>
> A while back we were thinking about introducing a more advanced framework
> for everyone who gets bored playing with spreadsheets ;) We were debating
> R vs. Python (although I'm a Python programmer, I did most of my data
> work in R, as pandas didn't exist when I started out). Would you want to
> write a short introduction to Python/pandas: what you need to start out and
> where to find further resources?
>
> Michael
>

-- 
Cheers Simon

    Simon Cropper - Open Content Creator

    Free and Open Source Software Workflow Guides
    ------------------------------------------------------------
    Introduction               http://www.fossworkflowguides.com
    GIS Packages           http://www.fossworkflowguides.com/gis
    bash / Python    http://www.fossworkflowguides.com/scripting


