[School-of-data] Introduction -- Simon Cropper
simoncropper at fossworkflowguides.com
Thu Jun 19 04:11:06 UTC 2014
Sure, I would be interested. I have the bulk of this information at my
fingertips at present, having just done a comprehensive review of the
available resources for my own use. I have also found some documentation
on how to integrate R with Pandas, and on how to do things in Pandas the
same way as in R.
I have also been actively reviewing the books available on the topic.
Anyone interested can see my latest reviews here --
I am currently reviewing the soon-to-be-published book "Python for
Finance", which uses Pandas to analyze financial data.
Where did you expect to publish this information? Who is the target
audience? What timelines do you have in mind?
On 17/06/14 17:38, Michael Bauer wrote:
> Hi there,
> On Mon, Jun 16, 2014 at 08:32:42AM +0100, Peter Murray-Rust wrote:
>>> As I am exploring some new tools in Python, I have thought of doing this
>>> analysis using Pandas or something similar. The code would be integrated
>>> into iPython Notebooks so others could view the methodology and augment
>>> where necessary, and managed in a GitHub repository.
>> I'm not (yet?) an expert Pythonista but from the description of the problem
>> it sounds like you will need multivariate statistical methods. There are
>> lots of libraries - I would probably point you at R but Pandas points you
>> at http://statsmodels.sourceforge.net. I would probably start with a
>> Principal Components method to get an idea of the shape of the data - are
>> there serious outliers, etc. and then move to classification methods -
>> supervised and unsupervised, binary and multiple. You're almost certainly
>> going to have to deal with missing data.
> A while back we were thinking about introducing a more advanced framework
> for everyone who gets bored playing with spreadsheets ;) We were debating
> on R vs. Python (Although I'm a python programmer I did most of my data
> work in R (pandas didn't exist when I started out)). Would you want to
> write a short introduction to python/pandas: what you need to start out and
> where to find further resources?
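For anyone following along, here is a rough sketch of the first step Peter
describes above: deal with the missing data, then run a principal components
pass to get an idea of the shape of the data and spot outliers. It assumes
pandas and NumPy are installed. The data frame, its column names and the
choice of mean imputation are made up for illustration only; they are not
from the dataset under discussion, and a real analysis may well need a
better treatment of missing values.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 6 observations of 3 variables, two values missing.
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 5.0, 30.0],
    "b": [2.1, 3.9, np.nan, 8.2, 9.8, 10.0],
    "c": [0.5, 0.4, 0.6, np.nan, 0.5, 5.0],
})

# Crudest possible handling of missing data (an assumption, not a
# recommendation): replace each gap with its column mean.
filled = df.fillna(df.mean())

# Standardise each column so no variable dominates because of its units.
z = (filled - filled.mean()) / filled.std(ddof=0)

# Principal components via singular value decomposition.
u, s, vt = np.linalg.svd(z.to_numpy(), full_matrices=False)
scores = pd.DataFrame(u * s, index=df.index, columns=["PC1", "PC2", "PC3"])

# Share of the total variance captured by each component.
explained = s**2 / np.sum(s**2)

# Observation lying furthest out along the first component.
outlier_row = scores["PC1"].abs().idxmax()

print(explained.round(3))
print(scores.round(2))
print("most extreme row on PC1:", outlier_row)
```

Plotting PC1 against PC2 in an iPython Notebook would be the natural next
step before moving on to any classification method.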
Simon Cropper - Open Content Creator
Free and Open Source Software Workflow Guides
GIS Packages http://www.fossworkflowguides.com/gis
bash / Python http://www.fossworkflowguides.com/scripting