[School-of-data] Python, pandas and ipython notebooks
simoncropper at fossworkflowguides.com
Thu Jun 19 04:00:05 UTC 2014
Is it your intent to release the python notebooks as open content or
will it be part of a paid course?
On 17/06/14 19:57, Tony.Hirst wrote:
> FWIW, I've been putting together some notebooks around pandas for an in
> production OU course.
> We can't share all the notebooks atm (version & quality control issues!
> Plus things are still in an early stage of development) but I'll try to
> post fragments for comment...
> The audience is computing students, so the tone and some of the exercises
> reflect that...
> Here's a draft of a section on pandas series and dataframe structures:
> The notebook is intended to be embedded within other materials and used as
> a workbook (if you've ever used Stroud, you may feel the resemblance in
> part!); which means reading and working through the activities, running
> each cell as you come to it, then perhaps editing the cell you just ran
> and running it again.
> Any issues or comments, please add them to the tracker at
> [Michael - I'll try to do a notebook intro post for ScoDa blog this week]
> On 17/06/2014 08:38, "Michael Bauer" <michael.bauer at okfn.org> wrote:
>> Hi there,
>> On Mon, Jun 16, 2014 at 08:32:42AM +0100, Peter Murray-Rust wrote:
>>>> As I am exploring some new tools in Python, I have thought of doing
>>>> analysis using Pandas or something similar. The code would be
>>>> into iPython Notebooks so others could view the methodology and
>>>> where necessary, and managed in a GitHub repository.
>>> I'm not (yet?) an expert Pythonista but from the description of the
>>> it sounds like you will need multivariate statistical methods. There are
>>> lots of libraries - I would probably point you at R but Pandas points
>>> at http://statsmodels.sourceforge.net. I would probably start with a
>>> Principal Components method to get an idea of the shape of the data - are
>>> there serious outliers, etc. - and then move to classification methods -
>>> supervised and unsupervised, binary and multiple. You're almost certainly
>>> going to have to deal with missing data.
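A minimal sketch of the first two steps suggested above (handle missing data, then look at principal components), using only pandas and NumPy. The column names and values are made up for illustration, mean imputation is just the simplest choice rather than a recommendation, and the SVD route is one common way to get principal components, not necessarily what statsmodels does internally.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: four samples, three numeric variables, two gaps.
df = pd.DataFrame(
    {
        "height": [1.0, 2.0, np.nan, 4.0],
        "weight": [2.0, 4.1, 6.2, 7.9],
        "age": [5.0, np.nan, 3.0, 1.0],
    }
)

# Step 1: deal with missing data. Filling with the column mean is the
# simplest option; a real dataset may need something more careful.
filled = df.fillna(df.mean())

# Step 2: standardise each column so no variable dominates by scale.
standardised = (filled - filled.mean()) / filled.std()

# Step 3: principal components via singular value decomposition.
u, s, vt = np.linalg.svd(standardised.to_numpy(), full_matrices=False)
explained = s**2 / np.sum(s**2)  # share of total variance per component
scores = pd.DataFrame(
    u * s,
    index=df.index,
    columns=[f"PC{i + 1}" for i in range(len(s))],
)

print(explained.round(3))
print(scores)
```

Plotting the first two columns of `scores` against each other is a quick way to spot outliers before moving on to classification.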
>> A while back we were thinking about introducing a more advanced framework
>> for everyone who gets bored playing with spreadsheets ;) We were debating
>> on R vs. Python (Although I'm a python programmer I did most of my data
>> work in R (pandas didn't exist when I started out)). Would you want to
>> write a short introduction to Python/pandas: what you need to start out and
>> where to find further resources?
>> Data Diva | skype: mihi_tr | @mihi_tr
>> Open Knowledge | School of Data
>> http://okfn.org | http://schoolofdata.org
>> GPG/PGP key: http://tentacleriot.eu/mihi.asc
>> school-of-data mailing list
>> school-of-data at lists.okfn.org
>> Unsubscribe: https://lists.okfn.org/mailman/options/school-of-data
Simon Cropper - Open Content Creator
Free and Open Source Software Workflow Guides
GIS Packages http://www.fossworkflowguides.com/gis
bash / Python http://www.fossworkflowguides.com/scripting