[School-of-data] Python, pandas and ipython notebooks

Tony.Hirst tony.hirst at open.ac.uk
Thu Jun 19 08:39:33 UTC 2014


Simon

The notebooks will be part of a paid course, but if we can find ways of
releasing things, I'd like to see as much as possible made open.

One approach I hope to discuss properly with the course team is to have
'open data investigations' as part of the course, where we set a data
investigation style challenge in public and get students to engage with
it.

I'll post updates to this list as and when I can.

FWIW, if you're interested in "OER" style materials around pandas and
IPython notebooks, I'd appreciate having someone to bounce ideas around
with about how to work up more exercises using ipythonblocks to get
across ideas about data shapes, as well as things like table merges,
the effects of pivot table operations, etc. First proof of concept here:
http://blog.ouseful.info/2014/03/26/visualising-pandas-dataframes-with-ipythonblocks-proof-of-concept/
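
As a rough illustration of the kind of shape changes such exercises might
cover, here is a minimal pandas sketch. The sales and managers tables and
their column names are invented for this example and are not taken from
the course notebooks or the blog post.

```python
import pandas as pd

# Hypothetical toy data, invented purely for illustration.
sales = pd.DataFrame({
    "region": ["north", "north", "south", "south"],
    "quarter": ["Q1", "Q2", "Q1", "Q2"],
    "units": [10, 12, 7, 9],
})
managers = pd.DataFrame({
    "region": ["north", "south"],
    "manager": ["Ann", "Bob"],
})

# Merge on the shared "region" key. Each region matches exactly one
# manager, so the row count is unchanged and one column is added.
merged = sales.merge(managers, on="region", how="left")

# Pivot from long to wide: one row per region, one column per quarter.
wide = merged.pivot_table(
    index="region", columns="quarter", values="units", aggfunc="sum"
)

# Compare the (rows, columns) shape before and after each operation.
print(sales.shape, merged.shape, wide.shape)
```

Printing the shapes side by side is one simple way to make the effect of
each operation visible before moving on to a coloured-grid view.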

tony

On 19/06/2014 05:00, "Simon Cropper" <simoncropper at fossworkflowguides.com>
wrote:

>Hi Tony,
>
>Is it your intent to release the python notebooks as open content or
>will it be part of a paid course?
>
>On 17/06/14 19:57, Tony.Hirst wrote:
>> FWIW, I've been putting together some notebooks around pandas for an in
>> production OU course.
>>
>> We can't share all the notebooks atm (version & quality control issues!
>> Plus things are still in an early stage of development) but I'll try to
>> post fragments for comment...
>>
>> The audience is computing students, so the tone and some of the
>> exercises reflect that...
>>
>> Here's a draft of a section on pandas series and dataframe structures:
>>
>> http://nbviewer.ipython.org/github/psychemedia/ou-tm351/blob/master/notebooks-RFC/Pandas%20Intro%20-%20RFC.ipynb
>>
>>
>> The notebook is intended to be embedded within other materials and used
>> as a workbook (if you've ever used Stroud, you may feel the resemblance
>> in part!); which means reading and working through the activities,
>> running each cell as you come to it, then perhaps editing the cell you
>> just ran and running it again.
>>
>> Any issues or comments, please add them to the tracker at
>> https://github.com/psychemedia/ou-tm351/issues
>>
>> tony
>>
>> [Michael - I'll try to do a notebook intro post for ScoDa blog this
>> week]
>>
>> On 17/06/2014 08:38, "Michael Bauer" <michael.bauer at okfn.org> wrote:
>>
>>> Hi there,
>>>
>>> On Mon, Jun 16, 2014 at 08:32:42AM +0100, Peter Murray-Rust wrote:
>>>>> As I am exploring some new tools in Python, I have thought of doing
>>>>> this analysis using Pandas or something similar. The code would be
>>>>> integrated into iPython Notebooks so others could view the methodology
>>>>> and augment where necessary, and managed in a GitHub repository.
>>>>>
>>>>>
>>>> I'm not (yet?) an expert Pythonista but from the description of the
>>>> problem it sounds like you will need multivariate statistical methods.
>>>> There are lots of libraries - I would probably point you at R but Pandas
>>>> points you at http://statsmodels.sourceforge.net. I would probably start
>>>> with a Principal Components method to get an idea of the shape of the
>>>> data - are there serious outliers, etc. and then move to classification
>>>> methods - supervised and unsupervised, binary and multiple. You're
>>>> almost certainly going to have to deal with missing data.
>>>
>>> A while back we were thinking about introducing a more advanced
>>> framework for everyone who gets bored playing with spreadsheets ;) We
>>> were debating on R vs. Python (Although I'm a python programmer I did
>>> most of my data work in R (pandas didn't exist when I started out)).
>>> Would you want to write a short introduction on python/pandas. What you
>>> need to start out and where to find further resources?
>>>
>>> Michael
>>>
>>> --
>>> Data Diva | skype: mihi_tr | @mihi_tr
>>> Open Knowledge | School of Data
>>> http://okfn.org | http://schoolofdata.org
>>> GPG/PGP key: http://tentacleriot.eu/mihi.asc
>>> _______________________________________________
>>> school-of-data mailing list
>>> school-of-data at lists.okfn.org
>>> https://lists.okfn.org/mailman/listinfo/school-of-data
>>> Unsubscribe: https://lists.okfn.org/mailman/options/school-of-data
>>
>> -- The Open University is incorporated by Royal Charter (RC 000391), an
>> exempt charity in England & Wales and a charity registered in Scotland
>> (SC 038302). The Open University is authorised and regulated by the
>> Financial Conduct Authority.
>>
>
>
>--
>Cheers Simon
>
>    Simon Cropper - Open Content Creator
>
>    Free and Open Source Software Workflow Guides
>    ------------------------------------------------------------
>    Introduction               http://www.fossworkflowguides.com
>    GIS Packages           http://www.fossworkflowguides.com/gis
>    bash / Python    http://www.fossworkflowguides.com/scripting



