[School-of-data] Python, pandas and ipython notebooks

dan mcquillan d.mcquillan at gold.ac.uk
Thu Jun 19 08:25:31 UTC 2014


I see that CrisisNET (http://blog.crisis.net/), the interesting new
Ushahidi project, is promoting pandas as a way to wrangle their data:
http://blog.crisis.net/get-crisis-data-with-python-and-pandas/

dan

On 17/06/14 10:57, Tony.Hirst wrote:
> FWIW, I've been putting together some notebooks around pandas for an in
> production OU course.
> 
> We can't share all the notebooks atm (version & quality control issues!
> Plus things are still in an early stage of development) but I'll try to
> post fragments for comment...
> 
> The audience is computing students, so the tone and some of the exercises
> reflect that...
> 
> Here's a draft of a section on pandas series and dataframe structures:
> http://nbviewer.ipython.org/github/psychemedia/ou-tm351/blob/master/notebooks-RFC/Pandas%20Intro%20-%20RFC.ipynb
> 
> 
> The notebook is intended to be embedded within other materials and used as
> a workbook (if you've ever used Stroud, you may feel the resemblance in
> part!); which means reading and working through the activities, running
> each cell as you come to it, then perhaps editing the cell you just ran
> and running it again.
> 
> Any issues or comments, please add them to the tracker at
> https://github.com/psychemedia/ou-tm351/issues
> 
> tony
> 
> [Michael - I'll try to do a notebook intro post for ScoDa blog this week]
> 
> On 17/06/2014 08:38, "Michael Bauer" <michael.bauer at okfn.org> wrote:
> 
>> Hi there,
>>
>> On Mon, Jun 16, 2014 at 08:32:42AM +0100, Peter Murray-Rust wrote:
>>>> As I am exploring some new tools in Python, I have thought of doing this
>>>> analysis using Pandas or something similar. The code would be integrated
>>>> into iPython Notebooks so others could view the methodology and augment
>>>> where necessary, and managed in a GitHub repository.
>>>>
>>>>
>>> I'm not (yet?) an expert Pythonista but from the description of the problem
>>> it sounds like you will need multivariate statistical methods. There are
>>> lots of libraries - I would probably point you at R but Pandas points you
>>> at http://statsmodels.sourceforge.net. I would probably start with a
>>> Principal Components method to get an idea of the shape of the data - are
>>> there serious outliers, etc. and then move to classification methods -
>>> supervised and unsupervised, binary and multiple. You're almost certainly
>>> going to have to deal with missing data.
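
A minimal sketch of the first steps suggested above (fill in missing values,
then run a principal components pass to see whether any rows are serious
outliers), using only pandas and NumPy. The toy data, the column names, the
median fill and the "5 x MAD" cut-off are all illustrative assumptions, not
anything from the thread; for real work a library such as statsmodels would
be the more usual route.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: three numeric columns, one missing value (row 4)
# and one deliberately extreme row (row 5).
df = pd.DataFrame({
    "a": [1.0, 1.2, 0.9, 1.1, np.nan, 9.0, 1.0, 1.3],
    "b": [2.0, 2.1, 1.9, 2.2, 2.0, 9.5, 2.1, 1.8],
    "c": [0.5, 0.4, 0.6, 0.5, 0.5, 8.0, 0.4, 0.6],
})

# Crude treatment of missing data: fill each gap with the column median
# (the median is less affected by the extreme row than the mean would be).
filled = df.fillna(df.median())

# Standardise each column so no single variable dominates the components.
z = (filled - filled.mean()) / filled.std()

# Principal components via a singular value decomposition of the
# standardised matrix: rows of vt are the component directions,
# u * s gives each row's scores on those components.
u, s, vt = np.linalg.svd(z.to_numpy(), full_matrices=False)
explained = s**2 / np.sum(s**2)  # share of variance per component
scores = pd.DataFrame(u * s, index=df.index, columns=["pc1", "pc2", "pc3"])

# Flag rows whose first-component score sits far from the median score,
# measured in units of the median absolute deviation (MAD).
pc1 = scores["pc1"]
distance = (pc1 - pc1.median()).abs()
mad = distance.median()
outliers = pc1[distance > 5 * mad]

print(explained.round(3))
print(outliers)
```

Rows flagged this way are candidates to inspect by hand before moving on to
any classification method, not rows to drop automatically.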
>>
>> A while back we were thinking about introducing a more advanced framework
>> for everyone who gets bored playing with spreadsheets ;) We were debating
>> on R vs. Python (Although I'm a python programmer I did most of my data
>> work in R (pandas didn't exist when I started out)). Would you want to
>> write a short introduction on python/pandas: what you need to start out
>> and where to find further resources?
>>
>> Michael
>>
>> --
>> Data Diva | skype: mihi_tr | @mihi_tr
>> Open Knowledge | School of Data
>> http://okfn.org | http://schoolofdata.org
>> GPG/PGP key: http://tentacleriot.eu/mihi.asc
>> _______________________________________________
>> school-of-data mailing list
>> school-of-data at lists.okfn.org
>> https://lists.okfn.org/mailman/listinfo/school-of-data
>> Unsubscribe: https://lists.okfn.org/mailman/options/school-of-data
> 
> 


-- 
Dr. Dan McQuillan
Lecturer in Creative & Social Computing, Goldsmiths, University of London
http://www.gold.ac.uk/computing/staff/d-mcquillan/


