[School-of-data] Introduction -- Simon Cropper

Simon Cropper simoncropper at fossworkflowguides.com
Mon Jun 16 06:00:49 UTC 2014

Hello everyone,

This is an introduction to myself, my experience, my skills and my 

I am a scientist with 20+ years' experience in pushing-and-pulling 
biological, financial and retail data (now colloquially called data 
wrangling, data munging and data analysis :-) ). I have programmed on 
both the Windows and Linux platforms, and used a variety of packages and 
programming languages to wrangle the data I need into a usable format for 
further analysis. I am currently looking for new opportunities (euphemism 
for out-of-work :-( ) and for ways to keep my brain active.

I have provided some links to my on-line profile and blog site.

I have interests in conservation, taxonomy/nomenclature, GIS, 
photography and food. I am an active coder as well.

Historically I used Visual FoxPro to conduct most of my 
wrangling/munging/analysis, leaning heavily on SQL and hand-crafted 
algorithms to collate, group and summarize data. Since Microsoft 
withdrew support for Visual FoxPro, I have been on the hunt for tools to 
replace the data manipulation capabilities of this package -- at present 
I am actively teaching myself Python (doing alright at the moment; see 
my GitHub <https://github.com/SimonChristopherCropper/RT_ChooseProfile> 
repository) and playing around with a variety of modules available 
to manage large datasets (e.g. pandas, petl). Although well versed in 
the use of a variety of relational databases and programming languages, 
I am also good at using Excel and other spreadsheets to conduct work on 
smaller datasets.

In the last few years I have also been actively involved in using Open 
Source GIS to conduct geospatial analysis (e.g. geocoding datasets). 
Although I cut my teeth on gvSIG, I am also interested in QGIS due to 
its integration with Python.

Although I have a biological bent in my career and years of knowledge of 
datasets in Australia, I have a variety of interests. One that I am 
interested in exploring is food. I have located a large dataset of foods 
available in Australia and their composition (e.g. protein, carbs, 
vitamins, fat, etc) -- it is available under a CC-BY-SA license. I am 
interested in exploring this dataset in more detail. I have a range of 
questions that interest me but wonder if others might want to help 
tease out some other useful information from this dataset.

Some questions I am interested in attempting to answer...
Q. Of the 50-odd nutrients documented in this dataset, which foods 
contain the most and the least of each? For example, what foods are high 
in calcium and could be used as a suitable natural alternative to milk?
Q. If you compare examples of a variety of well-known diets (paleo, 
Atkins, Mediterranean, wholefoods, vegan, vegetarian), how do they vary? 
What nutritional components differ?

As I am exploring some new tools in Python, I have thought of doing this 
analysis using pandas or something similar. The code would be integrated 
into IPython Notebooks so others could view the methodology and augment 
it where necessary, and managed in a GitHub repository.
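
To give a rough idea of the shape this could take, here is a minimal 
pandas sketch of the two questions above. Everything in it is an 
assumption for illustration only: the column names (calcium_mg, 
protein_g), the helper names (extremes, diet_profile), the food lists 
per diet and the numbers themselves are made-up placeholders, not values 
from the actual dataset.

```python
import pandas as pd

# Hypothetical stand-in for the food composition dataset: one row per food,
# one column per nutrient (amount per 100 g). All values are placeholders
# invented for this sketch, NOT taken from the real dataset.
foods = pd.DataFrame(
    {
        "food": ["milk", "kale", "almonds", "white rice", "sardines"],
        "calcium_mg": [120.0, 150.0, 260.0, 10.0, 380.0],
        "protein_g": [3.4, 4.3, 21.0, 2.7, 25.0],
    }
).set_index("food")


def extremes(table, nutrient, n=3):
    """Return the n foods with the most and the n with the least of a nutrient."""
    column = table[nutrient].dropna()  # ignore foods with no recorded value
    return column.nlargest(n), column.nsmallest(n)


def diet_profile(table, diets):
    """Mean nutrient content of the foods listed for each diet.

    The result has one row per nutrient and one column per diet.
    """
    return pd.DataFrame(
        {name: table.loc[items].mean() for name, items in diets.items()}
    )


# Question 1: which foods have the most and least calcium?
highest, lowest = extremes(foods, "calcium_mg", n=2)
print(highest)
print(lowest)

# Question 2: hypothetical food lists per diet, purely to show the comparison.
diets = {
    "vegan": ["kale", "almonds", "white rice"],
    "paleo": ["kale", "almonds", "sardines"],
}
profile = diet_profile(foods, diets)
print(profile)
```

With the real dataset the DataFrame would presumably be loaded from file 
(e.g. with pd.read_csv) rather than typed in, and a plain mean over a 
food list is only a crude proxy for a diet; weighting by typical serving 
sizes would probably be more meaningful.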

By the by, I have been an active proponent of open source software, 
open content and open data. I am very interested in copyright and 
copyleft licensing as a means of ensuring information is readily 
available to the broader public.

Cheers Simon

    Simon Cropper - Open Content Creator
    W: http://www.simonchristophercropper.com
    Free and Open Source Software Workflow Guides
    Introduction               http://www.fossworkflowguides.com
    GIS Packages           http://www.fossworkflowguides.com/gis
    bash / Python    http://www.fossworkflowguides.com/scripting
