[School-of-data] Introduction -- Simon Cropper
simoncropper at fossworkflowguides.com
Mon Jun 16 06:00:49 UTC 2014
This is an introduction to myself, my experience, my skills and my
I am a scientist with 20+ years' experience in pushing-and-pulling
biological, financial and retail data (now colloquially called data
wrangling, data munging and data analysis :-) ). I have programmed on
both the Windows and Linux platforms, and used a variety of packages and
programming languages to wrangle the data I need into a usable format for
further analysis. I am currently looking for new opportunities (euphemism
for out-of-work :-( ) and for ways to keep my brain active.
I have provided some links to my on-line profile and blog site.
I have interests in conservation, taxonomy/nomenclature, GIS,
photography and food. I am an active coder as well.
Historically I used Visual Foxpro to conduct most of my
wrangling/munging/analysis, leaning heavily on SQL and hand crafted
algorithms to collate, group and summarize data. Since Microsoft
withdrew support for Visual Foxpro, I have been on the hunt for tools to
replace the data manipulation capabilities of this package -- at present
I am actively teaching myself Python (doing alright at the moment; see
my GitHub <https://github.com/SimonChristopherCropper/RT_ChooseProfile>
repository) and playing around with a variety of modules available
to manage large datasets (e.g. pandas, petl). Although well versed in
the use of a variety of relational databases and a variety of
programming languages, I am also good at using Excel and other
spreadsheets to conduct work on smaller datasets.
In the last few years I have also been actively involved in using Open
Source GIS to conduct geospatial analysis (e.g. geocoding datasets).
Although I cut my teeth on gvSIG, I am also interested in QGIS due to its
integration with Python.
Although I have a biological bent in my career and years of knowledge on
datasets in Australia, I have a variety of interests. One that I am
interested in exploring is food. I have located a large dataset of food
available in Australia and its composition (e.g. protein, carbs,
vitamins, fat, etc.) -- it is available under a CC-BY-SA license. I am
interested in exploring this dataset in more detail. I have a range of
questions that interest me but wonder if others might want to help
tease out some other useful information from this dataset.
Some questions I am interested in attempting to answer...
Q. Of the 50-odd nutrients documented in this dataset, which foods
contain the most and the least of each? For example, what foods are high
in calcium that could be used as a suitable natural alternative to milk?
Q. If you compare examples of a variety of well-known diets (paleo,
Atkins, Mediterranean, wholefoods, vegan, vegetarian), how do they vary?
What nutritional components differ?
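A rough sketch of how the first question might be approached with pandas. I have not settled on the real dataset's layout, so the food names, column names and numbers below are placeholders made up for illustration, and the `extremes` helper is hypothetical:

```python
import pandas as pd

# Placeholder table: names and values are invented for illustration only,
# they are NOT taken from the actual food composition dataset.
foods = pd.DataFrame(
    {
        "food": ["food_a", "food_b", "food_c", "food_d"],
        "calcium_mg": [120.0, 15.0, 300.0, 45.0],
        "protein_g": [3.5, 20.0, 8.0, 1.0],
    }
).set_index("food")


def extremes(table, nutrient, n=2):
    """Return the n foods with the most and the n with the least of a nutrient."""
    column = table[nutrient].dropna()  # ignore foods with no recorded value
    return column.nlargest(n), column.nsmallest(n)


highest, lowest = extremes(foods, "calcium_mg")
print(highest)
print(lowest)
```

Looping `extremes` over every nutrient column would give a most/least table for the whole dataset.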
As I am exploring some new tools in Python, I have thought of doing this
analysis using pandas or something similar. The code would be integrated
into IPython Notebooks so others could view the methodology and augment
where necessary, and managed in a GitHub repository.
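To give a feel for what a notebook cell for the diet comparison might look like, here is a minimal sketch. It assumes each diet is represented by a list of foods eaten on a sample day; the diet menus, column names and numbers are all made up for illustration and do not come from the dataset:

```python
import pandas as pd

# Placeholder sample-day records, one row per food eaten under each diet.
# All values are invented for illustration only.
meals = pd.DataFrame(
    {
        "diet": ["vegan", "vegan", "paleo", "paleo"],
        "food": ["food_a", "food_b", "food_c", "food_d"],
        "protein_g": [10.0, 5.0, 30.0, 20.0],
        "carbs_g": [60.0, 40.0, 10.0, 20.0],
    }
)

# Total each nutrient per diet.
totals = meals.groupby("diet")[["protein_g", "carbs_g"]].sum()

# Nutrient-by-nutrient difference between two of the diets.
difference = totals.loc["paleo"] - totals.loc["vegan"]

print(totals)
print(difference)
```

With real menus joined against the composition table, the same groupby would cover all 50-odd nutrients at once.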
By the by, I have been an active exponent of open source software,
open content and open data. I am very interested in copyright and
copyleft licensing as a means of ensuring information is readily
available to the broader public.
Simon Cropper - Open Content Creator
Free and Open Source Software Workflow Guides
GIS Packages http://www.fossworkflowguides.com/gis
bash / Python http://www.fossworkflowguides.com/scripting