[ddj] SQL Vs Excel Vs Refine

Joe Germuska joe at germuska.com
Mon Apr 29 15:21:57 UTC 2013


I like all of Sharon's answers quite a bit, especially "it was designed for subsetting, slicing and dicing" and "easier to go back and check my work."

But I sometimes wonder: when people ask "should I learn SQL?" aren't they usually asking "is SQL really worth all the arcana of installing MySQL or Postgres?" Unfortunately, it is pretty arcane, although the MAMP/WAMP package seems to be a pretty good way to get going.  

I'm sure people will chime in with other favorite installers, packages, GUI admin tools and the like, but I'm afraid that the plethora of responses is just going to reinforce for many journalists the basic problem—it can quickly become its own adventure. A great adventure, like learning to cook food from scratch instead of from kits and convenience packages, but an adventure nevertheless…
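
For anyone weighing that setup cost, here is a minimal sketch (not from anyone in this thread) of the kind of multi-table "slicing and dicing" Sharon describes. It assumes Python's bundled sqlite3 module instead of MySQL or Postgres, so there is no server to install. The table names, column names and figures are all invented for illustration.

```python
import sqlite3

# Hypothetical data: these tables, columns and values are made up for illustration.
SETUP_SQL = """
CREATE TABLE agencies (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE contracts (
    id INTEGER PRIMARY KEY,
    agency_id INTEGER REFERENCES agencies(id),
    amount INTEGER
);
INSERT INTO agencies (id, name) VALUES (1, 'Parks'), (2, 'Transit');
INSERT INTO contracts (agency_id, amount) VALUES (1, 500), (1, 1500), (2, 7000);
"""

# The query is plain text, so it could just as well live in a .sql file
# that a colleague re-runs later to check the work.
REPORT_SQL = """
SELECT a.name, COUNT(c.id) AS n_contracts, SUM(c.amount) AS total
FROM agencies AS a
JOIN contracts AS c ON c.agency_id = a.id
WHERE c.amount >= 1000
GROUP BY a.name
ORDER BY total DESC;
"""


def run_report():
    """Build the toy database in memory and return the report rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SETUP_SQL)
        return conn.execute(REPORT_SQL).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in run_report():
        print(row)
```

The join, the filter and the totals sit in one readable statement, which is roughly the "go back and check my work" advantage. Whether SQLite is enough for a given project is a separate question.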

Joe



On Apr 29, 2013, at 7:46 AM, SMachlis at computerworld.com wrote:

> What I like about SQL:
> 
> = It was *designed* for subsetting, slicing and dicing data. Yes, I can do this to a large degree with Excel and Google Refine; but with a more complex project -- especially, as others have pointed out, with data that has one or more relationships between multiple tables -- there are times when I find using a tool designed for the job to be less frustrating and considerably more robust.
> 
> = If I am dealing with a large data set that is already in multiple tables, SQL makes more sense to me than trying to shoehorn that data into an Excel-friendly format.
> 
> = It helps me think about data in a more structured way, which is very useful when I've got projects where I'm collecting and storing my own data.
> 
> = It helps me understand what sorts of data I can and can't reasonably request from government agencies that store their data in structured databases.
> 
> = If I am sharing data with colleagues, sometimes it's useful to be able to put up a simple PHP/MySQL app on our intranet (Rails or Django might be a better choice for this, but the shared internal server I have access to does not include those platforms). Even if I'm creating a Web application with a third-party service such as Caspio, I find it helpful to be able to think about data in relational terms.
> 
> = Having a series of SQL commands I can store in a file makes it easier for me or others to go back and check my work, versus a series of Excel point-and-click operations (or even multiple macros buried in Excel).
> 
> Sharon Machlis
> 
> ________________________________________
> From: data-driven-journalism-bounces at lists.okfn.org [data-driven-journalism-bounces at lists.okfn.org] On Behalf Of Andrew Duffy [andrewjamesduffy at gmail.com]
> Sent: Monday, April 29, 2013 12:37 AM
> To: data-driven-journalism at lists.okfn.org
> Subject: [ddj] SQL Vs Excel Vs Refine
> 
> Question:
> 
> Are there any data journalists/devs out there that can advise as to whether it's worth learning SQL? So far a combination of Excel/Google Refine has been more than enough for dumping, organising, and cleaning my data projects, but I have only worked with spreadsheets up to ~500 rows.
> 
> What can SQL do that Refine/Excel can't?
> 
> --
> 
> Andrew Duffy - Journalist
> 
> 
> 

-- 
Joe Germuska
Joe at Germuska.com * http://blog.germuska.com * http://twitter.com/JoeGermuska    

"Science's job is to map our ignorance." --David Byrne




