[Os-datawrangling] NYC data file (Code for DC)

Anders Pedersen anders.pedersen at okfn.org
Tue Aug 27 13:14:31 UTC 2013


Hi CJ,

Thanks a lot for getting back to me on this. Great to hear that you'll be
working on these data. The OpenSpending data wrangling list might be of help
with some of this, so feel free to join:
http://lists.okfn.org/mailman/listinfo/os-datawrangling

Notes below:

Best,
Anders

---------- Forwarded message ----------
From: CJ Gehin-Scott <cjgehinscott at me.com>
Date: 26 August 2013 15:07
Subject: NYC data file (Code for DC)
To: Anders Pedersen <anders.pedersen at okfn.org>


Anders,

How is it going?  I got the NYC Spending data file loaded into Sequel Pro
and it is in a much more manageable format.  I know you were talking about
a unique identifier at the meetup, and I was wondering if you meant one for
each data point or one for a particular transaction type?

Awesome. Each row in the dataset should have its own unique identifier,
e.g. a number.
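
As a rough illustration of the per-row identifier described above, here is a
minimal Python sketch that numbers the rows of a CSV export. The field name
`row_id`, the sample columns `agency` and `amount`, and the helper names are
all made up for the example; nothing here comes from the NYC file itself.

```python
import csv
import io


def add_row_ids(rows, id_field="row_id", start=1):
    """Yield each row (a dict) with a sequential integer identifier added."""
    for number, row in enumerate(rows, start=start):
        if id_field in row:
            # Refuse to overwrite a column that is already in the data.
            raise ValueError(f"column {id_field!r} already exists")
        yield {id_field: number, **row}


def number_csv(text, id_field="row_id"):
    """Return CSV text with an identifier column prepended to every row."""
    reader = csv.DictReader(io.StringIO(text))
    numbered = list(add_row_ids(reader, id_field))
    out = io.StringIO()
    fieldnames = [id_field] + list(reader.fieldnames or [])
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(numbered)
    return out.getvalue()


# Hypothetical sample: two rows with identical values still get distinct ids.
SAMPLE_CSV = "agency,amount\nParks,100.00\nParks,100.00\n"
numbered_csv = number_csv(SAMPLE_CSV)
```

The identifier is attached to the row, not to the transaction type, so two
rows that look the same remain distinguishable after the load.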

Where should I go from here? Should I start sifting through the data for
errors and quality or is that something that will be done after it is
posted to openspending?

It is important that no rows in the dataset are left blank. Given the size
of the dataset, I would suggest slicing it up for the data load, for example
into quarterly or monthly chunks. This will help you avoid dealing with big
loads that might get stuck on a single erroneous row.
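
The slicing suggested above could be sketched roughly as follows in Python.
This is only an illustration under stated assumptions: the date column name
`issue_date`, the `YYYY-MM-DD` date format, and the function names are
hypothetical, and "blank" is read here as a row whose fields are all empty.

```python
from collections import defaultdict
from datetime import datetime


def period_key(date_text, period="quarter", date_format="%Y-%m-%d"):
    """Return a label such as '2013-Q1' or '2013-01' for a date string."""
    day = datetime.strptime(date_text, date_format)
    if period == "month":
        return f"{day.year}-{day.month:02d}"
    if period == "quarter":
        return f"{day.year}-Q{(day.month - 1) // 3 + 1}"
    raise ValueError(f"unknown period: {period!r}")


def is_blank(row):
    """Assumption: a row counts as blank when every field is empty."""
    return all(value is None or str(value).strip() == "" for value in row.values())


def split_for_loading(rows, date_field="issue_date", period="quarter"):
    """Group rows into per-period chunks; set aside blank or undatable rows."""
    if period not in ("month", "quarter"):
        raise ValueError(f"unknown period: {period!r}")
    chunks = defaultdict(list)
    rejected = []
    for row in rows:
        if is_blank(row):
            rejected.append(row)
            continue
        try:
            key = period_key(row[date_field], period)
        except (KeyError, TypeError, ValueError):
            # Missing or unparseable date: keep the row out of every chunk.
            rejected.append(row)
            continue
        chunks[key].append(row)
    return dict(chunks), rejected
```

Each chunk can then be loaded separately, so a problem row only blocks its
own month or quarter, and the rejected rows can be reviewed by hand.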

I hope all is well and that you had a great weekend!

Thanks, glad you're staying on this project!

Regards,
CJ



-- 

Anders Pedersen

Community Coordinator  |  skype: anpehej  |  @anpe <https://twitter.com/>

The Open Knowledge Foundation <http://okfn.org/>

Empowering through Open Knowledge

http://okfn.org/  |  @okfn <http://twitter.com/OKFN>  |
OKF on Facebook <https://www.facebook.com/OKFNetwork>  |
Blog <http://blog.okfn.org/>  |  Newsletter <http://okfn.org/about/newsletter>


OpenSpending | http://openspending.org | @openspending <http://twitter.com/openspending>

School of Data | http://schoolofdata.org | @schoolofdata <http://twitter.com/schoolofdata>

