[okfn-labs] Python Iterator over table (csv) *columns*
Edgar Zanella Alvarenga
e at vaz.io
Wed Dec 17 14:52:57 UTC 2014
You can use read_csv from Pandas:
http://pandas.pydata.org/pandas-docs/version/0.13.1/generated/pandas.io.parsers.read_csv.html
    usecols : array-like
        Return a subset of the columns. Results in much faster parsing
        time and lower memory usage.
and pass the columns to the `usecols` argument. If you have a problem
with the size of the csv file you can read it in chunks with:

    pandas.read_csv(filepath, sep=DELIMITER,
                    skiprows=INITIAL_LINES_TO_SKIP, chunksize=10000)

and change the value of INITIAL_LINES_TO_SKIP in your iteration.
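
Putting the two suggestions together, a minimal sketch might look like the following. It assumes a reasonably recent pandas; the sample data and the helper names `iter_column_chunks` and `iter_column_values` are hypothetical and only serve the illustration. Note that when `chunksize` is given, `read_csv` returns an iterator of DataFrames, so looping over it is usually enough and `skiprows` does not have to be advanced by hand.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for a large CSV file on disk.
SAMPLE_CSV = "id,name,score\n1,a,10\n2,b,20\n3,c,30\n4,d,40\n5,e,50\n"


def iter_column_chunks(source, columns, chunksize=2, sep=","):
    """Yield DataFrames containing only `columns`, `chunksize` rows at a time."""
    # With chunksize set, read_csv returns an iterator instead of one DataFrame.
    reader = pd.read_csv(source, sep=sep, usecols=columns, chunksize=chunksize)
    for chunk in reader:
        yield chunk


def iter_column_values(source, column, chunksize=2, sep=","):
    """Yield the values of a single column one by one."""
    for chunk in iter_column_chunks(source, [column], chunksize, sep):
        # tolist() converts the chunk's column to plain Python values.
        yield from chunk[column].tolist()


# A file path works the same way as the in-memory buffer used here.
scores = list(iter_column_values(io.StringIO(SAMPLE_CSV), "score"))
```

Only one chunk of the selected column is held in memory at a time, so the chunk size can be tuned to the memory available. Each column still requires its own pass over the file.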
Edgar
On 17/12/2014 05:23, Paul Walsh wrote:
> Hi,
>
> Does anyone have or know of a nice (existing) solution for iterative
> reading of CSV/table data by *column*? It needs to be an iterator - I
> don’t want everything in memory.