[okfn-labs] Python Iterator over table (csv) *columns*

Edgar Zanella Alvarenga e at vaz.io
Wed Dec 17 14:52:57 UTC 2014

You can use read_csv from Pandas:


    usecols : array-like
        Return a subset of the columns. Results in much faster parsing
        time and lower memory usage.

and pass the columns you want to the `usecols` argument. If the CSV file
is too large to fit in memory, you can read it in chunks with:

pandas.read_csv(filepath, sep=DELIMITER, skiprows=INITIAL_LINES_TO_SKIP,
                chunksize=10000)

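With chunksize set, read_csv returns an iterator that yields one
DataFrame of up to 10000 rows at a time, so you can loop over it
directly. INITIAL_LINES_TO_SKIP only needs to cover any leading lines
you want to ignore.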
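
A minimal sketch of how `usecols` and `chunksize` might be combined to
walk a CSV file column by column without loading it all into memory.
The helper names `iter_column_chunks` and `iter_columns` are
hypothetical, not part of pandas, and the sketch assumes the file has a
header row and can be re-read once per column:

```python
import pandas as pd


def iter_column_chunks(path, column, chunksize=10000, **read_csv_kwargs):
    """Yield the values of one column as a sequence of pandas Series.

    Only `column` is parsed (usecols), and only `chunksize` rows are
    held in memory at a time.
    """
    reader = pd.read_csv(path, usecols=[column], chunksize=chunksize,
                         **read_csv_kwargs)
    for chunk in reader:
        yield chunk[column]


def iter_columns(path, chunksize=10000, **read_csv_kwargs):
    """Yield (column_name, chunk_iterator) pairs, one per column.

    The header is read first with nrows=0, then the file is re-read
    once for each column (a trade of I/O for memory).
    """
    header = pd.read_csv(path, nrows=0, **read_csv_kwargs).columns
    for name in header:
        yield name, iter_column_chunks(path, name, chunksize,
                                       **read_csv_kwargs)
```

Usage would look something like `for name, chunks in iter_columns(filepath, sep=DELIMITER): ...`,
consuming each `chunks` iterator inside the loop. The file is scanned
once per column, so this favours low memory use over speed.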


On 17/12/2014 05:23, Paul Walsh wrote:
> Hi,
> Does anyone have or know of a nice (existing) solution for iterative
> reading of CSV/table data by *column*? It needs to be an iterator - I
> don’t want everything in memory.
