[okfn-labs] Python Iterator over table (csv) *columns*

Paul Walsh paulywalsh at gmail.com
Wed Dec 17 09:35:12 UTC 2014


Sure. JSON Table Schema validation and reporting tools. I’m simply looking at ways to run over the data without loading it all into memory. It could be that for certain checks I’ll have to anyway.
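
To make the goal concrete, here is a minimal sketch of the fallback discussed in the quoted replies below: re-reading the file once per column, so that only one row is in memory at a time. It uses only the standard csv module. The function names (iter_column, iter_columns), the UTF-8 default and the assumption that the first row is a header are illustrative choices, not an existing library API.

```python
import csv


def iter_column(path, index, encoding="utf-8"):
    """Yield the values of one column, streaming the file row by row."""
    with open(path, newline="", encoding=encoding) as f:
        for row in csv.reader(f):
            # Short rows yield None instead of raising IndexError.
            yield row[index] if index < len(row) else None


def iter_columns(path, encoding="utf-8"):
    """Yield (header, value_iterator) pairs, one file pass per column.

    Assumes the first row of the file is a header row.
    """
    with open(path, newline="", encoding=encoding) as f:
        headers = next(csv.reader(f), [])
    for i, name in enumerate(headers):
        values = iter_column(path, i, encoding)
        next(values, None)  # skip the header row
        yield name, values
```

Each value iterator opens the file lazily on first use, so the cost is N passes over the file for N columns, which is the performance problem raised in the thread.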


> On 17 Dec 2014, at 11:25, Tarek Amr <tarekamr at gmail.com> wrote:
> 
> If I may ask, what is the main task you want to achieve? Maybe we can tailor a workaround based on it.
> 
> On Wed, Dec 17, 2014 at 10:22 AM, Friedrich Lindenberg <friedrich.lindenberg at okfn.org> wrote:
> So you're looking for something that has better performance than reading the whole file N (number of cols) times? That seems hard. The only thing you might be able to do is cache the line lengths, so you can read without seeking newlines and end-quotes after the first run... 
> 
> - Friedrich 
> 
> On Wed, Dec 17, 2014 at 10:17 AM, Paul Walsh <paulywalsh at gmail.com> wrote:
> Yes I know, I’m looking for some magic, or at least some possible approaches that anyone may have used in some context.
> 
>> On 17 Dec 2014, at 10:45, Tarek Amr <tarekamr at gmail.com> wrote:
>> 
>> I do not think this is possible. There is no way to tell beforehand where each new line starts without reading through the whole file to search for '\n's. Also, cells in a CSV are not of a fixed size, so there is no way to build indices for cells without reading each whole line and looking for separators.
>> 
>> Nevertheless, maybe there is some magical solution out there that I don't know.
>> 
>> 
>> On Wed, Dec 17, 2014 at 8:23 AM, Paul Walsh <paulywalsh at gmail.com> wrote:
>> Hi,
>> 
>> Does anyone have or know of a nice (existing) solution for iterative reading of CSV/table data by *column*? It needs to be an iterator - I don’t want everything in memory.
>> _______________________________________________
>> okfn-labs mailing list
>> okfn-labs at lists.okfn.org
>> https://lists.okfn.org/mailman/listinfo/okfn-labs
>> Unsubscribe: https://lists.okfn.org/mailman/options/okfn-labs
>> 
>> 
>> -- 
>> Best Regards
>> Tarek Amr
>> 
>> http://tarekamr.appspot.com/
>> 
> 

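Friedrich's suggestion quoted above, caching where each line starts during a first pass, could look roughly like the sketch below. It records the byte offset of every CSV record (so quoted newlines are handled) and then uses those offsets to jump straight to a record later. The names index_records and read_record are hypothetical, and the sketch assumes a UTF-8 encoded file, where splitting the raw bytes on newlines is safe.

```python
import csv
import io


def index_records(path, encoding="utf-8"):
    """First pass: return the byte offset at which each CSV record starts."""
    offsets = []
    with open(path, "rb") as f:
        pos = 0  # bytes consumed so far

        def lines():
            nonlocal pos
            for raw in iter(f.readline, b""):
                pos += len(raw)
                yield raw.decode(encoding)

        start = 0
        # csv.reader pulls only as many lines as one record needs,
        # so pos points at the start of the next record after each step.
        for _ in csv.reader(lines()):
            offsets.append(start)
            start = pos
    return offsets


def read_record(path, offsets, i, encoding="utf-8"):
    """Later passes: seek to record i and parse only its bytes."""
    with open(path, "rb") as f:
        f.seek(offsets[i])
        if i + 1 < len(offsets):
            raw = f.read(offsets[i + 1] - offsets[i])
        else:
            raw = f.read()  # last record: read to end of file
    return next(csv.reader(io.StringIO(raw.decode(encoding), newline="")))
```

The offset list costs one integer per record, which is far smaller than the data itself, but every record still has to be parsed to extract a single cell, so this only saves the scan for newlines and closing quotes.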