Iterate over an Excel file (.xls/.xlsx) without loading the data into the memory · Issue #11064 · pandas-dev/pandas · GitHub
Iterate over an Excel file (.xls/.xlsx) without loading the data into the memory #11064
@denizbilgili

I believe this is more of a feature request than a call for help, as I don't think pandas can currently help me with this. Here is my Stack Overflow thread about the issue: http://stackoverflow.com/questions/32523083/iterator-to-iterate-over-excel-file-in-python

I work with large Excel files that I use as databases. Most of the time I need the raw data only once, but with the read_excel() method I have to load all of it into memory anyway. Sadly, my calculations and data arrangement take far less time than my script spends reading the Excel data.

It would be nice if pandas had a feature that returned iterator objects over specific row/column ranges. As I said in my Stack Overflow thread, people could build their own iterators if pandas had a method that returned a single cell value. Alternatively, you could pass it a range of rows or columns and it could return a generator.
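
To make the request concrete, here is a rough sketch of the kind of row-range generator I mean. It is not a pandas API. It uses only the Python standard library and relies on an .xlsx file being a zip archive of XML parts, so the sheet can be stream-parsed one row at a time. The names `iter_xlsx_rows`, `_cell_value` and `_shared_strings` are made up for illustration. The sketch also assumes that the sheet lives at `xl/worksheets/sheet1.xml`, that empty cells can simply be skipped rather than padded, and that dates and booleans can stay as raw values. It does not handle the binary .xls format.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespace used by SpreadsheetML worksheet and shared-string parts.
NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def _shared_strings(archive):
    """Return the shared-string table as a list (empty if the part is absent)."""
    if "xl/sharedStrings.xml" not in archive.namelist():
        return []
    with archive.open("xl/sharedStrings.xml") as fh:
        root = ET.parse(fh).getroot()
    # An <si> entry may be split into several <t> runs; join them.
    return ["".join(t.text or "" for t in si.iter(NS + "t"))
            for si in root.iter(NS + "si")]


def _cell_value(cell, strings):
    """Convert one <c> element to a Python value (simplified type handling)."""
    kind = cell.get("t", "n")
    if kind == "inlineStr":
        return "".join(t.text or "" for t in cell.iter(NS + "t"))
    v = cell.find(NS + "v")
    if v is None or v.text is None:
        return None
    if kind == "s":  # index into the shared-string table
        return strings[int(v.text)]
    if kind == "n":  # plain number (dates are stored as numbers too)
        return float(v.text)
    return v.text  # booleans, errors, cached formula strings: raw text


def iter_xlsx_rows(path, sheet_part="xl/worksheets/sheet1.xml",
                   min_row=1, max_row=None):
    """Yield (row_number, [values]) lazily for rows min_row..max_row."""
    with zipfile.ZipFile(path) as archive:
        strings = _shared_strings(archive)
        with archive.open(sheet_part) as fh:
            number = 0
            for _event, elem in ET.iterparse(fh, events=("end",)):
                if elem.tag != NS + "row":
                    continue
                # The "r" attribute is optional; fall back to counting rows.
                number = int(elem.get("r", number + 1))
                if max_row is not None and number > max_row:
                    return
                if number >= min_row:
                    cells = elem.findall(NS + "c")
                    yield number, [_cell_value(c, strings) for c in cells]
                elem.clear()  # drop this row's cell data before moving on
```

With this, something like `for number, values in iter_xlsx_rows("big.xlsx", min_row=2, max_row=1000): ...` would walk a row range without building a DataFrame first (the file name is only a placeholder). The shared-string table is still read up front, so this is lazy over rows, not over the whole workbook.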

If this already exists, or if there are methods for this kind of application or anything else that can help me deal with large Excel files, I would like to know. Thank you.
