Use buffer pools for reads from disk + in-memory shuffling #105

### Description of feature

In theory, it should be possible to pre-allocate and then reuse memory for:

1. I/O operations (i.e., reads from disk per AnnData object). For sparse data, this could prove challenging (although I’d guess not impossible) because the `data` and `indices` reads vary in size. Upper bounds on how much memory needs to be preallocated should certainly be derivable from `indptr`.
2. The in-memory shuffle (i.e., we concatenate the read-from-disk data directly into preallocated buffers, shuffle the data that was put into that buffer, yield, and repeat after the next read). This suffers from the same problem as above: uneven buffer sizes for sparse matrices.

Handling leftover data for the second buffer (i.e., the concat buffer) might make this challenging, but it’s also possible that we can at least upper-bound the needed memory and then track how much of the needed data is stored and where.
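One possible shape for the leftover tracking, sketched for the dense case only: a fixed-capacity buffer that records how many rows are valid, shuffles them in place, yields full batches, and moves the remainder to the front before the next read. `ShuffleBuffer` and its methods are hypothetical names for illustration. A capacity of (largest chunk rows + batch size - 1) should suffice under these assumptions, since the leftover is always smaller than one batch.

```python
import numpy as np


class ShuffleBuffer:
    """Hypothetical fixed-capacity concat buffer for dense rows."""

    def __init__(self, capacity: int, n_cols: int, dtype=np.float32):
        self._buf = np.empty((capacity, n_cols), dtype=dtype)  # allocated once
        self._filled = 0  # rows [0, _filled) hold valid data

    def __len__(self) -> int:
        return self._filled

    def append(self, chunk: np.ndarray) -> None:
        end = self._filled + len(chunk)
        if end > len(self._buf):
            raise ValueError("chunk exceeds the preallocated upper bound")
        self._buf[self._filled:end] = chunk  # copy into the existing buffer
        self._filled = end

    def drain(self, batch_size: int, rng: np.random.Generator):
        """Shuffle valid rows in place, yield full batches, keep the leftover."""
        rng.shuffle(self._buf[: self._filled])  # shuffles rows of the view
        n_full = (self._filled // batch_size) * batch_size
        for start in range(0, n_full, batch_size):
            # Views into the buffer: consumers must copy or finish with a
            # batch before the generator is exhausted and the buffer reused.
            yield self._buf[start:start + batch_size]
        leftover = self._filled - n_full
        self._buf[:leftover] = self._buf[n_full:self._filled]  # move to front
        self._filled = leftover


rng = np.random.default_rng(0)
buf = ShuffleBuffer(capacity=10, n_cols=2)
buf.append(np.arange(14).reshape(7, 2))           # first read: 7 rows
batches = [b.copy() for b in buf.drain(3, rng)]   # 2 full batches, 1 row left
buf.append(np.arange(14, 20).reshape(3, 2))       # next read joins the leftover
```

The sparse case would need the same bookkeeping for each of `data`, `indices`, and `indptr`, with capacities taken from an `indptr`-derived bound rather than a row count.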

The benefit here would of course be not having to spend time allocating memory. Is this actually a bottleneck though? Maybe, maybe not.
