behavior of is_total_slice for boundary chunks

The is_total_slice function is used to determine whether a selection corresponds to an entire chunk (is_total_slice -> True) or not (is_total_slice -> False).

I noticed that is_total_slice is always False for "partial chunks", i.e. chunks that are not filled by the array:

import zarr
from zarr.indexing import BasicIndexer
from zarr.util import is_total_slice
# create an array with 1 full chunk and 1 partial chunk
a = zarr.open('test.zarr', path='test', shape=(10,), chunks=(9,), dtype='uint8', mode='w')
# iterate over the chunks touched by selecting the whole array
for x in BasicIndexer(slice(None), a):
    print(x.chunk_selection, is_total_slice(x.chunk_selection, a._chunks))

Which prints this:

(slice(0, 9, 1),) True
(slice(0, 1, 1),) False

Although the last selection is not the size of a full chunk, it is "total" with respect to the part of that chunk that actually lies inside the array.
A direct consequence of this behavior is unnecessary chunk loading when performing array assignments -- zarr uses the result of is_total_slice to decide whether to load existing chunk data or not. Because is_total_slice is always False for partial chunks, zarr always loads boundary chunks during assignment.

A solution to this would be to augment the is_total_slice function to account for partial chunks. I'm not sure at the moment how to do this exactly, but it shouldn't be hard. Happy to bring forth a PR if people agree that this is an issue worth addressing.
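To make the proposal concrete, here is a minimal sketch of what a boundary-aware check might look like. It is not zarr's implementation: the names chunk_extent and is_total_slice_for_chunk are hypothetical, and the sketch assumes the caller can supply the chunk's grid coordinates and the array shape, which the current is_total_slice(item, shape) signature does not receive.

```python
def chunk_extent(chunk_coords, chunk_shape, array_shape):
    """Number of elements of a chunk that lie inside the array, per dimension.

    Hypothetical helper: interior chunks give the full chunk length,
    boundary chunks give whatever remains of the array.
    """
    return tuple(
        min(c, s - i * c)
        for i, c, s in zip(chunk_coords, chunk_shape, array_shape)
    )


def is_total_slice_for_chunk(item, chunk_coords, chunk_shape, array_shape):
    """Return True if `item` covers every in-bounds element of the chunk.

    Hypothetical variant of is_total_slice that compares the selection
    against the in-bounds extent of the chunk rather than the nominal
    chunk shape.
    """
    if item is Ellipsis:
        return True
    if isinstance(item, slice):
        item = (item,)
    if not isinstance(item, tuple):
        raise TypeError(f"expected slice or tuple of slices, found {item!r}")

    extent = chunk_extent(chunk_coords, chunk_shape, array_shape)
    if len(item) != len(extent):
        return False

    for sl, n in zip(item, extent):
        if not isinstance(sl, slice):
            return False
        # Normalize against the in-bounds length of this dimension.
        start, stop, step = sl.indices(n)
        if (start, stop, step) != (0, n, 1):
            return False
    return True


# The example from this issue: shape=(10,), chunks=(9,)
print(is_total_slice_for_chunk((slice(0, 9, 1),), (0,), (9,), (10,)))
print(is_total_slice_for_chunk((slice(0, 1, 1),), (1,), (9,), (10,)))
```

With this check both selections from the example above count as total, so the boundary chunk would not need to be loaded before being overwritten. A selection such as slice(0, 1, 1) on the first (full) chunk is still reported as partial.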
