behavior of `is_total_slice` for boundary chunks · Issue #757 · zarr-developers/zarr-python · GitHub
behavior of is_total_slice for boundary chunks #757
Open
@d-v-b

Description

The is_total_slice function is used to determine whether a selection corresponds to an entire chunk (is_total_slice -> True) or not (is_total_slice -> False).

I noticed that is_total_slice is always False for "partial chunks", i.e. chunks that are not filled by the array:

import zarr
from zarr.indexing import BasicIndexer
from zarr.util import is_total_slice

# create an array with 1 full chunk and 1 partial chunk
a = zarr.open('test.zarr', path='test', shape=(10,), chunks=(9,), dtype='uint8', mode='w')

# iterate over the per-chunk selections produced by selecting the whole array
for x in BasicIndexer(slice(None), a):
    print(x.chunk_selection, is_total_slice(x.chunk_selection, a._chunks))

Which prints this:

(slice(0, 9, 1),) True
(slice(0, 1, 1),) False

Although the last selection is not the size of a full chunk, it is "total" with respect to the output of that selection in the array.
A direct consequence of this behavior is unnecessary chunk loading when performing array assignments -- zarr uses the result of is_total_slice to decide whether to load existing chunk data or not. Because is_total_slice is always False for partial chunks, zarr always loads boundary chunks during assignment.
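To make the cost concrete, here is a toy model of the decision described above, written without zarr so it can be run in isolation. The names `nominal_chunk_is_total` and `plan_writes` are hypothetical stand-ins, not zarr functions, and the check is a simplification of what `is_total_slice` is described as doing in this issue.

```python
def nominal_chunk_is_total(selection, chunk_shape):
    # Hypothetical stand-in: a selection counts as "total" only if every
    # slice spans the full nominal chunk length, ignoring the array bounds.
    return all(
        s.start == 0 and s.stop == c and s.step == 1
        for s, c in zip(selection, chunk_shape)
    )


def plan_writes(array_len, chunk_len):
    # For a whole-array assignment to a 1-D array, list each chunk index,
    # its chunk selection, and whether existing data would be read first.
    n_chunks = -(-array_len // chunk_len)  # ceiling division
    plans = []
    for i in range(n_chunks):
        in_bounds = min(chunk_len, array_len - i * chunk_len)
        selection = (slice(0, in_bounds, 1),)
        needs_read = not nominal_chunk_is_total(selection, (chunk_len,))
        plans.append((i, selection, needs_read))
    return plans


plans = plan_writes(array_len=10, chunk_len=9)
for chunk_index, selection, needs_read in plans:
    print(chunk_index, selection, "read first" if needs_read else "overwrite")
```

In this model the second chunk is flagged for a read even though the assignment overwrites every in-bounds element it holds.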

A solution to this would be to augment the is_total_slice function to account for partial chunks. I'm not sure at the moment how to do this exactly, but it shouldn't be hard. Happy to bring forth a PR if people agree that this is an issue worth addressing.
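One possible shape for such a check, as a rough sketch only: compare each slice against the in-bounds extent of the chunk rather than its nominal shape. The function names and the extra `array_shape` and `chunk_coords` parameters are assumptions made for illustration; the current `is_total_slice` takes only a selection and a shape.

```python
def effective_chunk_shape(array_shape, chunk_shape, chunk_coords):
    # Number of in-bounds elements along each dimension of the given chunk.
    return tuple(
        min(c, s - i * c)
        for s, c, i in zip(array_shape, chunk_shape, chunk_coords)
    )


def covers_whole_chunk(selection, array_shape, chunk_shape, chunk_coords):
    # Hypothetical variant of is_total_slice that is aware of partial chunks:
    # True when the selection covers every in-bounds element of the chunk.
    extent = effective_chunk_shape(array_shape, chunk_shape, chunk_coords)
    if len(selection) != len(extent):
        return False
    for sel, n in zip(selection, extent):
        if not isinstance(sel, slice):
            return False
        # slice.indices normalizes None and clamps the bounds to [0, n]
        start, stop, step = sel.indices(n)
        if (start, stop, step) != (0, n, 1):
            return False
    return True


# The two chunk selections from the example above (shape=(10,), chunks=(9,))
print(covers_whole_chunk((slice(0, 9, 1),), (10,), (9,), (0,)))
print(covers_whole_chunk((slice(0, 1, 1),), (10,), (9,), (1,)))
```

With a check along these lines, both selections from the example would be treated as total, so the boundary chunk would no longer need to be loaded before a full assignment.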

Labels: V2 (affects the v2 branch), bug (potential issues with the zarr-python library)