ValueError raised when opening a 'plain' zarr with xarray+zarr 3 #9970
@oliverwm1

What happened?

This report is specific to zarr 3. When xarray opens a store that was written directly by zarr-python (i.e. one that lacks the dimension-name metadata xarray requires), a ValueError is raised with a cryptic error message. With prior versions of zarr-python, a KeyError was raised and the error message was much more informative.
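
For readers unfamiliar with the metadata in question, here is a minimal sketch of where the dimension names normally live. It uses plain dictionaries rather than real stores; the names `x` and `y` and the helper `has_dimension_names` are hypothetical and only illustrate the idea.

```python
# Zarr format 2: xarray stores dimension names as an array attribute.
v2_attrs_with_dims = {"_ARRAY_DIMENSIONS": ["x", "y"]}

# Zarr format 3: dimension names are a field of the array metadata (zarr.json).
v3_metadata_with_dims = {"shape": [3, 5], "dimension_names": ["x", "y"]}

# A "plain" array written without dimension names carries neither.
plain_attrs = {}
plain_metadata = {"shape": [3, 5]}


def has_dimension_names(attrs, metadata):
    """Hypothetical check: True if either convention provides dimension names."""
    if "_ARRAY_DIMENSIONS" in attrs:
        return True
    return metadata.get("dimension_names") is not None


print(has_dimension_names(v2_attrs_with_dims, {}))        # True
print(has_dimension_names({}, v3_metadata_with_dims))     # True
print(has_dimension_names(plain_attrs, plain_metadata))   # False
```

The store in the example below corresponds to the last case.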

What did you expect to happen?

Raise a KeyError and give a more useful error message.

Minimal Complete Verifiable Example

import xarray
import zarr
import numpy

path = 'foo.zarr'

z = zarr.open_group(path)
arr = z.create_array('bar', shape=(3,5), dtype=numpy.float32)
arr[:] = 1.0

xarray.open_zarr(path)

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.
  • Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

Traceback (most recent call last):
  File "/Users/oliverwm/xarray-opening-zarr.py", line 11, in <module>
    xarray.open_zarr(path)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/test-xr-zr/lib/python3.11/site-packages/xarray/backends/zarr.py", line 1491, in open_zarr
    ds = open_dataset(
         ^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/test-xr-zr/lib/python3.11/site-packages/xarray/backends/api.py", line 679, in open_dataset
    backend_ds = backend.open_dataset(
                 ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/test-xr-zr/lib/python3.11/site-packages/xarray/backends/zarr.py", line 1581, in open_dataset
    ds = store_entrypoint.open_dataset(
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/test-xr-zr/lib/python3.11/site-packages/xarray/backends/store.py", line 44, in open_dataset
    vars, attrs = filename_or_obj.load()
                  ^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/test-xr-zr/lib/python3.11/site-packages/xarray/backends/common.py", line 312, in load
    (_decode_variable_name(k), v) for k, v in self.get_variables().items()
                                              ^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/test-xr-zr/lib/python3.11/site-packages/xarray/backends/zarr.py", line 858, in get_variables
    return FrozenDict((k, self.open_store_variable(k)) for k in self.array_keys())
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/test-xr-zr/lib/python3.11/site-packages/xarray/core/utils.py", line 415, in FrozenDict
    return Frozen(dict(*args, **kwargs))
                  ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/test-xr-zr/lib/python3.11/site-packages/xarray/backends/zarr.py", line 858, in <genexpr>
    return FrozenDict((k, self.open_store_variable(k)) for k in self.array_keys())
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/test-xr-zr/lib/python3.11/site-packages/xarray/backends/zarr.py", line 824, in open_store_variable
    "preferred_chunks": dict(zip(dimensions, zarr_array.chunks, strict=True)),
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: zip() argument 2 is longer than argument 1

Anything else we need to know?

With zarr==2.18.4, a KeyError is raised instead and the error message is more useful:

Traceback (most recent call last):
  File "/opt/homebrew/Caskroom/miniconda/base/envs/test-xr-zr/lib/python3.11/site-packages/xarray/backends/zarr.py", line 390, in _get_zarr_dims_and_attrs
    dimensions = zarr_obj.attrs[dimension_key]
                 ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/test-xr-zr/lib/python3.11/site-packages/zarr/attrs.py", line 74, in __getitem__
    return self.asdict()[item]
           ~~~~~~~~~~~~~^^^^^^
KeyError: '_ARRAY_DIMENSIONS'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/homebrew/Caskroom/miniconda/base/envs/test-xr-zr/lib/python3.11/site-packages/xarray/backends/zarr.py", line 404, in _get_zarr_dims_and_attrs
    os.path.basename(dim) for dim in zarray["_NCZARR_ARRAY"]["dimrefs"]
                                     ~~~~~~^^^^^^^^^^^^^^^^^
KeyError: '_NCZARR_ARRAY'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/oliverwm/xarray-opening-zarr.py", line 11, in <module>
    xarray.open_zarr(path)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/test-xr-zr/lib/python3.11/site-packages/xarray/backends/zarr.py", line 1491, in open_zarr
    ds = open_dataset(
         ^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/test-xr-zr/lib/python3.11/site-packages/xarray/backends/api.py", line 679, in open_dataset
    backend_ds = backend.open_dataset(
                 ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/test-xr-zr/lib/python3.11/site-packages/xarray/backends/zarr.py", line 1581, in open_dataset
    ds = store_entrypoint.open_dataset(
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/test-xr-zr/lib/python3.11/site-packages/xarray/backends/store.py", line 44, in open_dataset
    vars, attrs = filename_or_obj.load()
                  ^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/test-xr-zr/lib/python3.11/site-packages/xarray/backends/common.py", line 312, in load
    (_decode_variable_name(k), v) for k, v in self.get_variables().items()
                                              ^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/test-xr-zr/lib/python3.11/site-packages/xarray/backends/zarr.py", line 858, in get_variables
    return FrozenDict((k, self.open_store_variable(k)) for k in self.array_keys())
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/test-xr-zr/lib/python3.11/site-packages/xarray/core/utils.py", line 415, in FrozenDict
    return Frozen(dict(*args, **kwargs))
                  ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/test-xr-zr/lib/python3.11/site-packages/xarray/backends/zarr.py", line 858, in <genexpr>
    return FrozenDict((k, self.open_store_variable(k)) for k in self.array_keys())
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/test-xr-zr/lib/python3.11/site-packages/xarray/backends/zarr.py", line 817, in open_store_variable
    dimensions, attributes = _get_zarr_dims_and_attrs(
                             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/test-xr-zr/lib/python3.11/site-packages/xarray/backends/zarr.py", line 407, in _get_zarr_dims_and_attrs
    raise KeyError(
KeyError: 'Zarr object is missing the attribute `_ARRAY_DIMENSIONS` and the NCZarr metadata, which are required for xarray to determine variable dimensions.'
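
The chained traceback above suggests a two-step lookup with an explicit final error. The following is a rough sketch of that pattern using plain dictionaries; `get_dims` and its arguments are hypothetical stand-ins, not xarray's actual implementation.

```python
import os.path

DIMENSION_KEY = "_ARRAY_DIMENSIONS"


def get_dims(attrs, zarray):
    """Hypothetical lookup: xarray attribute first, then NCZarr metadata."""
    try:
        return tuple(attrs[DIMENSION_KEY])
    except KeyError:
        try:
            # NCZarr stores dimension references as paths; keep the last part.
            return tuple(
                os.path.basename(dim) for dim in zarray["_NCZARR_ARRAY"]["dimrefs"]
            )
        except KeyError as e:
            # Re-raise with a message that names the missing metadata.
            raise KeyError(
                f"Zarr object is missing the attribute `{DIMENSION_KEY}` and the "
                "NCZarr metadata, which are required for xarray to determine "
                "variable dimensions."
            ) from e


print(get_dims({"_ARRAY_DIMENSIONS": ["x", "y"]}, {}))
print(get_dims({}, {"_NCZARR_ARRAY": {"dimrefs": ["/x", "/y"]}}))
```

Calling `get_dims({}, {})` raises the descriptive KeyError, which is the behaviour this report asks to have restored under zarr 3.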

Environment

INSTALLED VERSIONS

commit: None
python: 3.11.11 (main, Dec 11 2024, 10:25:04) [Clang 14.0.6 ]
python-bits: 64
OS: Darwin
OS-release: 23.4.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None

xarray: 2025.1.1
pandas: 2.2.3
numpy: 2.2.2
scipy: None
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
zarr: 3.0.1
cftime: None
nc_time_axis: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 75.1.0
pip: 24.2
conda: None
pytest: None
mypy: None
IPython: None
sphinx: None


Labels: bug, topic-zarr (Related to zarr storage library)
