[ZipStore] Unable to read multiple Zarr ZipStore files · Issue #2752 · zarr-developers/zarr-python · GitHub
[ZipStore] Unable to read multiple Zarr ZipStore files #2752
Closed
@zhiweigan

Zarr version

v3.0.1

Numcodecs version

v0.15.0

Python Version

3.12.8

Operating System

Linux

Installation

using pip into fresh virtual environment

Description

I was trying to migrate an internal tool from Zarr 2 to Zarr 3, but ran into an issue when reading from different ZipStore files in a multiprocessing context. When several files are read through a ProcessPoolExecutor, the program stalls (possibly a deadlock) and never returns. The same code does not stall with a ThreadPoolExecutor.

Adapted to Zarr 2 syntax, the same code succeeds without issue.

Steps to reproduce

  1. Activate venv
  2. Install zarr
  3. Run the following script:
from zarr.storage import ZipStore
import zarr

for i in range(3):
    with ZipStore(f"test{i}.zip", mode="w") as store:
        zarr.create_array(store, shape=(2,), dtype="float64")
print("Written Stores")

from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

print("Opening Stores")
with ProcessPoolExecutor() as executor:
    futures = [
        executor.submit(zarr.open_array, ZipStore(f"test{i}.zip", mode="r"), mode="r")
        for i in range(3)
    ]
    datasets = [future.result() for future in futures]
print("Opened Stores") # Prints with ThreadPoolExecutor but not ProcessPoolExecutor
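
To narrow down a stall like the one above, one variable worth isolating is the process start method (`fork` is the default on Linux for Python 3.12). The following is a minimal diagnostic sketch, not a confirmed fix: the helper name `run_in_processes` is hypothetical, it uses `multiprocessing.Pool` rather than `ProcessPoolExecutor`, and the idea that the start method matters here is an assumption that this report does not confirm. The timeout turns a silent hang into a `multiprocessing.TimeoutError`.

```python
import multiprocessing as mp


def run_in_processes(fn, args_list, start_method="spawn", timeout=30.0):
    """Run fn(*args) for every args tuple in a pool of worker processes.

    Hypothetical diagnostic helper (not part of zarr): it selects the start
    method explicitly and bounds the wait for each result, so a stalled
    worker raises multiprocessing.TimeoutError instead of blocking forever.
    fn, its arguments, and its return values must all be picklable.
    """
    ctx = mp.get_context(start_method)
    # Leaving the with-block calls pool.terminate(), so a stuck worker
    # cannot keep the parent process from exiting.
    with ctx.Pool(processes=max(1, len(args_list))) as pool:
        pending = [pool.apply_async(fn, args) for args in args_list]
        return [p.get(timeout=timeout) for p in pending]
```

For the script in this report, `fn` could be a module-level function that builds `ZipStore(path, mode="r")` and calls `zarr.open_array` inside the worker, run once with `start_method="fork"` and once with `start_method="spawn"` to compare behavior. Under `spawn`, the calling code needs to sit behind an `if __name__ == "__main__":` guard.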

Additional output

No response

Labels: bug, help wanted
