Description
Zarr version
v3.0.1
Numcodecs version
v0.15.0
Python Version
3.12.8
Operating System
Linux
Installation
using pip into fresh virtual environment
Description
I was trying to migrate an internal tool from Zarr 2 to Zarr 3, but ran into an issue when reading from different ZipStore files in a multiprocessing context. When several files are read using a ProcessPoolExecutor, the program stalls (possibly a deadlock). The same code does not stall when using a ThreadPoolExecutor.
When adapted to Zarr 2 syntax, the same code succeeds without issue.
Steps to reproduce
- Activate venv
- Install zarr
from zarr.storage import ZipStore
import zarr

for i in range(3):
    with ZipStore(f"test{i}.zip", mode="w") as store:
        zarr.create_array(store, shape=(2,), dtype="float64")
print("Written Stores")

from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

print("Opening Stores")
with ProcessPoolExecutor() as executor:
    futures = [
        executor.submit(zarr.open_array, ZipStore(f"test{i}.zip", mode="r"), mode="r")
        for i in range(3)
    ]
    datasets = [future.result() for future in futures]
print("Opened Stores")  # Prints with ThreadPoolExecutor but not ProcessPoolExecutor
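To see which reads stall instead of waiting indefinitely, a timeout can be put on each future. The following is only a minimal, standard-library sketch of that idea: `collect_with_timeout` and `slow_task` are hypothetical names that are not part of Zarr, and the demo uses a ThreadPoolExecutor with a sleeping task as a stand-in for the ZipStore read.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeoutError


def collect_with_timeout(futures, timeout_s):
    """Wait at most timeout_s seconds per future.

    Returns (results, stalled): results maps the future's index to its
    value, stalled lists the indices whose result did not arrive in time.
    Exceptions raised inside a worker are re-raised by future.result().
    """
    results, stalled = {}, []
    for idx, future in enumerate(futures):
        try:
            results[idx] = future.result(timeout=timeout_s)
        except FutureTimeoutError:
            stalled.append(idx)
    return results, stalled


def slow_task(seconds):
    # Hypothetical stand-in for a read that may or may not return promptly.
    time.sleep(seconds)
    return seconds


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=2) as executor:
        futures = [executor.submit(slow_task, s) for s in (0.0, 1.0)]
        results, stalled = collect_with_timeout(futures, timeout_s=0.2)
    print("finished:", results, "stalled:", stalled)
```

In the reproduction above, this would replace the `datasets = [future.result() for future in futures]` line. Note that leaving the `with` block still waits for the workers to finish, so with a truly hung worker process the script may still not exit on its own; the timeout only reports which futures did not complete.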
Additional output
No response