Labels
bug (Potential issues with the zarr-python library), help wanted (Issue could use help from someone with familiarity on the topic)
Description
Zarr version
v3.0.1
Numcodecs version
v0.15.0
Python Version
3.12.8
Operating System
Linux
Installation
using pip into fresh virtual environment
Description
I was trying to migrate an internal tool from Zarr 2 to Zarr 3, but ran into an issue when reading from different ZipStore files in a multiprocessing context. When several files are read using a ProcessPoolExecutor, the program stalls (possibly a deadlock). The same code does not stall when a ThreadPoolExecutor is used instead.
When adapted to Zarr 2 syntax, the same code succeeds with no issue.
Steps to reproduce
- Activate venv
- Install zarr
from zarr.storage import ZipStore
import zarr

for i in range(3):
    with ZipStore(f"test{i}.zip", mode="w") as store:
        zarr.create_array(store, shape=(2,), dtype="float64")
print("Written Stores")

from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

print("Opening Stores")
with ProcessPoolExecutor() as executor:
    futures = [
        executor.submit(zarr.open_array, ZipStore(f"test{i}.zip", mode="r"), mode="r")
        for i in range(3)
    ]
    datasets = [future.result() for future in futures]
print("Opened Stores")  # Prints with ThreadPoolExecutor but not ProcessPoolExecutor
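To make the stall observable instead of hanging indefinitely, the futures can be collected with a timeout. The following is a minimal sketch using only the standard library; `collect_with_timeout` and `demo` are hypothetical helper names introduced here for illustration, and the demo uses a ThreadPoolExecutor with a deliberately slow task rather than Zarr so that it runs anywhere. It is a diagnostic aid under those assumptions, not a fix for the underlying problem.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def collect_with_timeout(futures, timeout):
    """Wait up to `timeout` seconds for each future in turn.

    Returns (results, stalled): `results` maps the index of each finished
    future to its value, and `stalled` lists the indices that timed out.
    """
    results, stalled = {}, []
    for index, future in enumerate(futures):
        try:
            results[index] = future.result(timeout=timeout)
        except FutureTimeout:
            # The task is still running (or stuck); record it and move on.
            stalled.append(index)
    return results, stalled


def demo():
    # Hypothetical stand-ins for the real tasks: one finishes at once,
    # the other sleeps longer than the timeout to imitate a stall.
    with ThreadPoolExecutor(max_workers=2) as executor:
        futures = [
            executor.submit(lambda: "fast"),
            executor.submit(lambda: time.sleep(0.5) or "slow"),
        ]
        return collect_with_timeout(futures, timeout=0.1)


if __name__ == "__main__":
    print(demo())
```

In the reproduction above, the same helper could replace the `future.result()` list comprehension. Note that with a genuinely deadlocked worker, leaving the `with` block can still block, because the executor waits for its workers on shutdown.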
Additional output
No response
PierandreaCancian