reedit parallel section · rjgildea/zarr-python@2ec84e4 · GitHub

Commit 2ec84e4

committed
reedit parallel section
1 parent 323b7dc commit 2ec84e4

File tree

1 file changed

+23
-28
lines changed

docs/tutorial.rst

Lines changed: 23 additions & 28 deletions
Original file line numberDiff line numberDiff line change
@@ -927,34 +927,24 @@ filters (e.g., byte-shuffle) have been applied.
927927
Parallel computing and synchronization
928928
--------------------------------------
929929

930-
Zarr arrays have been designed for use as the source and/or sink for data in
931-
parallel computations. Please note that this is an area of ongoing research and
932-
development. If you are using Zarr for parallel computing, we welcome feedback,
933-
experience, discussion, ideas and advice, particularly about issues such as data
934-
integrity and performance.
935-
936-
Both multi-threaded and multi-process parallelism are possible, although Zarr
937-
can use a number of different storage systems (see :ref:`tutorial_storage`) and
938-
not all storage systems support both types of parallelism. Please see the API
939-
docs for the :mod:`zarr.storage` module for more information about which storage
940-
classes support parellel computing.
941-
942-
The bottleneck for most storage and retrieval operations is
943-
compression/decompression, and the Python global interpreter lock (GIL) is
944-
released wherever possible during these operations, so Zarr will generally not
945-
block other Python threads from running.
946-
947-
Depending on how data are being accessed or updated, some synchronization
948-
(locking) may be required to avoid data loss. If an array is being read
949-
concurrently by multiple threads or processes, no synchronization is
950-
required. If an array is being written to concurrently by multiple threads or
951-
processes, some synchronization may be required, depending on the way the data
952-
is being written.
953-
954-
If each worker in a parallel computation is writing to a separate region of the
955-
array, and if region boundaries are perfectly aligned with chunk boundaries,
956-
then no synchronization is required. However, if region and chunk boundaries are
957-
not perfectly aligned, then synchronization is required to avoid two workers
930+
Zarr arrays have been designed for use as the source or sink for data in
931+
parallel computations. By data source we mean that multiple concurrent read
932+
operations may occur. By data sink we mean that multiple concurrent write
933+
operations may occur, with each writer updating a different region of the
934+
array. Zarr arrays have **not** been designed for situations where multiple
935+
readers and writers are concurrently operating on the same array.
936+
937+
Both multi-threaded and multi-process parallelism are possible. The bottleneck
938+
for most storage and retrieval operations is compression/decompression, and the
939+
Python global interpreter lock (GIL) is released wherever possible during these
940+
operations, so Zarr will generally not block other Python threads from running.
941+
942+
When using a Zarr array as a data sink, some synchronization (locking) may be
943+
required to avoid data loss, depending on how data are being updated. If each
944+
worker in a parallel computation is writing to a separate region of the array,
945+
and if region boundaries are perfectly aligned with chunk boundaries, then no
946+
synchronization is required. However, if region and chunk boundaries are not
947+
perfectly aligned, then synchronization is required to avoid two workers
958948
attempting to modify the same chunk at the same time, which could result in data
959949
loss.
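
The alignment rule in the paragraph above can be illustrated with a minimal, one-dimensional sketch. It does not use the Zarr API: the helper names (``chunks_touched``, ``is_chunk_aligned``, ``needs_synchronization``) and the example sizes are hypothetical and chosen only for illustration. It assumes each worker writes a half-open region ``[start, stop)`` and reports whether any two regions share a chunk, which is the case where synchronization would be needed.

```python
from itertools import combinations


def chunks_touched(start, stop, chunk_len):
    """Return the indices of the chunks overlapped by the region [start, stop)."""
    if stop <= start:
        return range(0)
    return range(start // chunk_len, (stop - 1) // chunk_len + 1)


def is_chunk_aligned(start, stop, chunk_len, array_len):
    """True if the region starts on a chunk boundary and ends on one
    (or at the end of the array, where the last chunk may be partial)."""
    return start % chunk_len == 0 and (stop % chunk_len == 0 or stop == array_len)


def needs_synchronization(regions, chunk_len):
    """True if any two regions share a chunk, so two writers could
    modify that chunk at the same time."""
    for (a_start, a_stop), (b_start, b_stop) in combinations(regions, 2):
        a = set(chunks_touched(a_start, a_stop, chunk_len))
        b = set(chunks_touched(b_start, b_stop, chunk_len))
        if a & b:
            return True
    return False


# Hypothetical layout: an array of length 250 stored in chunks of length 100.
aligned = [(0, 100), (100, 200), (200, 250)]   # one region per chunk
misaligned = [(0, 150), (150, 250)]            # both regions touch chunk 1

print(needs_synchronization(aligned, 100))
print(needs_synchronization(misaligned, 100))
```

In the misaligned layout the regions themselves do not overlap, yet both workers would need to rewrite chunk 1, which is why disjoint regions alone are not enough to make concurrent writes safe.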
960950

@@ -991,6 +981,11 @@ some networked file systems). E.g.::
991981

992982
This array is safe to read or write from multiple processes.
993983

984+
Please note that support for parallel computing is an area of ongoing research
985+
and development. If you are using Zarr for parallel computing, we welcome
986+
feedback, experience, discussion, ideas and advice, particularly about issues
987+
related to data integrity and performance.
988+
994989
.. _tutorial_pickle:
995990

996991
Pickle support
