Draft: gh-123471: Make concurrent iteration over itertools.pairwise safe under free-threading #125417
+72
−9
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
We make concurrent iteration over
itertools.pairwise
safe by:_PyObject_IsUniquelyReferenced
for re-use of the result typepo->it
instead of a borrowed referencepo->old
early.Also see #123848
To make
pairwise_next
safe we need to prevent concurrent iterators from accessing the clearedpo->it
pairwise_next
replacePyObject *it = po->it; with PyObject *it = Py_XInRef(po->it);
.Then every thread will have a true reference to the iterator, and not a borrowed refe 8000 rence.
Simple, but overhead is an incref/decref in the common case
po->it
. We then need another way to signal that the iterator has been exhausted.Use another variable "int iterator_exhausted" in the pairwise object. Requires a bit more memory
for each pairwise object, but the check is a simple int comparison.
po->it
. We then need another way to signal that the iterator has been exhausted.Use
po->old
to set a sentinel. Requires a global sentinel in the module (or a sentinel per thread).In the top of
pairwise_next
we would haveAn issue with this approach is that at the end of pairwise_next there is the line:
Py_XSETREF(po->old, new);
So one thread can set the sentinel, but in another thread the value could be overwritten.
Maybe this is ok: this will not cause crashes, and we do not have to guarantee correct behaviour for
the free-threading build.
This approach does not require additional increfs/decrefs, but it does require setting up the sentinel.
In this PR we work out the first option.