GH-134453: Fix subprocess memoryview input handling on POSIX by gpshead · Pull Request #134949 · python/cpython · GitHub
Open · wants to merge 4 commits into base: main
GH-134453: Fix subprocess memoryview input handling on POSIX
Fix inconsistent subprocess.Popen.communicate() behavior between Windows
and POSIX when using memoryview objects with non-byte elements as input.

On POSIX systems, the code was incorrectly comparing bytes written against
element count instead of byte count, causing data truncation for large
inputs with non-byte element types.

Changes:
- Cast memoryview inputs to byte view when input is already a memoryview
- Fix progress tracking to use len(input_view) instead of len(self._input)
- Add comprehensive test coverage for memoryview inputs

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

pre-commit-whitespace-fixup
gpshead committed May 30, 2025
commit c780ce054de3e619a022f927abf22d5dba69d74f
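
The element-count versus byte-count distinction described in the commit message can be seen directly on a memoryview. The following is a minimal sketch, not code from the PR; the array size of 1000 is an arbitrary choice for illustration.

```python
import array

# An array of native signed ints; itemsize is platform dependent (commonly 4).
ints = array.array("i", range(1000))

element_view = memoryview(ints)       # len() counts elements
byte_view = element_view.cast("b")    # len() counts bytes, as in the patch

print("elements:", len(element_view))
print("bytes (nbytes):", element_view.nbytes)
print("bytes (len of byte view):", len(byte_view))
```

Because `os.write()` reports progress in bytes, comparing that progress against `len(element_view)` mixes two different units, which is the mismatch the commit message describes.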
7 changes: 5 additions & 2 deletions Lib/subprocess.py
@@ -2103,7 +2103,10 @@ def _communicate(self, input, endtime, orig_timeout):
             self._save_input(input)

             if self._input:
-                input_view = memoryview(self._input)
+                if not isinstance(self._input, memoryview):
+                    input_view = memoryview(self._input)
+                else:
+                    input_view = self._input.cast("b")  # byte input required

             with _PopenSelector() as selector:
                 if self.stdin and input:
@@ -2139,7 +2142,7 @@ def _communicate(self, input, endtime, orig_timeout):
                                 selector.unregister(key.fileobj)
                                 key.fileobj.close()
                             else:
-                                if self._input_offset >= len(self._input):
+                                if self._input_offset >= len(input_view):
                                     selector.unregister(key.fileobj)
                                     key.fileobj.close()
                     elif key.fileobj in (self.stdout, self.stderr):
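
To make the effect of the two hunks concrete, here is a simplified model of the stdin write loop before and after the change. It is a sketch under stated assumptions, not the code in `Lib/subprocess.py`: the helper names `unfixed_send` and `fixed_send` are hypothetical, every simulated write is assumed to accept the whole chunk, and `PIPE_BUF = 512` is simply the fallback value used in the test.

```python
import array

PIPE_BUF = 512  # assumption: fallback value used in the test, not a real limit


def _send(view, total, pipe_buf):
    """Model the write loop; each 'write' is assumed to accept the full chunk."""
    sent = bytearray()
    offset = 0
    while offset < total:
        chunk = view[offset:offset + pipe_buf]
        sent += chunk.tobytes()   # stands in for os.write()
        offset += chunk.nbytes    # os.write() reports bytes, not elements
    return bytes(sent)


def unfixed_send(data, pipe_buf=PIPE_BUF):
    """Slice by element and stop once offset reaches the element count."""
    view = memoryview(data)
    return _send(view, len(view), pipe_buf)


def fixed_send(data, pipe_buf=PIPE_BUF):
    """Cast to a byte view so slicing, offset, and limit all use bytes."""
    view = memoryview(data).cast("b")
    return _send(view, len(view), pipe_buf)


small = array.array("i", range((PIPE_BUF // 4) + 100))
large = array.array("i", [1] * (PIPE_BUF + 1))

print("small intact without fix:", unfixed_send(small) == small.tobytes())
print("large intact without fix:", unfixed_send(large) == large.tobytes())
print("large intact with fix:", fixed_send(large) == large.tobytes())
```

In this model the smaller input fits in a single element-sized slice, so the unit mismatch never shows; only an input with more than `PIPE_BUF` elements loses data.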
41 changes: 41 additions & 0 deletions Lib/test/test_subprocess.py
@@ -957,6 +957,47 @@ def test_communicate(self):
         self.assertEqual(stdout, b"banana")
         self.assertEqual(stderr, b"pineapple")

+    def test_communicate_memoryview_input(self):
+        # Test memoryview input with byte elements
+        test_data = b"Hello, memoryview!"
+        mv = memoryview(test_data)
+        p = subprocess.Popen([sys.executable, "-c",
+                              'import sys; sys.stdout.write(sys.stdin.read())'],
+                             stdin=subprocess.PIPE,
+                             stdout=subprocess.PIPE)
+        self.addCleanup(p.stdout.close)
+        self.addCleanup(p.stdin.close)
+        (stdout, stderr) = p.communicate(mv)
+        self.assertEqual(stdout, test_data)
+        self.assertEqual(stderr, None)
Member commented:

Use assertIsNone

Suggested change
-        self.assertEqual(stderr, None)
+        self.assertIsNone(stderr, None)

+    def test_communicate_memoryview_input_nonbyte(self):
+        # Test memoryview input with non-byte elements (e.g., int32)
+        # This tests the fix for gh-134453 where non-byte memoryviews
+        # had incorrect length tracking on POSIX
+        import array
+        # Create an array of 32-bit integers that's large enough to trigger
+        # the chunked writing behavior (> PIPE_BUF)
+        pipe_buf = getattr(select, 'PIPE_BUF', 512)
+        # Each 'i' element is 4 bytes, so we need more than pipe_buf/4 elements
+        # Add some extra to ensure we exceed the buffer size
+        num_elements = (pipe_buf // 4) + 100
+        test_array = array.array('i', range(num_elements))  # 'i' = signed int (4 bytes each)
Member commented on lines +984 to +985:

Hmm, this test passes on the main branch. I think we want something like this:

Suggested change
-        num_elements = (pipe_buf // 4) + 100
-        test_array = array.array('i', range(num_elements))  # 'i' = signed int (4 bytes each)
+        num_elements = pipe_buf + 1
+        test_array = array.array('i', [1 for _ in range(num_elements)])

From what I can tell, the pipe is taking each integer as a single byte.
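
For anyone who wants to try the reviewer's sizing outside the test suite, a standalone script along these lines may help. It is only a sketch: `build_input`, `roundtrip`, and `ECHO` are hypothetical names, and what it prints depends on the interpreter and platform it runs on.

```python
import array
import select
import subprocess
import sys

# Child process that copies stdin to stdout unchanged.
ECHO = "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())"


def build_input(pipe_buf):
    """Return an 'i' array with one more element than pipe_buf."""
    return array.array("i", [1] * (pipe_buf + 1))


def roundtrip(view):
    """Send a buffer through the child process and return what comes back."""
    proc = subprocess.Popen([sys.executable, "-c", ECHO],
                            stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE)
    stdout, _ = proc.communicate(view)
    return stdout


if __name__ == "__main__":
    pipe_buf = getattr(select, "PIPE_BUF", 512)
    data = build_input(pipe_buf)
    echoed = roundtrip(memoryview(data))
    print(len(echoed), "of", len(data.tobytes()), "bytes echoed")
```

If the two numbers printed differ on a POSIX system, the interpreter is presumably showing the truncation discussed in this PR; matching numbers are what the patched code is meant to produce.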

+        expected_bytes = test_array.tobytes()
+        mv = memoryview(test_array)
+
+        p = subprocess.Popen([sys.executable, "-c",
+                              'import sys; '
+                              'data = sys.stdin.buffer.read(); '
+                              'sys.stdout.buffer.write(data)'],
+                             stdin=subprocess.PIPE,
+                             stdout=subprocess.PIPE)
+        self.addCleanup(p.stdout.close)
+        self.addCleanup(p.stdin.close)
+        (stdout, stderr) = p.communicate(mv)
+        self.assertEqual(stdout, expected_bytes)
+        self.assertEqual(stderr, None)
Member commented:

Suggested change
-        self.assertEqual(stderr, None)
+        self.assertIsNone(stderr, None)

     def test_communicate_timeout(self):
         p = subprocess.Popen([sys.executable, "-c",
                               'import sys,os,time;'