[MPSHooks] Release pending command encoder #164093

malfet · 2025-09-29T02:00:48Z

Stack from ghstack (oldest at bottom):

-> [MPSHooks] Release pending command encoder #164093

Release the pending command encoder before returning a command buffer, as subsequent callers are very likely to allocate their own encoder, which otherwise results in the following runtime error:

 tryCoalescingPreviousComputeCommandEncoderWithConfig:nextEncoderClass:]:1090: failed assertion `A command encoder is already encoding to this command buffer'
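
The failure mode can be illustrated with a small toy model in plain C++. This is only a sketch under stated assumptions: `ToyCommandBuffer`, `ToyStream`, and `caller_can_encode` are hypothetical names invented for illustration. They are not PyTorch or Metal APIs, and the real change lives in the MPS hooks. The model encodes only the one rule the assertion enforces: a command buffer accepts a new encoder only after the previous one has ended.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a command buffer: at most one encoder may be open.
class ToyCommandBuffer {
 public:
  // Opens an encoder; throws if a previous encoder was never ended,
  // loosely mirroring the assertion quoted above.
  void begin_encoder() {
    if (encoder_open_) {
      throw std::runtime_error(
          "A command encoder is already encoding to this command buffer");
    }
    encoder_open_ = true;
  }
  // Ending is idempotent: ending with no open encoder is a no-op.
  void end_encoder() { encoder_open_ = false; }
  bool encoder_open() const { return encoder_open_; }

 private:
  bool encoder_open_ = false;
};

// Hypothetical stand-in for a stream that keeps an encoder pending between ops.
class ToyStream {
 public:
  // Internal ops reuse the pending encoder and open one if none exists.
  void run_internal_op() {
    if (!buffer_.encoder_open()) buffer_.begin_encoder();
  }
  // Models the old behaviour: hand out the buffer with the encoder still open.
  ToyCommandBuffer& command_buffer_keep_pending() { return buffer_; }
  // Models the new behaviour: end the pending encoder before handing out.
  ToyCommandBuffer& command_buffer_release_pending() {
    buffer_.end_encoder();
    return buffer_;
  }

 private:
  ToyCommandBuffer buffer_;
};

// Returns true if an external caller can open its own encoder on the buffer.
bool caller_can_encode(bool release_pending) {
  ToyStream stream;
  stream.run_internal_op();  // leaves an encoder pending on the buffer
  ToyCommandBuffer& buf = release_pending
      ? stream.command_buffer_release_pending()
      : stream.command_buffer_keep_pending();
  try {
    buf.begin_encoder();  // what an extension would do with its own kernel
    buf.end_encoder();
    return true;
  } catch (const std::runtime_error&) {
    return false;
  }
}

// Small self-check that can be called from a main() of your own.
void demo() {
  assert(!caller_can_encode(false));  // pending encoder blocks the caller
  assert(caller_can_encode(true));    // released encoder lets the caller in
}
```

In this model, handing out the buffer without ending the pending encoder makes the caller's `begin_encoder()` fail, while ending it first lets the caller proceed.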

Added regression test to test_mps_extension

Please note that `torch::mps::get_command_buffer()` should be called with the `dispatch_queue` held, both before and after this change, but many implementations skip that.

Fixes #163721


pytorch-bot · 2025-09-29T02:00:52Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/164093

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 4331f79 with merge base 6ba83e0:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Before returning a command buffer, as subsequent callers are very likely to allocate their own encoder, which results in the following runtime error

```
tryCoalescingPreviousComputeCommandEncoderWithConfig:nextEncoderClass:]:1090: failed assertion `A command encoder is already encoding to this command buffer'
```

Fixes #163721

ghstack-source-id: b214852
Pull Request resolved: #164093


Skylion007 · 2025-09-29T17:24:58Z

test/cpp_extensions/mps_extension.mm


+void mps_add_one_new_encoder(const at::Tensor& input) {
+  using namespace at::native::mps;
+  TORCH_CHECK(input.is_mps());


Use TORCH_CHECK_VALUE for these two checks

It does not matter, as this is an internal test. I'm unsure why they contain those checks to begin with, as they are tests, not examples.

malfet · 2025-09-29T17:48:23Z

@pytorchbot merge -f "Lint + MPS are green"

pytorchmergebot · 2025-09-29T17:49:54Z

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

Camyll · 2025-10-01T16:44:54Z

@pytorchbot cherry-pick --onto release/2.9 --c critical


pytorchbot · 2025-10-01T16:49:59Z

Cherry picking #164093

The cherry pick PR is at #164365 and it is recommended to link a critical cherry pick PR with an issue. The following tracker issues are updated:

[v.2.9.0] Release Tracker #162497 (comment)

Details for Dev Infra team

Raised by workflow job

[MPSHooks] Release pending command encoder (#164093)

Before returning a command buffer, as subsequent callers are very likely to allocate their own encoder, which results in the following runtime error

```
tryCoalescingPreviousComputeCommandEncoderWithConfig:nextEncoderClass:]:1090: failed assertion `A command encoder is already encoding to this command buffer'
```

Added regression test to `test_mps_extension`

Please note that `torch::mps::get_command_buffer()` should be called with the `dispatch_queue` held, both before and after this change, but many implementations skip that

Fixes #163721

Pull Request resolved: #164093
Approved by: https://github.com/atalman, https://github.com/Skylion007
(cherry picked from commit 8f32adc)
Co-authored-by: Nikita Shulga <nikita.shulga@gmail.com>

malfet requested a review from kulinseth as a code owner September 29, 2025 02:00

pytorch-bot bot added the ciflow/mps and release notes: mps labels Sep 29, 2025

malfet requested review from Skylion007 and dcci September 29, 2025 05:25

malfet added the topic: bug fixes label Sep 29, 2025

malfet added a commit that referenced this pull request Sep 29, 2025

[MPS] Chunk fillBuffer into 4Gb slices

65cb028

To avoid regression on MacOS 26 Fixes #164093 [ghstack-poisoned]

malfet mentioned this pull request Sep 29, 2025

[MPS] Chunk fillBuffer into 4Gb slices #164108

Closed

malfet added a commit that referenced this pull request Sep 29, 2025

[MPS] Chunk fillBuffer into 4Gb slices

f7609a2

To avoid regression on MacOS 26 Fixes #164093 ghstack-source-id: 65e18a6 Pull Request resolved: #164108

atalman approved these changes Sep 29, 2025


Skylion007 reviewed Sep 29, 2025


Skylion007 approved these changes Sep 29, 2025


pytorchmergebot added the merging label Sep 29, 2025

pytorchmergebot closed this in 8f32adc Sep 29, 2025

pytorchmergebot added Merged and removed merging labels Sep 29, 2025

pytorchbot mentioned this pull request Oct 1, 2025

[MPSHooks] Release pending command encoder #164365

Merged

pytorchbot mentioned this pull request Oct 1, 2025

[v.2.9.0] Release Tracker #162497

Closed

github-actions bot deleted the gh/malfet/538/head branch November 1, 2025 02:19
