MPS Regression when rendering LTXVideo (after pytorch2.4.1) #141471
Can you please provide the ComfyUI workflow you're using? In the meantime I'll try to repro with the model directly. @pytorchbot label "module: mps" "module: correctness (silent)"
I can confirm it fails in the same way using the LTX repo code with inference.py modified to use MPS.
The MPS inference script fails with PyTorch 2.5.1 and nightlies (last nightly tried 2024-11-22) in the same way as reported by the OP.
Found the workflow in Lightricks/LTX-Video#5. I was also able to reproduce. I'll propose a device selection option downstream, and will add repro steps using the model directly (without Comfy) here after that. Attempting to repro a good result on 2.4.1; if successful, I'll run a bisect to find the culprit.
I was able to reproduce a good result on 2.4.1. Running a bisect to identify the culprit.
Culprit commit identified as 861bdf9 (PR #128393). Bisect replay log
For the device option proposal please see Lightricks/LTX-Video#25. With that proposal applied, the failure mode via LTX-Video can be reproduced by running:

```
CKPT_DIR=~/.cache/huggingface/hub/models--Lightricks--LTX-Video/snapshots/a5ab70cf0b89a0b90dfafe3556c24f1b4767bdc8
PROMPT="The waves crash against the pearly white beach of the shoreline, sending spray high into the tropical air. The palm trees streches high, with beautiful green leaves and bright colors. The water is a clear blue-green, with white foam where the waves break and details of the ocean floor visible below the pure water. The sky is blue, with a few white clouds dotting the horizon."
python inference.py --ckpt_dir "${CKPT_DIR:?}" --prompt "${PROMPT:?}" --seed 0 --num_frames 9 --bfloat16 --device mps
```

Still need to figure out where in PyTorch the bug is.
Found a crumb.

```diff
diff --git a/aten/src/ATen/native/mps/operations/Convolution.mm b/aten/src/ATen/native/mps/operations/Convolution.mm
index f0aac14814b..7f4e611898f 100644
--- a/aten/src/ATen/native/mps/operations/Convolution.mm
+++ b/aten/src/ATen/native/mps/operations/Convolution.mm
@@ -125,7 +125,7 @@ static Tensor _mps_convolution_impl(const Tensor& input_t_,
                                     int64_t groups,
                                     std::optional<IntArrayRef> input_shape) {
   const bool is_macOS_13_2_or_newer = is_macos_13_or_newer(MacOSVersion::MACOS_VER_13_2_PLUS);
-  const bool is_macOS_15_0_or_newer = is_macos_13_or_newer(MacOSVersion::MACOS_VER_15_0_PLUS);
+  const bool is_macOS_15_0_or_newer = false;
   Tensor input_t = input_t_;
   if (!is_macOS_15_0_or_newer) {
     input_t = input_t.contiguous();
```

produces good golden.
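The invariant that patch points at can be stated in plain PyTorch terms: Conv3d output should not depend on the input tensor's memory format. A minimal CPU-only sketch of that invariant (illustrative only; the actual failure shows up on the MPS backend, not on CPU):

```python
import torch

torch.manual_seed(0)
conv = torch.nn.Conv3d(4, 8, kernel_size=3, padding=1)
x = torch.randn(2, 4, 5, 6, 7)

# Run the same convolution on the default NCDHW layout and on
# channels_last_3d (NDHWC strides, same underlying data).
out_contig = conv(x)
x_cl3d = x.to(memory_format=torch.channels_last_3d)
out_cl3d = conv(x_cl3d)

# The values must agree regardless of layout; this is the property
# that broke for Conv3d on the MPS backend.
print(torch.allclose(out_contig, out_cl3d, atol=1e-5))
```

On the broken PyTorch builds, the MPS equivalent of this comparison (against a CPU reference) fails once the input is in the channels-last-3d layout.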
Are you also seeing the slower s/it on the failing versions, as I was? It seems odd, since this commit was meant as a performance improvement; it's strange that it both breaks this path and is slower.
I've seen the slower speed with the more recent (broken) PyTorch. One thought: because the data becomes garbage at some point, it could be producing denormal values, invalid values, NaNs, etc., all of which might hit much slower paths as they're operated on further.
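One way to probe that theory (a hypothetical debugging helper, not code from the thread) is to scan intermediate tensors for NaNs, infs, and subnormals:

```python
import torch

def health_check(name, t):
    # Hypothetical helper: count NaNs, infs, and subnormal values in a tensor.
    f = t.float()
    n_nan = torch.isnan(f).sum().item()
    n_inf = torch.isinf(f).sum().item()
    tiny = torch.finfo(torch.float32).tiny  # smallest normal float32
    n_sub = ((f != 0) & (f.abs() < tiny)).sum().item()
    print(f"{name}: nan={n_nan} inf={n_inf} subnormal={n_sub}")
    return n_nan, n_inf, n_sub

# Demo on a tensor containing one of each problem value
# (1e-45 is representable only as a float32 subnormal).
x = torch.tensor([1.0, float("nan"), float("inf"), 1e-45])
health_check("probe", x)
```

Sprinkling a helper like this between model stages would show where the garbage first appears and whether subnormals could explain the slowdown.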
Torch 2.5.1 has a performance regression from 2.4.1 (for MPS) and uses significantly more memory in image-generation workloads. Both are significantly improved in the nightlies, which are a little faster than 2.4.1 in my test cases, but there's still a small memory increase compared to 2.4.1. See #139389.
Well, it's not that simple though. I tried downgrading to 2.4.1 and couldn't even get Comfy to boot anymore after that, so I'm going back to torch nightly, it seems. I hope the issue can be fixed; going back to torch==2.4.1 doesn't seem to be an option for me, probably due to dependency issues.
Yeah, Comfy works fine on 2.4.1 out of the box, so I'd imagine you've got some other dependency breaking compatibility; I know a few extensions depend on the autocast support added in more recent versions. I too am hoping the team can deduce what's going on here. It's definitely over my head. Glad to see that hvaara managed to track it down to a single file/value, though it still seems like we don't know why that code breaks it; I don't understand enough about the stride API to even guess, lol. I looked through the code and the is_macOS_15_0_or_newer variable is only used in a few spots and doesn't seem to affect much outside of that scope, so hopefully it's traceable.
@hvaara a bit unrelated, but when I tried to reproduce it with nightlies, using
until I disabled autocast completely...
Unrelated, but still a bug 😄 Tried to RCA it; prior to #139390 you'd get a warning when you run LTX-Video without
because it would then use mixed precision with bf16. Autocast was disabled because bf16 wasn't supported. After #139390 bf16 is supported, so you no longer get the warning, but it isn't really supported for SDPA. With

```diff
diff --git a/aten/src/ATen/autocast_mode.cpp b/aten/src/ATen/autocast_mode.cpp
index 1129892dd25..6649708c706 100644
--- a/aten/src/ATen/autocast_mode.cpp
+++ b/aten/src/ATen/autocast_mode.cpp
@@ -236,6 +236,7 @@ TORCH_LIBRARY_IMPL(aten, AutocastMPS, m) {
   KERNEL_MPS(chain_matmul, lower_precision_fp)
   KERNEL_MPS(linalg_multi_dot, lower_precision_fp)
   KERNEL_MPS(lstm_cell, lower_precision_fp)
+  KERNEL_MPS(scaled_dot_product_attention, lower_precision_fp)
   // fp32
   KERNEL_MPS(acos, fp32)
```

autocast works for me.
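Those KERNEL_MPS lines register ops on autocast's lower-precision list for the MPS backend. The same mechanism exists for CPU, which makes the effect easy to see without Apple hardware; a minimal sketch (CPU used purely for illustration):

```python
import torch

a = torch.randn(8, 16)
b = torch.randn(16, 8)

with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    # matmul is on the lower-precision list, so autocast runs it in bf16
    out = a @ b

# Outside the autocast context the same op stays in float32
out_fp32 = a @ b

print(out.dtype, out_fp32.dtype)
```

Ops not on the list keep running in fp32 inside the context, which is why adding scaled_dot_product_attention to the registration changes its behavior under autocast.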
Think I got to the bottom of it. Looks like the root cause is a memory format issue ( Fix incoming.
Looking forward to it; trying to learn more about these types of weird bugs.
Fix proposal in #141780. xref https://gist.github.com/hvaara/340bc4bf740d97c15351db7b6759643d |
Hi @hvaara, did it work for you?
@haqatak yes, it worked for me, but I might have run it differently than you. What's the command/code you ran? I'd like to try to repro so I can debug it. |
…`nn.Conv3d` (pytorch#141780) When the input tensor to Conv3d is in the channels_last_3d memory format the Conv3d op will generate incorrect output (see example image in pytorch#141471). This PR checks if the op is 3d, and then attempts to convert the input tensor to contiguous. Added a regression test that verifies the output by running the same op on the CPU. I'm unsure if Conv3d supports the channels last memory format after pytorch#128393. If it does, we should consider updating the logic to utilize this as it would be more efficient. Perhaps @DenisVieriu97 knows or has more context? Fixes pytorch#141471 Pull Request resolved: pytorch#141780 Approved by: https://github.com/malfet
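As a rough illustration of what that commit message describes (a sketch, not the PR's actual code): a channels_last_3d tensor is not contiguous in the default NCDHW sense, and calling `.contiguous()` restores the layout the old MPS kernel path expects.

```python
import torch

# A 5-D tensor in channels_last_3d (NDHWC strides) -- the layout that
# triggered the bad Conv3d output on MPS.
x = torch.randn(1, 3, 4, 5, 6).to(memory_format=torch.channels_last_3d)
print(x.is_contiguous())                                      # default-layout check
print(x.is_contiguous(memory_format=torch.channels_last_3d))  # layout it's actually in

# The fix effectively does this for the 3d case before handing the
# tensor to the MPS convolution kernel:
y = x.contiguous()
print(y.is_contiguous())
```

This also explains why the workaround of forcing `is_macOS_15_0_or_newer = false` earlier in the thread produced good output: it re-enabled the unconditional `input_t.contiguous()` call.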
Hm, I can't test this from nightlies -- both
Unfortunately #141780 didn't make it into torch-2.6.0.dev20241202, but it should be in torch-2.6.0.dev20241203. The torchvision release process for macOS does indeed look broken. From https://download.pytorch.org/whl/nightly/cpu/torchvision/ I can see that the last successful release appears to be torchvision-0.20.0.dev20241126. The release process workflow also fails with
If vision still fails to promote a nightly, you might have to compile it manually after installing the torch nightly if you want to test the changes from #141780.
Ah, thanks! I wasn't sure if the nightly builds failing warranted filing an issue, now I know :) |
The nightly build issue seems to be fixed, judging by the nightly downloads. On my first test, the regression looks to be addressed: I generated clean, clear video without the noise issue, using:
Awesome! Thanks a lot everyone for collaborating on this issue! 😄 |
This worked for me:

```
pip install --pre torch==2.6.0.dev20241205 torchvision==0.20.0.dev20241205 torchaudio==2.5.0.dev20241205 --extra-index-url https://download.pytorch.org/whl/nightly/cpu
```
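After installing, a quick sanity check (illustrative) that the environment actually picked up the intended build:

```python
import torch

# Nightly builds report versions like '2.6.0.devYYYYMMDD'
print(torch.__version__)

# True only on Apple Silicon macOS with a working MPS backend
print(torch.backends.mps.is_available())
```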
What's the latest pip install command for this to work? The one from @peterdn1 didn't work:

ERROR: Ignored the following yanked versions: 0.1.6, 0.1.7, 0.1.8, 0.1.9, 0.2.0, 0.2.1, 0.2.2, 0.2.2.post2, 0.2.2.post3, 0.11.0, 0.15.0

On Sequoia 15.1.1 with an M3 Max.
@pachieh please post the command you ran, including the entire output, on https://gist.github.com/ and add a link here. If you try a new environment (venv, conda, or whatever you're using), does it work when you install with the recommended nightly procedure from https://pytorch.org/ (i.e.
Confirmed -- this is fixed in release 2.6:
🐛 Describe the bug
Testing on Apple MPS using ComfyUI: several PyTorch versions, including nightly and 2.5.1, produce nothing but noise, while on PyTorch 2.4.1 the LTX-Video model renders correctly without any issues.
Oddly, the broken PyTorch versions are also ~40% slower with LTX-Video (15.89s/it vs 11.31s/it). Not sure how to narrow down what's causing the completely borked results and slower iterations, but it's definitely due to PyTorch changes from 2.4.1 to 2.5.1.
2.4.1 Results. 11.31s/it

2.5.1 Results. 15.89s/it

Versions
Working Version:
PyTorch version: 2.4.1
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A
OS: macOS 15.2 (arm64)
GCC version: Could not collect
Clang version: 19.1.3
CMake version: version 3.31.1
Libc version: N/A
Python version: 3.11.5 (main, Sep 11 2023, 08:31:25) [Clang 14.0.6 ] (64-bit runtime)
Python platform: macOS-15.2-arm64-arm-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
CPU:
Apple M3 Pro
Versions of relevant libraries:
[pip3] mypy-extensions==1.0.0
[pip3] numpy==1.26.4
[pip3] torch==2.4.1
[pip3] torchaudio==2.4.1
[pip3] torchsde==0.2.6
[pip3] torchvision==0.19.1
[conda] numpy 1.24.3 py311hb57d4eb_0
[conda] numpy-base 1.24.3 py311h1d85a46_0
[conda] numpydoc 1.5.0 py311hca03da5_0
[conda] onnx2torch 1.5.14 pypi_0 pypi
[conda] torch 2.3.0 pypi_0 pypi
[conda] torchinfo 1.8.0 pypi_0 pypi
[conda] torchvision 0.18.0 pypi_0 pypi
Failing version (also failed on the nightly version I tried yesterday):
PyTorch version: 2.5.1
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A
OS: macOS 15.2 (arm64)
GCC version: Could not collect
Clang version: 19.1.3
CMake version: version 3.31.1
Libc version: N/A
Python version: 3.11.5 (main, Sep 11 2023, 08:31:25) [Clang 14.0.6 ] (64-bit runtime)
Python platform: macOS-15.2-arm64-arm-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
CPU:
Apple M3 Pro
Versions of relevant libraries:
[pip3] mypy-extensions==1.0.0
[pip3] numpy==1.26.4
[pip3] torch==2.5.1
[pip3] torchaudio==2.5.1
[pip3] torchsde==0.2.6
[pip3] torchvision==0.20.1
[conda] numpy 1.24.3 py311hb57d4eb_0
[conda] numpy-base 1.24.3 py311h1d85a46_0
[conda] numpydoc 1.5.0 py311hca03da5_0
[conda] onnx2torch 1.5.14 pypi_0 pypi
[conda] torch 2.3.0 pypi_0 pypi
[conda] torchinfo 1.8.0 pypi_0 pypi
[conda] torchvision 0.18.0 pypi_0 pypi
cc @ezyang @gchanan @zou3519 @kadeng @msaroufim @kulinseth @albanD @malfet @DenisVieriu97 @jhavukainen