Release 2.5.1 validations checklist and cherry-picks #138876

kit1980 · 2024-10-25T01:02:15Z

Similar to #137492

Manual validations:

Python 3.13 wheel validation @kit1980

  pip3 install torch==2.5.1 --index-url https://download.pytorch.org/whl/test/cu124
  
  Successfully installed MarkupSafe-3.0.2 filelock-3.13.1 fsspec-2024.6.1 jinja2-3.1.4 mpmath-1.3.0 networkx-3.3 nvidia-cublas-cu12-12.4.5.8 nvidia-cuda-cupti-cu12-12.4.127 nvidia-cuda-nvrtc-cu12-12.4.127 nvidia-cuda-runtime-cu12-12.4.127 nvidia-cudnn-cu12-9.1.0.70 nvidia-cufft-cu12-11.2.1.3 nvidia-curand-cu12-10.3.5.147 nvidia-cusolver-cu12-11.6.1.9 nvidia-cusparse-cu12-12.3.1.170 nvidia-nccl-cu12-2.21.5 nvidia-nvjitlink-cu12-12.4.127 nvidia-nvtx-cu12-12.4.127 setuptools-70.0.0 sympy-1.13.1 torch-2.5.1+cu124 typing-extensions-4.12.2
  
  python3 -c "import torch; print(torch.__version__, torch.cuda.is_available())"
  
  /home/sdym/miniconda3/envs/py313/lib/python3.13/site-packages/torch/_subclasses/functional_tensor.py:295: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:84.)
  cpu = _conversion_method_template(device=torch.device("cpu"))
  2.5.1+cu124 True
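The check above prints a version string such as `2.5.1+cu124`. If this validation were scripted, the string could be split into its release part and its local build tag and compared against the expected values. The following is only a minimal sketch; `split_version` and `matches_release` are hypothetical helper names and are not part of torch.

```python
from typing import Optional, Tuple


def split_version(version: str) -> Tuple[str, Optional[str]]:
    """Split a string like '2.5.1+cu124' into ('2.5.1', 'cu124').

    The local tag is None when the string has no '+' part.
    """
    release, sep, local = version.partition("+")
    return release, (local if sep else None)


def matches_release(version: str, release: str, local: Optional[str] = None) -> bool:
    """Return True if `version` has the expected release and, when given, local tag."""
    got_release, got_local = split_version(version)
    if got_release != release:
        return False
    return local is None or got_local == local


if __name__ == "__main__":
    # Version strings copied from the outputs in this checklist.
    print(matches_release("2.5.1+cu124", "2.5.1", "cu124"))
    print(matches_release("2.5.1", "2.5.1"))
```

In a real run, the first argument would come from `torch.__version__` rather than a literal.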

CUDA pypi binaries with slimmed dependencies are usable in standard AWS containers amazonlinux2023 @kit1980

  docker run -it --gpus=all --net=host amazonlinux:2023 bash
            
  dnf update
  dnf -y install python3-pip
  pip3 install torch==2.5.1 --index-url https://download.pytorch.org/whl/test/cu124
  python3 -c "import torch; print(torch.cuda.is_available())"

CUDA pypi binaries with slimmed dependencies are usable on almalinux/9-base @kit1980

  docker run -it --gpus=all --net=host almalinux/9-base bash
            
  dnf update
  dnf -y install python3-pip
  pip3 install torch==2.5.1 --index-url https://download.pytorch.org/whl/test/cu124
  python3 -c "import torch; print(torch.cuda.is_available())"

CUDA pypi binaries with slimmed dependencies are usable on default latest Ubuntu @kit1980

  docker run -it --gpus=all --net=host ubuntu bash

  apt-get update
  apt-get install python3-pip
  pip3 install torch==2.5.1 --index-url https://download.pytorch.org/whl/test/cu124 --break-system-packages
  python3 -c "import torch; print(torch.__version__, torch.cuda.is_available())"
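The three container checks above follow the same pattern: start an image, install pip, install the test wheel, and print CUDA availability. A rough sketch of how they could be generated from one table is shown below. It only builds and prints the `docker run` commands and does not execute them. The helper name `build_command`, the `--rm` flag, the `-y` flags, and chaining with `&&` are assumptions added so the commands could run non-interactively; they are not part of the original manual steps.

```python
import shlex

INDEX_URL = "https://download.pytorch.org/whl/test/cu124"

# (image, setup commands, extra pip flags), following the checklist items.
# The -y flags are an assumption to avoid interactive prompts.
TARGETS = [
    ("amazonlinux:2023", ["dnf -y update", "dnf -y install python3-pip"], ""),
    ("almalinux/9-base", ["dnf -y update", "dnf -y install python3-pip"], ""),
    ("ubuntu", ["apt-get update", "apt-get install -y python3-pip"],
     "--break-system-packages"),
]


def build_command(image, setup, pip_flags=""):
    """Return the docker argv that installs torch 2.5.1 and prints CUDA status."""
    install = f"pip3 install torch==2.5.1 --index-url {INDEX_URL} {pip_flags}".strip()
    check = ('python3 -c "import torch; '
             'print(torch.__version__, torch.cuda.is_available())"')
    # Chain the steps so that a failing step stops the rest.
    script = " && ".join([*setup, install, check])
    return ["docker", "run", "--rm", "--gpus=all", "--net=host",
            image, "bash", "-c", script]


if __name__ == "__main__":
    for image, setup, pip_flags in TARGETS:
        # Print a copy-pasteable command line; nothing is executed here.
        print(shlex.join(build_command(image, setup, pip_flags)))
```

Each argv list could be passed to `subprocess.run` on a host with Docker and GPU support.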

PyTorch can be imported without a warning on aarch64 system @malfet

% docker run --rm -it python:3.11 bash -c "pip install numpy torch --quiet --index-url https://download.pytorch.org/whl/test/cpu;python -c 'import torch;print(torch.__version__)'"
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
2.5.1

PyTorch MPS regression crash @malfet

% python -c "import torch; x=torch.rand(16, 16, device='mps', dtype=torch.float16); print(torch.__version__, torch.backends.mps.is_macos_or_newer(15, 0)); print(x[:,0:2].view(torch.float32) + 1)"
2.5.1 True
tensor([[1.0000],
        [1.0006],
        [1.0008],
        [1.0001],
        [1.0000],
        [1.0011],
        [1.0000],
        [1.0000],
        [1.0000],
        [1.0016],
        [1.0000],
        [1.0000],
        [1.0001],
        [1.0001],
        [1.0000],
        [1.0004]], device='mps:0')

Verify that "[SDPA-CUDNN] Make CuDNN Attention Opt in" (#138522) actually does what it is supposed to do and mitigates the issues listed in the PR description and in "[Performance] [CuDNN-Attention] CuDNN backend should return the output in the same stride order as input Query" (#138340) @drisspg
Verify that "Disabling amp context when invoking compiler" (#138659) fixes "Crash When Using torch.compile with Math scaled_dot_product_attention in AMP Mode" (#133974) @eellison

Skylion007 · 2024-10-27T15:38:40Z

#139005 Sigh... our dependency graph is a bit messed up

kit1980 · 2024-10-27T16:46:44Z

@xuhancn Please do not edit this issue.

kit1980 · 2024-10-27T16:49:37Z

@Skylion007 We should fix that, but 2.5.1 is an emergency patch release for specific regressions that should be completed very soon. If anyone still uses Python 3.8, they should explicitly limit the upper version of torch.
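One way to express such a limit is a requirement line with an environment marker, so the cap applies only on Python 3.8. The sketch below builds that line; `capped_requirement` is a hypothetical helper, and the bound `2.5` in the example is a placeholder, since this thread does not state which upper version to use.

```python
def capped_requirement(package: str, upper: str, python_version: str) -> str:
    """Build a requirement line that caps `package` only on one Python version."""
    return f'{package}<{upper}; python_version == "{python_version}"'


if __name__ == "__main__":
    # "2.5" is a placeholder bound, not a value taken from this issue.
    print(capped_requirement("torch", "2.5", "3.8"))
```

The resulting line can go into a requirements file next to an uncapped `torch` entry for other Python versions.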

xuhancn · 2024-10-28T14:46:16Z

> @xuhancn Please do not edit this issue.

Got it, sorry.

malfet · 2024-10-28T16:02:04Z

See #138971 - this would be a regression between 2.5.0 and 2.5.1 and we should avoid it

huydhn · 2024-10-29T21:03:39Z

All validations for 2.5.1 have been done. Thanks @kit1980 @atalman @malfet

malfet added the oncall: releng In support of CI and Release Engineering label Oct 25, 2024

github-project-automation bot added this to PyTorch OSS Dev Infra Oct 25, 2024

kit1980 added this to the 2.5.1 milestone Oct 25, 2024

huydhn closed this as completed Oct 29, 2024

github-project-automation bot moved this to Done in PyTorch OSS Dev Infra Oct 29, 2024

kiwik mentioned this issue Nov 14, 2024

Monthly issue metrics report kiwik/os-version-checker#65

Closed
