PyTorch VS2022 build Windows binary illegal instruction on AVX2(max ISA level) CPU · Issue #145042 · pytorch/pytorch · GitHub
@xuhancn

Description

🐛 Describe the bug

Background

The Intel team found that the PyTorch Windows XPU nightly build binary hits an illegal instruction on CPUs whose maximum ISA level is AVX2. The original issue is here: intel/torch-xpu-ops#1173

Reproduce steps:
Install the PyTorch Windows XPU binary, then run it on an Intel client CPU whose maximum ISA level is AVX2.

For example, using the 2024-12-11 nightly build:

python -m pip install https://download.pytorch.org/whl/nightly/xpu/torch-2.6.0.dev20241211%2Bxpu-cp39-cp39-win_amd64.whl

Reproduce code:

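import torch
class TestClass:
    def test_grid_sampler_2d(self):
        torch.manual_seed(0)
        # b is the sampling grid (N, H_out, W_out, 2); a is the input (N, C, H_in, W_in).
        b = torch.rand(2, 13, 10, 2, dtype=torch.float64)
        a = torch.rand(2, 3, 5, 20, dtype=torch.float64)
        # interpolation_mode=0 is bilinear; padding_mode=0 is zeros.
        torch.grid_sampler_2d(a, b, interpolation_mode=0, padding_mode=0, align_corners=False)

# Run the test directly so the snippet reproduces the crash without pytest.
TestClass().test_grid_sampler_2d()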

Running this code triggers the illegal instruction.
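Because the crash kills the interpreter, it can help to run the reproducer in a child process and inspect its exit code. The following is a minimal sketch, not part of the original report: the helper names `classify_returncode` and `run_repro` are hypothetical, and it assumes the crash surfaces as the Windows status code `0xC000001D` (STATUS_ILLEGAL_INSTRUCTION) or, on POSIX systems, as termination by SIGILL.

```python
import subprocess
import sys

# Windows NTSTATUS value for an illegal instruction, seen as an unsigned exit code.
STATUS_ILLEGAL_INSTRUCTION = 0xC000001D
# POSIX signal number for SIGILL; subprocess reports it as a negative return code.
SIGILL_NUMBER = 4

# The reproducer from the report, flattened so it can be passed to `python -c`.
REPRO = """
import torch
torch.manual_seed(0)
b = torch.rand(2, 13, 10, 2, dtype=torch.float64)
a = torch.rand(2, 3, 5, 20, dtype=torch.float64)
torch.grid_sampler_2d(a, b, interpolation_mode=0, padding_mode=0, align_corners=False)
"""


def classify_returncode(returncode: int) -> str:
    """Map a child-process return code to a coarse outcome label."""
    if returncode == 0:
        return "ok"
    # Mask to 32 bits so signed and unsigned forms of the NTSTATUS compare equal.
    if (returncode & 0xFFFFFFFF) == STATUS_ILLEGAL_INSTRUCTION:
        return "illegal instruction"
    if returncode == -SIGILL_NUMBER:
        return "illegal instruction"
    return "other failure"


def run_repro() -> str:
    """Run the reproducer in a child interpreter and classify how it exited."""
    result = subprocess.run([sys.executable, "-c", REPRO])
    return classify_returncode(result.returncode)
```

Calling `run_repro()` on an affected machine would be expected to return `"illegal instruction"`. If `torch` is not installed, the child fails on import and the result is `"other failure"`.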

Debug Note:

  1. The Intel team built the PyTorch Windows XPU binary locally, but could not reproduce the issue with the local build.

  2. The Intel team debugged the official binary with WinDBG.

Image

WinDBG caught the fault: the binary contains a generated AVX512 instruction, which raises an illegal instruction exception on a CPU whose maximum ISA level is AVX2.
However, we could not trace the issue to source level because we lack the debug symbol (.pdb) files. The PyTorch build has some issues generating the .pdb files.

  3. We tried switching the PyTorch Windows CPU-only build to VS2022: [don't merge] use vs2022 build windows cpu wheel. #143791
    We tested the PyTorch Windows CPU-only binary built by that PR, and the issue reproduces.
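Without .pdb files, one rough way to narrow down where AVX512 code ended up is to scan a text disassembly of the binary for 512-bit register names. The sketch below is an illustration only: `find_zmm_lines` is a hypothetical helper, the sample listing is made up, and the heuristic misses AVX512 instructions that operate only on xmm/ymm registers.

```python
import re

# zmm0..zmm31 are the 512-bit vector registers introduced with AVX512.
ZMM_PATTERN = re.compile(r"\bzmm(?:[12]?\d|3[01])\b", re.IGNORECASE)


def find_zmm_lines(disassembly: str) -> list[str]:
    """Return the disassembly lines that reference a zmm register."""
    return [
        line.strip()
        for line in disassembly.splitlines()
        if ZMM_PATTERN.search(line)
    ]


# Made-up listing for illustration; real input would come from a disassembler.
sample = """
0000000180001000: vmovupd ymm0, ymmword ptr [rcx]
0000000180001004: vaddpd zmm1, zmm2, zmm3
0000000180001010: ret
"""

for hit in find_zmm_lines(sample):
    print(hit)
```

Any hit inside a kernel that is supposed to be compiled for AVX2 would be a candidate for the faulting code path.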

Conclusion:

  1. The issue occurs only with binaries from the official PyTorch build system, and only when the Visual Studio version is 2022.
  2. The illegal instruction is caused by the compiler generating an AVX512 instruction in code built for the AVX2 ISA.
  3. Because of item 2, the issue occurs only on CPUs whose maximum ISA level is AVX2.
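Since the conclusion hinges on the Visual Studio version, it may be useful to check which compiler produced a given wheel. The sketch below assumes that the build summary returned by `torch.__config__.show()` contains a line of the form `MSVC <digits>` whose first four digits are the `_MSC_VER` value. The helper names are hypothetical and the sample text is illustrative, not copied from an affected build.

```python
import re
from typing import Optional


def msvc_version_from_config(config_text: str) -> Optional[int]:
    """Extract the four-digit _MSC_VER prefix from a build summary, if present."""
    match = re.search(r"MSVC\s+(\d{4})\d*", config_text)
    return int(match.group(1)) if match else None


def visual_studio_release(msc_ver: int) -> str:
    """Map an _MSC_VER value to a Visual Studio release name."""
    if 1920 <= msc_ver <= 1929:
        return "VS2019"
    if 1930 <= msc_ver <= 1949:
        return "VS2022"
    return "unknown"


# Illustrative build summary; the real text comes from torch.__config__.show().
sample_config = """PyTorch built with:
  - C++ Version: 201703
  - MSVC 192930154
"""

version = msvc_version_from_config(sample_config)
if version is not None:
    print(version, visual_studio_release(version))
```

With PyTorch installed, passing `torch.__config__.show()` instead of `sample_config` would report the toolset for that wheel, provided the summary has the assumed format.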

Solution

Option 1: Fix the official PyTorch build system, if we want to switch the PyTorch Windows CPU build to VS2022 in the future.
Because we can't reproduce the issue locally, we suggest involving the Microsoft PyTorch team or the Microsoft Visual Studio team. The PR that reproduces the issue is: #143791

Option 2: The Intel PyTorch team downgrades the PyTorch Windows XPU build to VS2019.

Versions

NA

cc @peterjc123 @mszhanyi @skyline75489 @nbcsm @iremyux @Blackhex @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10

Metadata

Assignees

No one assigned

    Labels

    low priority (We're unlikely to get around to doing this in the near future)
    module: cpu (CPU specific problem (e.g., perf, algorithm))
    module: windows (Windows support for PyTorch)
    triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
