[inductor] [cpu] nn.Tanhshrink-atan2 output inconsistent results with eager #148241

Closed
shaoyuyoung opened this issue Mar 1, 2025 · 0 comments
Assignees: leslie-fang-intel
Labels: oncall: cpu inductor (CPU Inductor issues for Intel team to triage), oncall: pt2

Comments

@shaoyuyoung (Contributor) commented Mar 1, 2025

🐛 Describe the bug

Symptom: when `nn.Tanhshrink` and `atan2` are used together, the compiled output is inconsistent with eager.
Device backend: CPP only (the Triton backend matches eager).
Repro:

import torch
import torch.nn as nn
from torch._inductor import config

config.fallback_random = True  # fall back to eager RNG inside compiled code (not strictly needed for this repro)
torch.set_grad_enabled(False)
torch.manual_seed(0)



class Model(torch.nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.shrink = nn.Tanhshrink()

    def forward(self, x):
        x = self.shrink(x)
        x = torch.atan2(x, x)
        return x


model = Model()


x = torch.randn(1, 3, 64, 64)

inputs = [x]



def run_test(model, inputs, backend):
    if backend != "eager":
        model = torch.compile(model, backend=backend)
    torch.manual_seed(0)
    output = model(*inputs)
    return output


output = run_test(model, inputs, 'eager')
c_output = run_test(model, inputs, 'inductor')

print(torch.allclose(output, c_output, 1e-3, 1e-3, equal_nan=True))
print(torch.max(torch.abs(output - c_output)))

Error logs

CPP

False
tensor(3.1416)

Triton

True
tensor(0., device='cuda:0')

Versions

nightly 20250225

cc @chauhang @penguinwu

@leslie-fang-intel leslie-fang-intel self-assigned this Mar 1, 2025
@leslie-fang-intel added the oncall: cpu inductor label Mar 1, 2025
majing921201 pushed a commit to majing921201/pytorch that referenced this issue Mar 4, 2025
**Summary**
Fixes pytorch#148241. The previous vectorized code generation for `tanh` used a decomposed implementation, leading to numerical differences that were further amplified by `atan2`. For example, in the given test case the eager output after `tanh` at `[0, 0, 11, 47]` was `-5.820766091346741e-10`, while the compiled output was `1.4319084584712982e-08`, resulting in different `atan2` outputs of `-2.3561` and `0.7853`. This is fixed by switching to the Sleef implementation.

**Test Plan**
```
python -u -m pytest -s -v test/inductor/test_cpu_repro.py -k test_tanh_atan2
```

Pull Request resolved: pytorch#148254
Approved by: https://github.com/malfet, https://github.com/jgong5
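
For context, here is a minimal standalone sketch (not part of the original issue or the fix) showing why a tiny sign difference after the tanh step gets amplified by `atan2`. It reuses the two intermediate values quoted in the commit summary above:

```python
import torch

# Intermediate values at index [0, 0, 11, 47] quoted in the commit summary:
eager_val = torch.tensor(-5.820766091346741e-10)     # eager result (tiny negative)
compiled_val = torch.tensor(1.4319084584712982e-08)  # compiled result (tiny positive)

# atan2(x, x) depends only on the sign of x: +pi/4 for x > 0, -3*pi/4 for x < 0.
# A near-zero value whose sign flips therefore produces a ~pi-sized output difference,
# which matches the max abs error of 3.1416 reported in the error logs.
print(torch.atan2(eager_val, eager_val))        # tensor(-2.3562)
print(torch.atan2(compiled_val, compiled_val))  # tensor(0.7854)
```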