nn.Linear forward error on AArch64 if the out_features equals to 1 · Issue #110149 · pytorch/pytorch · GitHub
@imzhuhl


🐛 Describe the bug

I'm on an AArch64 device, and I get a runtime error when running a Linear layer's forward function if its out_features is 1.

Here is the code:

import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.fc = nn.Linear(in_dim, out_dim)

    def forward(self, x):
        x = self.fc(x)
        return x

# out_dim = 1 is the case that triggers the error
model = Net(512, 1)
x = torch.randn((2, 100, 512))
model.eval()
with torch.no_grad():
    model(x)

Here is the error message:

  File "/root/alibaba/xcode-ml/pytorch/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: could not create a primitive descriptor for a matmul primitive
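
To narrow this down, it may help to check whether other out_features values hit the same error on the affected build. A minimal sketch, assuming the same input shape as in the code above; the helper name try_linear and the comparison sizes 2 and 8 are illustrative choices, not part of the original report:

```python
import torch
import torch.nn as nn

def try_linear(out_dim, in_dim=512):
    """Run one forward pass; return None on success, else the error text."""
    layer = nn.Linear(in_dim, out_dim).eval()
    x = torch.randn(2, 100, in_dim)
    try:
        with torch.no_grad():
            layer(x)
    except RuntimeError as err:
        return str(err)
    return None

# 1 is the reported failing case; 2 and 8 are arbitrary comparison points.
results = {out_dim: try_linear(out_dim) for out_dim in (1, 2, 8)}
for out_dim, err in results.items():
    print(f"out_features={out_dim}: {'ok' if err is None else err}")
```

On a build that is not affected, every entry should print "ok"; on an affected build, the entry for out_features=1 would be expected to carry the primitive descriptor message.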

I am using the PyTorch main branch, built from source with oneDNN and ACL (Arm Compute Library).
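
Since the build configuration matters here, a quick way to record what the installed build reports is sketched below. It only uses torch.backends.mkldnn.is_available() and torch.__config__.show(); whether ACL is mentioned in that output depends on the build, so that part is not assumed:

```python
import torch

# True if this PyTorch build was compiled with oneDNN
# (exposed in the Python API under its older name, mkldnn).
has_onednn = torch.backends.mkldnn.is_available()
print("oneDNN (mkldnn) available:", has_onednn)

# Multi-line string describing how this build was configured.
build_info = torch.__config__.show()
print(build_info)
```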

Versions

Collecting environment information...
PyTorch version: 2.2.0a0+gita51b8df
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: Alibaba Cloud Linux release 3 (Soaring Falcon)  (aarch64)
GCC version: (GCC) 10.2.1 20200825 (Alibaba 10.2.1-3.5 2.32)
Clang version: Could not collect
CMake version: version 3.27.2
Libc version: glibc-2.32

Python version: 3.10.12 (main, Jul  5 2023, 18:45:42) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-5.10.134-14.1.al8.aarch64-aarch64-with-glibc2.32
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: False

CPU:
Architecture:        aarch64
Byte Order:          Little Endian
CPU(s):              32
On-line CPU(s) list: 0-31
Thread(s) per core:  1
Core(s) per socket:  32
Socket(s):           1
NUMA node(s):        1
Vendor ID:           ARM
BIOS Vendor ID:      Alibaba Cloud
Model:               0
Model name:          Neoverse-N2
BIOS Model name:     virt-rhel7.6.0
Stepping:            r0p0
CPU MHz:             2750.000
CPU max MHz:         2750.0000
CPU min MHz:         2750.0000
BogoMIPS:            100.00
L1d cache:           64K
L1i cache:           64K
L2 cache:            1024K
L3 cache:            65536K
NUMA node0 CPU(s):   0-31
Flags:               fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm ssbs sb dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh

Versions of relevant libraries:
[pip3] numpy==1.23.5
[pip3] pytorch-wpe==0.0.1
[pip3] rotary-embedding-torch==0.2.7
[pip3] torch==2.2.0a0+gita51b8df
[pip3] torch-complex==0.4.3
[pip3] torchaudio==2.2.0a0+4dc06ce
[pip3] torchvision==0.15.2
[conda] numpy                     1.23.5                   pypi_0    pypi
[conda] pytorch-wpe               0.0.1                    pypi_0    pypi
[conda] rotary-embedding-torch    0.2.7                    pypi_0    pypi
[conda] torch                     2.2.0a0+gita51b8df           dev_0    <develop>
[conda] torch-complex             0.4.3                    pypi_0    pypi
[conda] torchaudio                2.2.0a0+4dc06ce           dev_0    <develop>
[conda] torchvision               0.15.2                   pypi_0    pypi

cc @malfet

Labels: bug, module: arm, triaged