triggered internal assert in matmul #153172
Labels: module: cuda, module: CUDACachingAllocator, needs reproduction, triaged
🐛 Describe the bug
The internal assert is triggered by a matmul call in the Hugging Face Transformers Qwen2 model:
https://github.com/huggingface/transformers/blob/d23aae2b8c8738a12ab1b6710e60ae5866beaf9d/src/transformers/models/qwen2/modeling_qwen2.py#L116
I apologise in advance: the tensors involved are quite large, so it would be difficult to include them here.
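One way to make the failure shareable without pasting the tensors is to serialize the exact matmul operands right before the failing call and attach the file. A minimal sketch (shapes and names are illustrative, not the real ones; NumPy's compressed format is used as a stand-in, but `torch.save` on the `.cpu()` copies of the CUDA tensors works the same way):

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-ins for the actual matmul operands; the shapes here
# are illustrative only, not the ones from the real failure.
a = np.random.rand(4, 8, 16).astype(np.float32)
b = np.random.rand(4, 16, 8).astype(np.float32)

# Dump both operands into a single compressed archive that can be
# attached to the issue for reproduction.
path = os.path.join(tempfile.gettempdir(), "matmul_repro.npz")
np.savez_compressed(path, a=a, b=b)

# Anyone with the archive can reload it and re-run the failing op.
loaded = np.load(path)
out = np.matmul(loaded["a"], loaded["b"])
print(out.shape)  # (4, 8, 8)
```

With the real tensors captured this way, the bug can be retried in isolation with a few lines instead of the full Transformers forward pass.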
Versions
Relevant packages from `pip freeze`:
[pip3] numpy==2.2.5
[pip3] nvidia-cublas-cu12==12.6.4.1
[pip3] nvidia-cuda-cupti-cu12==12.6.80
[pip3] nvidia-cuda-nvrtc-cu12==12.6.77
[pip3] nvidia-cuda-runtime-cu12==12.6.77
[pip3] nvidia-cudnn-cu12==9.5.1.17
[pip3] nvidia-cufft-cu12==11.3.0.4
[pip3] nvidia-curand-cu12==10.3.7.77
[pip3] nvidia-cusolver-cu12==11.7.1.2
[pip3] nvidia-cusparse-cu12==12.5.4.2
[pip3] nvidia-cusparselt-cu12==0.6.3
[pip3] nvidia-nccl-cu12==2.26.2
[pip3] nvidia-nvjitlink-cu12==12.6.85
[pip3] nvidia-nvtx-cu12==12.6.77
[pip3] torch==2.7.0
[pip3] triton==3.3.0
[conda] Could not collect
cc @ptrblck @msaroufim @eqy @jerryzh168