multiple values for argument `softmax_scale` #101603

msaroufim · 2023-05-16T19:16:39Z

🐛 Describe the bug

Posting this on behalf of Mosaic

pip install --no-cache-dir --find-links https://download.pytorch.org/whl/torch_stable.html torch==2.0.0+cu117 torchvision torchtext
pip install einops

python attention.py

https://raw.githubusercontent.com/sashaDoubov/llm-foundry/sasha/repro-issue/llmfoundry/models/layers/attention.py

Error logs

Not an error, but I'm seeing this scary warning:


[2023-05-15 23:04:35,681] torch._dynamo.symbolic_convert: [WARNING] /llm-foundry/llmfoundry/models/layers/attention.py <function xformers_attn_fn at 0x7f2a6834ed40> [UnspecializedNNModuleVariable(MultiheadAttention), TensorVariable(), TensorVariable(), TensorVariable(), ConstantVariable(int)] {'softmax_scale': ConstantVariable(float), 'attn_bias': TensorVariable(), 'key_padding_mask': ConstantVariable(NoneType), 'is_causal': ConstantVariable(bool), 'dropout_p': ConstantVariable(float), 'training': ConstantVariable(bool), 'needs_weights': ConstantVariable(bool)} multiple values for argument 'softmax_scale'

[2023-05-15 23:04:35,707] torch._dynamo.symbolic_convert: [WARNING] /llm-foundry/llmfoundry/models/layers/attention.py <function xformers_attn_fn at 0x7f2a6834ed40> [UnspecializedNNModuleVariable(MultiheadAttention), TensorVariable(), TensorVariable(), TensorVariable(), ConstantVariable(int)] {'softmax_scale': ConstantVariable(float), 'attn_bias': TensorVariable(), 'key_padding_mask': ConstantVariable(NoneType), 'is_causal': ConstantVariable(bool), 'dropout_p': ConstantVariable(float), 'training': ConstantVariable(bool), 'needs_weights': ConstantVariable(bool)} multiple values for argument 'softmax_scale'
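
For reference, the message text matches what Python's `inspect.Signature.bind` raises when a parameter receives both a positional and a keyword value. The warning lists five positional values (the `MultiheadAttention` module, three tensors, and an int) alongside a `softmax_scale` keyword. One possible reading is that the module is being bound as an extra leading argument, which would shift the int into the `softmax_scale` slot. The sketch below reproduces the same message in plain Python. The `attn_fn` signature is a hypothetical stand-in, not the real `xformers_attn_fn`, and nothing here shows what Dynamo does internally.

```python
import inspect


# Hypothetical stand-in for an attention function: four positional
# parameters followed by keyword-style options. Not the real signature.
def attn_fn(query, key, value, n_heads, softmax_scale=None, attn_bias=None):
    return n_heads, softmax_scale


sig = inspect.signature(attn_fn)

# Expected call shape: four positional values, softmax_scale by keyword.
ok = sig.bind("q", "k", "v", 8, softmax_scale=0.125)

# Same call with one extra leading positional value (a stand-in for the
# module object). The int 8 now lands in the softmax_scale slot, which
# collides with the softmax_scale keyword.
try:
    sig.bind("module", "q", "k", "v", 8, softmax_scale=0.125)
    message = None
except TypeError as exc:
    message = str(exc)

print(message)
```

If that reading is right, it would also be consistent with the warning disappearing once the `softmax_scale` keyword is removed from the call, since the collision no longer occurs.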

How to remove the error

As a workaround, you can comment out softmax_scale, after which the code works: https://gist.github.com/msaroufim/5fe1a5cf745e31baabeb62b8dce10c82. That's not a real solution, though.

Versions

n

cc @H-Huang @awgu @kwen2501 @wanchaol @fegin @fduwjj @wz337 @wconstab @d4l3k @c-p-i-o @zhaojuanmao @mrshenli @rohan-varma @chauhang @penguinwu @pritamdamania87 @satgera @gqchen @aazzolini @osalpekar @jiayisuse @XilunWu @tianyu-l @yf225 @ezyang @bdhirsh @anijain2305 @zou3519 @kiukchung @LucasLLC @ngimel

msaroufim · 2023-05-16T19:17:01Z

cc @wconstab @sashaDoubov @voznesenskym

anijain2305 · 2023-05-16T20:06:24Z

Cc @mlazos if you have bandwidth

sashaDoubov · 2023-05-16T20:43:21Z

I have also tried this with DDP and don't see those warnings, making this seem specific to FSDP.

Chillee · 2023-05-22T17:36:24Z

Assigning this to @voznesenskym arbitrarily.

rakirs3333 · 2025-05-13T17:07:13Z

Do we still see this happening? I tried running the code but do not see anything.

msaroufim added the oncall: pt2 label May 16, 2023

msaroufim added the module: fsdp label May 18, 2023

Chillee assigned voznesenskym May 22, 2023

Chillee added the triaged label May 22, 2023

penguinwu added the module: distributed label Nov 29, 2023

albanD added oncall: distributed and removed module: distributed labels Dec 8, 2023

yf225 assigned yf225 and unassigned voznesenskym Feb 27, 2024

yf225 removed their assignment Nov 27, 2024

yf225 added the pt2d-triage-nov2024 label Nov 27, 2024

msaroufim closed this as completed May 13, 2025
