Fix MQA V2 #2388

laclouis5 · 2025-01-01T14:43:13Z

This PR fixes the MultiQueryAttentionV2 module, which had several issues:

It did not scale q @ k,
the output transpose was missing,
the output reshape used the input dim instead of the output_dim.
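
For readers who want to see what the three fixes amount to, here is a minimal, framework-free sketch of multi-query attention for a single sequence. It is not the module's actual code: the function names, argument layout, and the use of plain Python lists are assumptions made for illustration, and the learned projections are left out. It only shows the `1/sqrt(key_dim)` scaling of `q @ k`, the move of the token axis ahead of the head axis, and the merge of heads into `num_heads * value_dim` output features.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def multi_query_attention(q, k, v):
    """Illustrative multi-query attention (hypothetical helper, no projections).

    q: [num_heads][n_query][key_dim]  one set of queries per head
    k: [n_kv][key_dim]                keys shared by all heads
    v: [n_kv][value_dim]              values shared by all heads
    Returns [n_query][num_heads * value_dim].
    """
    key_dim = len(k[0])
    value_dim = len(v[0])
    scale = key_dim ** -0.5  # fix 1: scale q @ k^T by 1/sqrt(key_dim)

    per_head = []  # layout: [num_heads][n_query][value_dim]
    for q_head in q:
        head_out = []
        for q_vec in q_head:
            scores = [
                scale * sum(a * b for a, b in zip(q_vec, k_vec))
                for k_vec in k
            ]
            weights = softmax(scores)
            head_out.append([
                sum(w * v_vec[j] for w, v_vec in zip(weights, v))
                for j in range(value_dim)
            ])
        per_head.append(head_out)

    # fixes 2 and 3: put the token axis first (the "transpose"), then
    # concatenate heads so each token carries num_heads * value_dim
    # features (the "reshape"), independent of the input dimension.
    n_query = len(q[0])
    return [
        [x for head_out in per_head for x in head_out[i]]
        for i in range(n_query)
    ]


# Usage: 2 heads, 1 query token, key_dim 2, value_dim 3.
q = [[[1.0, 0.0]], [[0.0, 1.0]]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
out = multi_query_attention(q, k, v)
print(len(out), len(out[0]))
```

The output width depends only on the number of heads and the value dimension, which is why reshaping with the input dim gives a wrong shape whenever the two differ.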

anukaal

looks good

HuggingFaceDocBuilderDev · 2025-01-01T20:36:30Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Fix MQA V2 scale and out shape

6171e75

anukaal approved these changes Jan 1, 2025

Merge branch 'main' into fix-mqa-v2

2d5277e

rwightman merged commit d23facd into huggingface:main Jan 2, 2025
22 checks passed

4 participants
