Hotfix: Flash Attention 2 support in Pixtral #38146
Merged
Context
Pixtral support for `ALL_ATTENTION_FUNCTIONS` was added in this PR, but a subsequent rebase unintentionally modified the line that sets `attention_mask` to `None` when using Flash Attention 2. Without this condition, using Flash Attention 2 with Pixtral raises the following error:

`RuntimeError: cu_seqlens_q must have shape (batch_size + 1)`

Setting `attention_mask` to `None` resolves the issue. It also appears that the current tests don't catch this case.

cc: @zucchini-nlp, @ArthurZucker