
[FlexAttention] Fix device test instantation #151846


Closed
drisspg wants to merge 17 commits

Conversation

drisspg (Contributor) commented Apr 21, 2025

[ghstack-poisoned]
pytorch-bot (bot) commented Apr 21, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/151846

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 5 Pending

As of commit 7703789 with merge base cc793e8:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

drisspg added a commit that referenced this pull request Apr 21, 2025
ghstack-source-id: d7b5533
Pull Request resolved: #151846
cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy chenyang78 kadeng muchulee8 amjames chauhang aakhundov

[ghstack-poisoned]
drisspg added a commit that referenced this pull request Apr 21, 2025
ghstack-source-id: 21f6341
Pull Request resolved: #151846
[ghstack-poisoned]
drisspg added a commit that referenced this pull request Apr 22, 2025
ghstack-source-id: fa73de9
Pull Request resolved: #151846
drisspg added 3 commits April 21, 2025 18:59
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
B = 4
H = 8
S = 2048
B = 2
drisspg (Contributor, Author), on the reassignment of B above:

purposeful
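For context, constants like these parametrize the query/key/value shapes used throughout the FlexAttention tests. A minimal sketch of how such shape constants typically feed a flex_attention call; the head dimension D, the device, and the score_mod here are illustrative assumptions, not taken from this diff:

import torch
from torch.nn.attention.flex_attention import flex_attention

B, H, S, D = 4, 8, 2048, 64  # batch, heads, sequence length; D is assumed

# flex_attention expects query/key/value in (B, H, S, D) layout.
q, k, v = (torch.randn(B, H, S, D, device="cuda", dtype=torch.float16)
           for _ in range(3))

def causal(score, b, h, q_idx, kv_idx):
    # score_mod callback: push future positions to -inf.
    return torch.where(q_idx >= kv_idx, score, float("-inf"))

out = flex_attention(q, k, v, score_mod=causal)  # -> (B, H, S, D)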

drisspg added 3 commits April 21, 2025 21:14
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
Divigroup-RAP pushed a commit to Divigroup-RAP/PYTORCH that referenced this pull request Apr 22, 2025
drisspg added 3 commits April 22, 2025 10:44
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
@drisspg drisspg requested a review from BoyuanFeng April 22, 2025 19:23
drisspg added 2 commits April 22, 2025 14:36
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
@@ -90,6 +94,18 @@ def temp_float32_matmul_precision(precision: str):
     torch.set_float32_matmul_precision(original_precision)


 def skip_on_cpu(test_func):
     """Decorator to skip tests that are not supported on CPU."""
     decorated_func = skipCPUIf(True, "Not supported on CUDA")(test_func)


Suggested change
-    decorated_func = skipCPUIf(True, "Not supported on CUDA")(test_func)
+    decorated_func = skipCPUIf(True, "Not supported on CPU")(test_func)

drisspg (Contributor, Author)

There are a lot of flaky test failures, so I will land this in a follow-up, but good catch.
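For reference, a self-contained sketch of the helper with the suggested wording applied. skipCPUIf comes from torch.testing._internal.common_device_type; the return statement is an assumption filling in the elided diff lines:

from torch.testing._internal.common_device_type import skipCPUIf

def skip_on_cpu(test_func):
    """Decorator to skip tests that are not supported on CPU."""
    # skipCPUIf(condition, reason) only takes effect on the CPU
    # instantiation of a device-generic test; other devices run as usual.
    decorated_func = skipCPUIf(True, "Not supported on CPU")(test_func)
    return decorated_func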

pytorchmergebot (Collaborator)

Merge failed

Reason: Command git -C /home/runner/work/pytorch/pytorch rebase origin/main returned non-zero exit code 1

Rebasing (1/1)
Auto-merging test/inductor/test_flex_attention.py
CONFLICT (content): Merge conflict in test/inductor/test_flex_attention.py
error: could not apply 1e5673aa024... [FlexAttention] Fix device test instantation (#151846)
hint: Resolve all conflicts manually, mark them as resolved with
hint: "git add/rm <conflicted_files>", then run "git rebase --continue".
hint: You can instead skip this commit: run "git rebase --skip".
hint: To abort and get back to the state before "git rebase", run "git rebase --abort".
hint: Disable this message with "git config set advice.mergeConflict false"
Could not apply 1e5673aa024... [FlexAttention] Fix device test instantation (#151846)
Raised by workflow job

[ghstack-poisoned]
drisspg (Contributor, Author) commented Apr 23, 2025

@pytorchbot merge -f "I ran the full CI everything was green and last minute merge conflict"

pytorchmergebot (Collaborator)

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as a last resort and instead consider -i/--ignore-current to continue the merge while ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here.

jithunnair-amd (Collaborator) commented Apr 23, 2025

@drisspg Looks like this PR broke at least the following test in rocm workflow: inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_GQA_score_mod5_cuda_float16

https://hud.pytorch.org/hud/pytorch/pytorch/21b0ef520d651ed67f6978ac37c8a8a4093819ee/1?per_page=50&name_filter=rocm%20%2F&mergeEphemeralLF=true

I've added the ciflow/rocm label on this PR to surface the failure.

@pytorchbot revert -c nosignal -m "PR broke rocm workflow"

cc @huydhn
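As background on the test name above: PyTorch's device-generic test classes are stamped out per device and per dtype by instantiate_device_type_tests, presumably the machinery this PR's title refers to, and that is what produces names like TestFlexAttentionCUDA::test_GQA_score_mod5_cuda_float16. A minimal sketch with an illustrative test body (the real suite lives in test/inductor/test_flex_attention.py):

import torch
from torch.testing._internal.common_device_type import (
    dtypes,
    instantiate_device_type_tests,
)
from torch.testing._internal.common_utils import TestCase, run_tests

class TestFlexAttention(TestCase):
    @dtypes(torch.float16)
    def test_GQA(self, device, dtype):
        # Illustrative body only; the real tests compare flex_attention
        # against a reference implementation on GQA-shaped inputs.
        x = torch.ones(2, device=device, dtype=dtype)
        self.assertEqual(x.sum().item(), 2.0)

# Creates TestFlexAttentionCPU, TestFlexAttentionCUDA, ... in this module;
# generated test names carry device and dtype suffixes, e.g.
# TestFlexAttentionCUDA::test_GQA_cuda_float16.
instantiate_device_type_tests(TestFlexAttention, globals())

if __name__ == "__main__":
    run_tests()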

pytorchmergebot (Collaborator)

@pytorchbot successfully started a revert job. Check the current status here.
Questions? Feedback? Please reach out to the PyTorch DevX Team

pytorchmergebot (Collaborator)

Don't want to revert based on edited command

jithunnair-amd (Collaborator)

Trying again, since the edit to the comment seemed to displease the bot:

@drisspg Looks like this PR broke at least the following test in rocm workflow: inductor/test_flex_attention.py::TestFlexAttentionCUDA::test_GQA_score_mod5_cuda_float16

https://hud.pytorch.org/hud/pytorch/pytorch/21b0ef520d651ed67f6978ac37c8a8a4093819ee/1?per_page=50&name_filter=rocm%20%2F&mergeEphemeralLF=true

I've added the ciflow/rocm label on this PR to surface the failure.

@pytorchbot revert -c nosignal -m "PR broke rocm workflow"

cc @huydhn

jithunnair-amd added the ciflow/rocm label Apr 23, 2025
pytorchmergebot (Collaborator)

@pytorchbot successfully started a revert job. Check the current status here.
Questions? Feedback? Please reach out to the PyTorch DevX Team

pytorchmergebot added a commit that referenced this pull request Apr 23, 2025
This reverts commit b37fa20.

Reverted #151846 on behalf of https://github.com/jithunnair-amd due to "PR broke rocm workflow"
pytorchmergebot (Collaborator)

@drisspg your PR has been successfully reverted.

pytorchmergebot added the Reverted and ci-no-td labels Apr 23, 2025
drisspg (Contributor, Author) commented Apr 23, 2025

@jithunnair-amd what is the failure?

[ghstack-poisoned]
pytorchmergebot (Collaborator)

Starting merge as part of PR stack under #151959

pytorchmergebot pushed a commit that referenced this pull request Apr 24, 2025
wangkuiyi pushed a commit to wangkuiyi/pytorch that referenced this pull request Apr 25, 2025
Labels
ci-no-td, ciflow/inductor, ciflow/rocm, ciflow/trunk, Merged, module: inductor, Reverted, topic: not user facing
7 participants