[ARM] Enable some additional Aarch64 unit tests by robert-hardwick · Pull Request #146895 · pytorch/pytorch · GitHub


[ARM] Enable some additional Aarch64 unit tests #146895


Closed

Conversation

robert-hardwick
Collaborator
@robert-hardwick robert-hardwick commented Feb 11, 2025

This PR adds some tests to the Aarch64 CI, notably nn/test_convolution and inductor/test_fused_attention.

The reason is that the oneDNN 3.7 upgrade (#138889) introduces some additional regression test failures that are currently not visible, because the affected tests are not enabled in CI.

I have marked test_ConvTranspose2d_output_size_downsample_upsample as skipped (#146857) due to a segmentation fault, but the priority is to get visibility on the new oneDNN 3.7 test failures.

cc @malfet @snadampal @milpuz01 @aditew01 @nikhil-arm @fadara01 @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov @yf225
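
For readers unfamiliar with how a test is typically marked as skipped on one architecture, here is a minimal sketch using only the Python standard library. It is not the change made in this PR: the class name, the placeholder test body, and the `IS_AARCH64` flag are assumptions introduced for illustration.

```python
import platform
import unittest

# Assumption: the host reports "aarch64" (Linux) or "arm64" (macOS).
IS_AARCH64 = platform.machine().lower() in ("aarch64", "arm64")


class ConvTransposeSketch(unittest.TestCase):
    # Hypothetical stand-in for the real test; the body is a placeholder.
    @unittest.skipIf(IS_AARCH64, "Segfaults on Aarch64, see #146857")
    def test_output_size_downsample_upsample(self):
        self.assertEqual(2 * 4, 8)


def run_sketch() -> unittest.TestResult:
    """Run the sketch test case and return the collected result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ConvTransposeSketch)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

A skip is the safer choice for a segmentation fault, since the test body never runs and so cannot take down the rest of the test process.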

@robert-hardwick robert-hardwick requested a review from a team as a code owner February 11, 2025 13:36
pytorch-bot bot commented Feb 11, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/146895

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 04822a5 with merge base b004228:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@robert-hardwick
Collaborator Author

@pytorchbot label "topic: not user facing"

@pytorch-bot pytorch-bot bot added the topic: not user facing topic category label Feb 11, 2025
@zou3519 zou3519 added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Feb 11, 2025
@robert-hardwick
Collaborator Author
robert-hardwick commented Feb 11, 2025

@malfet this PR enables nn/test_convolution, inductor/test_fused_attention and inductor/test_cpu_select_algorithm. I have added skip or xfail to 3 tests which have been failing for some time (i.e. they are not regressions), with an issue raised for each one.

We urgently need this PR in order to expose some important unit test regressions with the oneDNN 3.7 upgrade (#138889).
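
As a rough illustration of the xfail option mentioned above, the sketch below applies the standard library's `unittest.expectedFailure` conditionally. The helper `xfail_if`, the factory `make_case`, and the placeholder assertion are hypothetical names made up for this example; they are not taken from the PR or from PyTorch's test utilities.

```python
import unittest


def xfail_if(condition: bool):
    """Apply unittest.expectedFailure only when `condition` is true."""
    def decorator(fn):
        return unittest.expectedFailure(fn) if condition else fn
    return decorator


def make_case(on_affected_platform: bool):
    """Build a test case that fails only on the (simulated) affected platform."""
    class KnownFailureSketch(unittest.TestCase):
        @xfail_if(on_affected_platform)
        def test_known_failure(self):
            # Placeholder for a long-standing, platform-specific failure.
            self.assertFalse(on_affected_platform)
    return KnownFailureSketch


def run_case(case_cls) -> unittest.TestResult:
    """Run one test case class and return the collected result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case_cls)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Unlike a skip, an expected failure still runs the test. If the underlying bug gets fixed, the test is reported as an unexpected success and the run is no longer considered successful, which prompts removal of the marker.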

@robert-hardwick robert-hardwick changed the title Enable some additional Aarch64 unit tests [ARM] Enable some additional Aarch64 unit tests Feb 11, 2025
@robert-hardwick
Collaborator Author

@pytorchbot label "module: arm"

@pytorch-bot pytorch-bot bot added the module: arm Related to ARM architectures builds of PyTorch. Includes Apple M1 label Feb 11, 2025
Adds inductor/test_fused_attention, inductor/test_cpu_select_algorithm
and nn/test_convolution. Skip/Xfail some tests with issues linked.
@robert-hardwick
Collaborator Author

I have force-pushed in order to fix the merge conflict with a rebase.

@robert-hardwick
Collaborator Author

@yanbing-j can you approve again? Not sure if my force push cancelled the CI approval.

@fadara01
Collaborator

@pytorchbot label "ciflow/linux-aarch64"

@pytorch-bot pytorch-bot bot added the ciflow/linux-aarch64 linux aarch64 CI workflow label Feb 12, 2025
pytorch-bot bot commented Feb 12, 2025

To add the ciflow label ciflow/linux-aarch64 please first approve the workflows that are awaiting approval (scroll to the bottom of this page).

This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows.

@pytorch-bot pytorch-bot bot removed the ciflow/linux-aarch64 linux aarch64 CI workflow label Feb 12, 2025
@fadara01
Collaborator

@pytorchbot label "ciflow/linux-aarch64"

@pytorch-bot pytorch-bot bot added the ciflow/linux-aarch64 linux aarch64 CI workflow label Feb 12, 2025
@robert-hardwick
Collaborator Author
robert-hardwick commented Feb 12, 2025

There's still a failing test in inductor/test_cpu_select_algorithm.py (it looks like it fails on neoverse-n1 only).

It could be a simple fix. If it's not obvious, I will mark the test as skipped and raise an issue so we don't hold up this PR too long.

torch._inductor.exc.InductorError: LoweringException: RuntimeError: self and mat2 must have the same dtype, but got Float and BFloat16

@robert-hardwick
Collaborator Author

@fadara01 @yanbing-j could one of you approve the workflow again? Thanks

@fadara01
Collaborator
fadara01 commented Feb 13, 2025

> @fadara01 @yanbing-j could one of you approve the workflow again? Thanks

Done!

@robert-hardwick
Collaborator Author

@pytorchbot merge

pytorch-bot bot commented Feb 13, 2025

This PR needs to be approved by an authorized maintainer before merge.

@robert-hardwick
Collaborator Author

@malfet approval needed on this when you get a moment

@robert-hardwick
Collaborator Author

@pytorchbot label "arm priority"

@fadara01
Collaborator
fadara01 commented Feb 15, 2025

Let's hold back on merging this, since the Arm Compute Library (ACL) version used for the CI jammy docker image is v24.04, while the one used in manylinux is v24.09.
This means that we can't fully trust either the failing or the passing tests in the CI run above.
I'll mark this as draft for now. Let's revisit it once #138889, which unifies the ACL version in all the build scripts, is merged.
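
The kind of mismatch described above could be caught with a small consistency check along these lines. This is only a sketch: the function names and the keys of the `pins` dictionary are hypothetical labels, and only the two version strings come from the comment.

```python
from collections import defaultdict


def group_by_version(pins: dict[str, str]) -> dict[str, list[str]]:
    """Group build environments by the library version each one pins."""
    groups: dict[str, list[str]] = defaultdict(list)
    for source, version in pins.items():
        groups[version].append(source)
    return dict(groups)


def versions_consistent(pins: dict[str, str]) -> bool:
    """Return True when every environment pins the same version."""
    return len(group_by_version(pins)) <= 1


# Hypothetical labels for the two environments mentioned in the comment.
pins = {"ci-jammy-docker": "v24.04", "manylinux": "v24.09"}
```

With the values quoted in the comment, `versions_consistent(pins)` is false, and `group_by_version(pins)` shows which environment uses which version.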

@fadara01 fadara01 changed the title [ARM] Enable some additional Aarch64 unit tests [DO NOT MERGE] [ARM] Enable some additional Aarch64 unit tests Feb 15, 2025
@nikhil-arm
Collaborator

Hello @robert-hardwick , I believe we can avoid merging this for now. This needs to be addressed internally first. I will mark the PR as draft for now.

@nikhil-arm nikhil-arm marked this pull request as draft February 15, 2025 12:29
@nikhil-arm nikhil-arm self-assigned this Feb 15, 2025
@nikhil-arm nikhil-arm changed the title [DO NOT MERGE] [ARM] Enable some additional Aarch64 unit tests [ARM] Enable some additional Aarch64 unit tests Feb 15, 2025
@robert-hardwick
Collaborator Author

> Hello @robert-hardwick , I believe we can avoid merging this for now. This needs to be addressed internally first. I will mark the PR as draft for now.

Agreed. Let's get the ACL versions in the CI synchronised first. Since #144992 has been reverted, this is a lower priority now.

Thanks @nikhil-arm

Contributor

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
Stale pull requests will automatically be closed after 30 days of inactivity.

@github-actions github-actions bot added the Stale label Apr 18, 2025
@github-actions github-actions bot closed this May 18, 2025
Labels
ciflow/inductor
ciflow/linux-aarch64 (linux aarch64 CI workflow)
module: arm (Related to ARM architectures builds of PyTorch. Includes Apple M1)
module: inductor
open source
Stale
topic: not user facing (topic category)
triaged (This issue has been looked at a team member, and triaged and prioritized into an appropriate module)
5 participants