Enable AArch64 CI scripts to be used for local dev by jondea · Pull Request #143190 · pytorch/pytorch · GitHub

Enable AArch64 CI scripts to be used for local dev #143190

Open
jondea wants to merge 1 commit into main from aarch64-enable-local-dev-using-ci-scripts

jondea
Contributor
@jondea jondea commented Dec 13, 2024
  • Allow user to specify custom ComputeLibrary directory, which is then built rather than checking out a clean copy
  • Remove setup.py clean in build. The CI environment should already be clean, and removing this step enables incremental rebuilds
  • Use all cores for building ComputeLibrary

Mostly a port of pytorch/builder#2028 with the conda part removed, because aarch64_ci_setup.sh has changed and can now handle being called twice.
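
A minimal sketch of the first bullet, assuming the `ACL_SOURCE_DIR` environment variable that appears later in this PR. The `resolve_acl_dir` helper and its return shape are hypothetical and only illustrate the idea: a user-supplied directory is built as-is, while the default directory still gets a clean checkout.

```python
import os


def resolve_acl_dir(env=None, default="ComputeLibrary"):
    """Return (directory, needs_checkout) for the ComputeLibrary sources.

    Hypothetical helper: if ACL_SOURCE_DIR is set, build that directory
    in place; otherwise fall back to a fresh checkout in `default`.
    """
    env = os.environ if env is None else env
    custom = env.get("ACL_SOURCE_DIR")
    if custom:
        # User-provided sources: skip the clean checkout.
        return custom, False
    # No override: CI behaviour is unchanged.
    return default, True
```

With no variable set, this returns the default directory and requests a checkout, so CI behaviour would stay the same.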

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @jerryzh168 @malfet @snadampal @milpuz01 @aditew01 @nikhil-arm @fadara01

@jondea jondea requested a review from a team as a code owner December 13, 2024 11:49
pytorch-bot bot commented Dec 13, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/143190

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 1 Unrelated Failure

As of commit 82e4c50 with merge base 7482eb2 (image):

NEW FAILURE - The following job has failed:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@soulitzer soulitzer added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Dec 17, 2024
@jondea jondea force-pushed the aarch64-enable-local-dev-using-ci-scripts branch from bdb74e2 to f424c67 Compare January 3, 2025 10:44
@jondea
Contributor Author
jondea commented Jan 3, 2025

@pytorchbot label "topic: not user facing"

@pytorch-bot pytorch-bot bot added the topic: not user facing topic category label Jan 3, 2025
@jondea jondea force-pushed the aarch64-enable-local-dev-using-ci-scripts branch from f424c67 to afded46 Compare February 25, 2025 14:41
@jondea
Contributor Author
jondea commented Feb 25, 2025

Can I get a review on this please?

@annop-w
Contributor
annop-w commented Feb 28, 2025

@pytorchbot label "module: arm"

@pytorch-bot pytorch-bot bot added the module: arm Related to ARM architectures builds of PyTorch. Includes Apple M1 label Feb 28, 2025
Contributor

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
Stale pull requests will automatically be closed after 30 days of inactivity.

@github-actions github-actions bot added the Stale label Apr 29, 2025
@fadara01
Collaborator

@pytorchbot label "ciflow/linux-aarch64"

@pytorch-bot pytorch-bot bot added the ciflow/linux-aarch64 linux aarch64 CI workflow label Apr 30, 2025
pytorch-bot bot commented Apr 30, 2025

To add the ciflow label ciflow/linux-aarch64 please first approve the workflows that are awaiting approval (scroll to the bottom of this page).

This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows.

@pytorch-bot pytorch-bot bot removed the ciflow/linux-aarch64 linux aarch64 CI workflow label Apr 30, 2025
@fadara01 fadara01 added the module: cpu CPU specific problem (e.g., perf, algorithm) label Apr 30, 2025
@fadara01
Collaborator

@malfet - could you please approve the CI? We don't have enough permissions to approve it.

@aditew01
Collaborator

@pytorchbot rebase

@pytorchmergebot
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot
Collaborator

Successfully rebased aarch64-enable-local-dev-using-ci-scripts onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout aarch64-enable-local-dev-using-ci-scripts && git pull --rebase)

@pytorchmergebot pytorchmergebot force-pushed the aarch64-enable-local-dev-using-ci-scripts branch from afded46 to 6e5628b Compare April 30, 2025 10:26
@fadara01
Collaborator

@pytorchbot label "ciflow/linux-aarch64"

@pytorch-bot pytorch-bot bot added the ciflow/linux-aarch64 linux aarch64 CI workflow label Apr 30, 2025
@fadara01
Collaborator

Oh I now see the "Enable CI Workflow" button after the rebase, thanks @aditew01!

Collaborator
@fadara01 fadara01 left a comment

Generally LGTM, thank you!
This should improve dev workflow for devs trying to perfectly mirror pytorch's manylinux builds, without changing behavior in CI.

... just the minor question about the need for ACL_SOURCE_DIR

"--shallow-submodules",
]
)
acl_checkout_dir = os.getenv("ACL_SOURCE_DIR", "ComputeLibrary")
Collaborator

I don't see why the ACL_SOURCE_DIR is needed here.
This script is meant to be run inside the Docker container, and one can make it pick up a custom local ACL directory by mounting that directory into the container (along with the other changes introduced in this PR).

For development outside docker, isn't ACL_ROOT_DIR in setup.py here enough?
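
To illustrate the mounting approach described above, here is a rough sketch that assembles a `docker run` command with a bind mount (`-v host:container`). The image name and the `/ComputeLibrary` container path are placeholders chosen for illustration, not values taken from the PyTorch CI configuration.

```python
import os


def docker_run_args(acl_dir, image, container_path="/ComputeLibrary"):
    """Build a `docker run` argv that bind-mounts a local ACL checkout.

    `image` and `container_path` are hypothetical placeholders; the real
    CI image and the path the build script expects may differ.
    """
    host_path = os.path.abspath(acl_dir)
    return [
        "docker", "run", "--rm",
        "-v", f"{host_path}:{container_path}",  # bind mount host dir into container
        image,
    ]
```

The returned list can be passed to `subprocess.run` once the placeholders are replaced with real values.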

@malfet
Contributor
malfet commented May 15, 2025

@pytorchbot rebase

@pytorchmergebot
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

- Allow user to specify a custom Arm Compute Library directory with the
  ACL_SOURCE_DIR environment variable, which is then built rather than
  checking out a clean copy
- Remove `setup.py clean` in build. The CI environment should be clean
  already, removing this enables incremental rebuilds
- Use all cores for building ComputeLibrary
- Remove restriction of building with MAX_JOBS=5 on CPU backend

Mostly a port of pytorch/builder#2028 with the
conda part removed, because aarch64_ci_setup.sh has changed and can now
handle being called twice.

Co-authored-by: David Svantesson-Yeung <David.Svantesson-Yeung@arm.com>
Co-authored-by: Fadi Arafeh <Fadi.Arafeh@arm.com>
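
As a sketch of the "use all cores" point in the commit message above: the job count can be derived from `os.cpu_count()` and passed to SCons through its `-j` flag. The helper names are hypothetical, and whether the CI script honours `MAX_JOBS` for the ComputeLibrary step is an assumption made here for illustration.

```python
import os


def parallel_jobs(env=None):
    """Pick a build job count (hypothetical helper).

    Assumption: an explicit MAX_JOBS overrides the detected core count.
    """
    env = os.environ if env is None else env
    override = env.get("MAX_JOBS")
    if override:
        return max(1, int(override))
    # os.cpu_count() may return None on unusual platforms.
    return os.cpu_count() or 1


def scons_command(jobs):
    """Return a minimal SCons argv using `jobs` parallel jobs."""
    return ["scons", f"-j{jobs}"]
```

A real ComputeLibrary build would pass further SCons options, which are omitted here.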
@pytorchmergebot
Collaborator

Successfully rebased aarch64-enable-local-dev-using-ci-scripts onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout aarch64-enable-local-dev-using-ci-scripts && git pull --rebase)

@pytorchmergebot pytorchmergebot force-pushed the aarch64-enable-local-dev-using-ci-scripts branch from 6e5628b to 82e4c50 Compare May 15, 2025 17:23
@pytorch-bot pytorch-bot bot removed the ciflow/linux-aarch64 linux aarch64 CI workflow label May 15, 2025
Labels
  • module: arm (Related to ARM architectures builds of PyTorch. Includes Apple M1)
  • module: cpu (CPU specific problem (e.g., perf, algorithm))
  • open source
  • Stale
  • topic: not user facing (topic category)
  • triaged (This issue has been looked at a team member, and triaged and prioritized into an appropriate module)