Enable the AMP precision with freezing for CPU nightly test #152298
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/152298
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (2 unrelated failures)
As of commit a943926 with merge base 190f76f:
FLAKY - The following jobs failed but were likely due to flakiness present on trunk.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@pytorchbot rebase
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here.
Successfully rebased 57c82f1 to 0c46eb2.
@pytorchbot rebase
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here.
Successfully rebased 6e2734b to 1f901d9.
Hi, @desertfire. I have updated this PR to address your comments. Could you please help review it?
name: linux-jammy-cpu-py3.9-gcc11-inductor
uses: ./.github/workflows/_linux-test.yml
needs: linux-jammy-cpu-py3_9-gcc11-inductor-build
if: github.event.schedule == '0 7 * * *' || github.event_name == 'pull_request'
This means we will test with freezing both ON and OFF every night. Is this really what we want here?
Yes, I remember you mentioned that we need to keep the default test. Maybe we can just run a few tests with freezing off?
Did you trigger a dashboard run? Do you have a dashboard link that shows how the data will be displayed?
Hi @desertfire. I checked the dashboard page for the performance data related to freezing using the following link, setting the time range to the last 14 days, precision to AMP, and device to CPU (x86).
It appears that the freezing test configuration is already included in the dashboard, but no performance data is being displayed.
OK, I see two problems here:
- The landing page of the OSS dashboard defaults to bf16 for inference. If a user selects x86 as the inference backend, they will see an empty page and think the system is down.
- If you think freezing should be the default, we should avoid testing the non-freezing configuration to cut the hardware expense. Also, the job names with freezing on are too long and should be shortened.
UI fixes need to be done in https://github.com/pytorch/test-infra.
Hi, @desertfire. Now I have removed the non-freezing test, and the test results in the HUD are as follows. Would it make sense to update the CI tests in this PR first, and address the dashboard changes in a follow-up?
Hi @huydhn, do you have any insights on the dashboard update? We want to switch the CPU nightly test to the freezing mode.
Sorry for not seeing your message earlier! It's OK to land this first and update the dashboard later.
cc @yangw-dev
@@ -1,6 +1,9 @@
name: inductor-perf-nightly-x86

on:
  pull_request:
    paths:
      - .github/workflows/inductor-perf-test-nightly-x86.yml
Nit: this is not really necessary, since the file will only be updated occasionally, and you can always trigger a run manually when needed.
I see. I added this code so that the PR can trigger the nightly test. I will remove it later.
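For reference, manual runs can also be enabled with a workflow_dispatch trigger instead of a temporary pull_request trigger. A minimal sketch, illustrative only and not part of this PR:

on:
  schedule:
    - cron: '0 7 * * *'
  # workflow_dispatch lets the run be started manually from the GitHub Actions UI
  # or via the API, so no pull_request trigger is needed just to exercise the job.
  workflow_dispatch: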
@pytorchbot rebase
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here.
Successfully rebased 4a995dc to 3ed5831.
name: linux-jammy-cpu-py3.9-gcc11-inductor
uses: ./.github/workflows/_linux-test.yml
needs: linux-jammy-cpu-py3_9-gcc11-inductor-build
- if: github.event.schedule == '0 7 * * *'
+ if: github.event.schedule == '0 7 * * *' || github.event_name == 'pull_request'
I assume that you will remove the || github.event_name == 'pull_request' part before landing this?
Suggested change:
- if: github.event.schedule == '0 7 * * *' || github.event_name == 'pull_request'
+ if: github.event.schedule == '0 7 * * *'
I assume that you will remove the || github.event_name == 'pull_request' part before landing this?

Sure. I have removed the PR trigger for this job.
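With the pull_request condition removed, the job is gated only by the nightly schedule. A minimal sketch of how the pieces fit together, assuming the workflow's on: block defines a matching cron entry (the workflow and job names here are illustrative, not the exact file contents):

name: inductor-perf-nightly-x86

on:
  schedule:
    # When a scheduled run fires, github.event.schedule carries this cron string,
    # so the job-level `if` below matches only the 7:00 UTC nightly trigger.
    - cron: '0 7 * * *'

jobs:
  # The corresponding -build job this test job depends on is omitted from the sketch.
  linux-jammy-cpu-py3_9-gcc11-inductor-test:
    name: linux-jammy-cpu-py3.9-gcc11-inductor
    uses: ./.github/workflows/_linux-test.yml
    needs: linux-jammy-cpu-py3_9-gcc11-inductor-build
    # Run only when the workflow was started by the nightly schedule.
    if: github.event.schedule == '0 7 * * *'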
Please clean up the pull_request trigger before landing. I could help with the dashboard update to use AMP by default for the CPU benchmark.
@pytorchmergebot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Hi, @desertfire. Since we recommend that users use AMP precision and run with --freezing for CPU x86 Inductor inference, we suggest adding the AMP freezing test to the CPU nightly tests.
cc @chuanqi129 @zxd1997066