[Intel GPU][quant] Refine zero-point memory creation #148640

ZhiweiYan-96 · 2025-03-06T05:55:34Z

Motivation

This PR skips creating zero-point GPU memory when the zero point is 0, since the oneDNN library would not use it in that case. This can save the overhead of 1 to 3 host-to-device (H2D) copies per QLinear/QConv kernel.
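The idea can be illustrated with a minimal Python sketch. It is not the PR's C++ code and does not use any PyTorch or oneDNN API: `CopyCounter` and `build_zero_point_args` are hypothetical names, and the device copy is only simulated by a counter. It assumes a kernel with up to three zero points (source, weight, destination), which is one reading of the "1 to 3" copies mentioned above.

```python
from dataclasses import dataclass


@dataclass
class CopyCounter:
    """Hypothetical stand-in for a device; counts simulated H2D copies."""
    h2d_copies: int = 0

    def to_device(self, value: int) -> dict:
        # Each call models one host-to-device copy of a scalar zero point.
        self.h2d_copies += 1
        return {"device_value": value}


def build_zero_point_args(zero_points: dict, device: CopyCounter) -> dict:
    """Create device memory only for zero points that are non-zero.

    A zero point of 0 has no effect on (de)quantization, so no device
    memory is created for it and no argument is passed to the kernel.
    """
    args = {}
    for name, zp in zero_points.items():
        if zp == 0:
            continue  # skip allocation and the H2D copy entirely
        args[name] = device.to_device(zp)
    return args


if __name__ == "__main__":
    device = CopyCounter()
    # Symmetric quantization for source and weight, asymmetric for output.
    args = build_zero_point_args({"src": 0, "weight": 0, "dst": 3}, device)
    print(sorted(args), device.h2d_copies)
```

In this sketch, a kernel whose zero points are all 0 triggers no simulated copies at all, while a fully asymmetric one still pays for three.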

Stack from ghstack (oldest at bottom):

-> [Intel GPU][quant] Refine zero-point memory creation #148640

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10

[ghstack-poisoned]

pytorch-bot · 2025-03-06T05:55:38Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/148640

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki or our office hours

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (5 Unrelated Failures)

As of commit 25763ba with merge base e2a0296:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

xpu / linux-jammy-xpu-2025.0-py3.9 / test (default, 1, 4, linux.idc.xpu) (gh) (detected as infra flaky with no log or failing log classifier)
xpu / linux-jammy-xpu-2025.0-py3.9 / test (default, 3, 4, linux.idc.xpu) (gh) (similar failure)
Build left local git repository checkout dirty
xpu / linux-jammy-xpu-2025.0-py3.9 / test (default, 4, 4, linux.idc.xpu) (gh) (similar failure)
Build left local git repository checkout dirty

BROKEN TRUNK - The following job failed but was also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

inductor / cuda12.4-py3.10-gcc9-sm86 / test (inductor_timm, 2, 2, linux.g5.4xlarge.nvidia.gpu) (gh) (trunk failure)
mobilenetv3_large_100

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

trunk / libtorch-linux-focal-cuda12.4-py3.10-gcc9-debug / build (gh) (#148495)
undefined reference to `std::__throw_bad_array_new_length()`

This comment was automatically generated by Dr. CI and updates every 15 minutes.

linux-foundation-easycla · 2025-03-06T05:55:38Z

The committers listed above are authorized under a signed CLA.

✅ login: ZhiweiYan-96 / name: Zhiwei (9f3d4de, 25763ba, f6a621b, 42374fb)

ghstack-source-id: 9d574ce Pull Request resolved: #148640

[ghstack-poisoned]

ghstack-source-id: c74f597 Pull Request resolved: #148640

[ghstack-poisoned]

ghstack-source-id: 546e490 Pull Request resolved: #148640

ZhiweiYan-96 · 2025-03-07T05:30:43Z

@pytorchbot rebase

pytorchmergebot · 2025-03-07T05:32:09Z

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

[ghstack-poisoned]

pytorchmergebot · 2025-03-07T05:32:20Z

Successfully rebased gh/ZhiweiYan-96/53/orig onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via ghstack checkout https://github.com/pytorch/pytorch/pull/148640)

ghstack-source-id: b7d4f40 Pull Request resolved: #148640

EikanWang · 2025-03-07T07:57:57Z

@pytorchbot merge

pytorchmergebot · 2025-03-07T07:59:37Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status
here

ZhiweiYan-96 · 2025-03-07T13:19:38Z

@pytorchbot merge

pytorchmergebot · 2025-03-07T13:19:57Z

The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job was waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
For more information see pytorch-bot wiki.

pytorchmergebot · 2025-03-07T13:21:38Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status
here

Update

42374fb

[ghstack-poisoned]

ZhiweiYan-96 requested review from EikanWang and gujinghui as code owners March 6, 2025 05:55

pytorch-bot bot added the `module: cpu` label (CPU specific problem, e.g., perf, algorithm) Mar 6, 2025

ZhiweiYan-96 added a commit that referenced this pull request Mar 6, 2025

[Intel GPU][quant] Refine zero-point memory creation

f64ede5

ghstack-source-id: 9d574ce Pull Request resolved: #148640

liangan1 approved these changes Mar 6, 2025

ZhiweiYan-96 added the `ciflow/xpu` (Run XPU CI tasks), `ciflow/trunk` (Trigger trunk jobs on your pull request), and `ciflow/inductor` labels Mar 6, 2025

pytorchbot added the open source label Mar 6, 2025

ZhiweiYan-96 marked this pull request as draft March 6, 2025 06:08

ZhiweiYan-96 added this to PyTorch Intel Mar 6, 2025

ZhiweiYan-96 moved this to Pre-Review Required in PyTorch Intel Mar 6, 2025

ZhiweiYan-96 added this to the 2.7.0 milestone Mar 6, 2025

Update

f6a621b

[ghstack-poisoned]

ZhiweiYan-96 added a commit that referenced this pull request Mar 6, 2025

[Intel GPU][quant] Refine zero-point memory creation

2f64fca

ghstack-source-id: c74f597 Pull Request resolved: #148640

ZhiweiYan-96 added the `topic: not user facing` label (topic category) Mar 6, 2025

Update

9f3d4de

[ghstack-poisoned]

ZhiweiYan-96 added a commit that referenced this pull request Mar 6, 2025

[Intel GPU][quant] Refine zero-point memory creation

50ac884

ghstack-source-id: 546e490 Pull Request resolved: #148640

Update

25763ba

[ghstack-poisoned]

pytorchmergebot pushed a commit that referenced this pull request Mar 7, 2025

[Intel GPU][quant] Refine zero-point memory creation

85e954c

ghstack-source-id: b7d4f40 Pull Request resolved: #148640

EikanWang approved these changes Mar 7, 2025

EikanWang marked this pull request as ready for review March 7, 2025 07:57

EikanWang added the `keep-going` label (Don't stop on first failure, keep running tests until the end) Mar 7, 2025

pytorchmergebot added the merging label Mar 7, 2025

pytorchmergebot closed this in 81847d0 Mar 7, 2025

pytorchmergebot added the Merged label Mar 7, 2025

github-project-automation bot moved this from Pre-Review Required to Done in PyTorch Intel Mar 7, 2025

pytorchmergebot removed the merging label Mar 7, 2025

atalman mentioned this pull request Apr 3, 2025

Release 2.7.0 validations checklist and cherry-picks #150628

Closed

65 tasks

github-actions bot deleted the gh/ZhiweiYan-96/53/head branch April 12, 2025 02:10
