8000 update xeon launch script for Intel(R) Xeon 6 support by jingxu10 · Pull Request #133835 · pytorch/pytorch · GitHub
[go: up one dir, main page]

Skip to content

Conversation

@jingxu10
8000 Copy link
Contributor
@jingxu10 jingxu10 commented Aug 19, 2024

Add support to Intel(R) Xeon 6 (GNR) and Core CPUs with E and P cores.

  1. Add a shortcut exec torch-xeon-launcher to python -m torch.backends.xeon.run_cpu. Both commands work.
  2. Simplified launcher script arguments. Keeps deprecated arguments working with deprecation warning messages for backward compatibility.
  3. Restructured CPU information collection methodology. Modularize CPU topology detection and instance-cores mapping algorithm to support GNR and Intel Core platforms' E and P cores. The new implementation is more robust.

  1. The BC Lint error is a false positive.
  2. Hi @albanD @seemethere, this PR contains a restructured implementation of the xeon launch script, thus cannot be split. The PR Size Check doesn't apply.

@pytorch-bot
Copy link
pytorch-bot bot commented Aug 19, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/133835

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures

As of commit ca03c05 with merge base a777dea (image):

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@jingxu10 jingxu10 force-pushed the jingxu10/xeon_launch branch from f693c87 to 1cdb6a2 Compare August 19, 2024 07:22
@jingxu10 jingxu10 marked this pull request as draft August 19, 2024 08:47
@jingxu10 jingxu10 force-pushed the jingxu10/xeon_launch branch 2 times, most recently from 6369d4f to 99de183 Compare August 19, 2024 10:27
@jingxu10 jingxu10 marked this pull request as ready for review August 19, 2024 10:28
@jingxu10 jingxu10 force-pushed the jingxu10/xeon_launch branch 3 times, most recently from 5bc6c70 to 48e382b Compare August 20, 2024 23:13
@soulitzer soulitzer requested review from albanD and seemethere August 21, 2024 00:03
@soulitzer soulitzer added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Aug 21, 2024
@jingxu10 jingxu10 force-pushed the jingxu10/xeon_launch branch 4 times, most recently from 0440dba to 51158c8 Compare August 22, 2024 00:09
@jingxu10
Copy link
Contributor Author

Hi @malfet @albanD @kiukchung Any suggestions how should we step forward? Thank you.

@albanD albanD requested review from malfet and removed request for albanD August 23, 2024 16:15
@jingxu10 jingxu10 changed the title update xeon launch script for GNR support update xeon launch script for Intel Xeon 6 support Oct 16, 2024
Comment on lines 146 to 148
if platform.system() == "Windows":
raise RuntimeError("Windows platform is not supported!!!")
elif platform.system() == "Linux":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO this should be if platform.system() == "Linux" and platform.machine() == "x86_64": ... else: raise RuntimeError(f"Unsupported platform {platform.system()}") to cover both Windows, MacOS as well as LinuxARM

Comment on lines +150 to +154
if lscpu_txt.strip() == "":
args = ["lscpu", "--all", "--extended"]
my_env = os.environ.copy()
my_env["LC_ALL"] = "C"
lscpu_info = subprocess.check_output(
args, env=my_env, universal_newlines=True
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this information can be queried from lscpu rather than using cpuinfo?

@jingxu10 jingxu10 force-pushed the jingxu10/xeon_launch branch from 3774dd2 to c807c1e Compare October 17, 2024 20:50
@github-actions
Copy link
Contributor

This PR needs a release notes: label

If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

@jingxu10 jingxu10 force-pushed the jingxu10/xeon_launch branch from c807c1e to ca03c05 Compare October 17, 2024 20:54
@jingxu10 jingxu10 changed the title update xeon launch script for Intel Xeon 6 support update xeon launch script for Intel(R) Xeon 6 support Oct 17, 2024
Copy link
Contributor
@malfet malfet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My biggest challenge with this change, is that it lscpu --extended output is not really defined anywhere.

I.e. on my machine it produces following output

 lscpu --extend --all
CPU CLUSTER CORE L1d:L1i:L2 ONLINE
  0       0    0 0:0:0         yes
  1       0    1 1:1:0         yes
  2       0    2 2:2:0         yes
  3       0    3 3:3:0         yes
  4       0    4 4:4:0         yes
  5       0    5 5:5:0         yes
  6       0    6 6:6:0         yes
  7       0    7 7:7:0         yes
  8       0    8 8:8:0         yes
  9       0    9 9:9:0         yes

whereas lscpu --parse=CPU,Core,Socket,Node always produces results in the same manner

lscpu --extended --json seems to be a good middle ground

@github-actions
Copy link
Contributor

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
Stale pull requests will automatically be closed after 30 days of inactivity.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

open source Stale triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

4 participants

0