[xpu] set aot device flags in cpp_extension by jingxu10 · Pull Request #149459 · pytorch/pytorch · GitHub
Closed
update
jingxu10 authored and pytorchmergebot committed Apr 8, 2025
commit 406e539f5c80b90a0ad42d6aa9e4ea2ef21d2648
2 changes: 1 addition & 1 deletion torch/utils/cpp_extension.py
@@ -295,7 +295,7 @@ def _get_sycl_arch_list():
return []
Contributor

What will the behavior of the built extension be if we return an empty arch flags list ([])?

  1. Will it actually be buildable?
  2. For which arch(s) will it be built?
  3. Or will it not be pre-built for any AOT target, so that running the extension results in runtime compilation?

Secondly, does '-fsycl-targets=spir64_gen,spir64' still need to be passed here?

I think the empty arch list is worth a comment in the source code here.

Collaborator Author
@jingxu10 jingxu10 Mar 21, 2025

Compilation will still work, but without AOT compilation with ocloc. As you mentioned in 3, it won't be pre-built for any AOT target, and running the extension will result in runtime compilation.
-fsycl-targets=spir64_gen,spir64 cannot be passed here; otherwise ocloc will crash, complaining that no targets are set.

else:
return ['-fsycl-targets=spir64_gen,spir64',
Contributor

Moving these flags here does not seem to work correctly. With this change, the following 2 warnings appear. I suggest dropping these flags from this PR and handling them separately, if needed.

# python -m pytest test/test_cpp_extensions_jit.py -k xpu
...
test/test_cpp_extensions_jit.py [1/4] c++ -MMD -MF main.o.d -DTORCH_EXTENSION_NAME=inline_jit_extension_xpu -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1018\" -isystem /home/dvrogozh/git/pytorch/pytorch/torch/include -isystem /home/dvrogozh/git/pytorch/pytorch/torch/include/torch/csrc/api/include -isystem /usr/include/python3.12 -D_GLIBCXX_USE_CXX11_ABI=1 -fPIC -std=c++17 -c /home/dvrogozh/.cache/torch_extensions/py312_cpu/inline_jit_extension_xpu/main.cpp -o main.o
[2/4] icpx -DTORCH_EXTENSION_NAME=inline_jit_extension_xpu -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1018\" -isystem /home/dvrogozh/git/pytorch/pytorch/torch/include -isystem /home/dvrogozh/git/pytorch/pytorch/torch/include/torch/csrc/api/include -isystem /usr/include/python3.12 -D_GLIBCXX_USE_CXX11_ABI=1 -fPIC -std=c++17 -fsycl -sycl-std=2020 -fsycl-host-compiler=c++ '-fsycl-host-compiler-options=-DTORCH_EXTENSION_NAME=inline_jit_extension_xpu -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\\"_gcc\\" -DPYBIND11_STDLIB=\\"_libstdcpp\\" -DPYBIND11_BUILD_ABI=\\"_cxxabi1018\\" -isystem /home/dvrogozh/git/pytorch/pytorch/torch/include -isystem /home/dvrogozh/git/pytorch/pytorch/torch/include/torch/csrc/api/include -isystem /usr/include/python3.12 -D_GLIBCXX_USE_CXX11_ABI=1 -fPIC -std=c++17' -c -x c++ /home/dvrogozh/.cache/torch_extensions/py312_cpu/inline_jit_extension_xpu/sycl.sycl -o sycl.sycl.o
[3/4] icpx main.o sycl.sycl.o -o sycl_dlink.o -fsycl -fsycl-link --offload-compress -fsycl-targets=spir64_gen,spir64 -flink-huge-device-code -Xs "-device pvc"
icpx: warning: linked binaries do not contain expected 'spir64_gen-unknown-unknown' target; found targets: 'spir64-unknown-unknown' [-Wsycl-target]
icpx: warning: argument unused during compilation: '-flink-huge-device-code' [-Wunused-command-line-argument]

Collaborator Author

Do these 2 warnings appear with an empty AOT arch list or with a non-empty one?

Collaborator Author
@jingxu10 jingxu10 Mar 21, 2025

These 2 warnings don't make sense to me. If arch_list is not empty, the flags passed to compilation should be exactly the same as before these changes. If arch_list is empty, neither the spir64 targets nor -flink-huge-device-code will be added to the flags.
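
The two branches described above can be sketched as follows. This is a minimal illustration, not the code in `torch/utils/cpp_extension.py`: the helper name `sycl_aot_flags` is hypothetical, and the flag strings are copied from the diff and from the `[3/4]` link command in the log.

```python
def sycl_aot_flags(arch_list):
    """Hypothetical helper: build AOT-related SYCL flags from an arch list."""
    if not arch_list:
        # Empty arch list: no AOT targets, so no spir64_gen target and no
        # ocloc options are passed; kernels are compiled at runtime instead.
        return []
    return [
        '-fsycl-targets=spir64_gen,spir64',
        '-flink-huge-device-code',
        # Double quotes inside a single-quoted f-string need no backslashes.
        f'-Xs "-device {",".join(arch_list)}"',
    ]


print(sycl_aot_flags([]))
print(sycl_aot_flags(['pvc']))
```

With `['pvc']` the result matches the AOT flags seen in the logged link step; with an empty list nothing is added.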

Contributor
@dvrogozh dvrogozh Mar 21, 2025

These appear on a PyTorch build initially configured with TORCH_XPU_ARCH_LIST=pvc, i.e. with a non-empty arch list. You can reproduce this yourself with the pytest command line above: python -m pytest test/test_cpp_extensions_jit.py -k xpu. Please find the root cause and remove the warnings before this PR is merged.
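
One possible starting point for the root cause is to compare the `-fsycl-targets` value across the logged build steps. The sketch below is a hypothetical diagnostic, not part of PyTorch: the helper name `sycl_targets` is made up, the command strings are abbreviated from steps `[2/4]` and `[3/4]` in the log, and it assumes that `icpx -fsycl` falls back to `spir64` when the flag is absent, which is consistent with the "found targets" part of the warning.

```python
import shlex


def sycl_targets(command):
    """Return the targets named by -fsycl-targets=... in a command line.

    Assumption: without the flag, the compiler targets only 'spir64'.
    """
    for arg in shlex.split(command):
        if arg.startswith('-fsycl-targets='):
            return arg.split('=', 1)[1].split(',')
    return ['spir64']


# Abbreviated from the log: the compile step carries no -fsycl-targets flag.
compile_cmd = 'icpx -fsycl -sycl-std=2020 -c -x c++ sycl.sycl -o sycl.sycl.o'
# Abbreviated from the log: the device-link step does carry it.
dlink_cmd = ('icpx main.o sycl.sycl.o -o sycl_dlink.o -fsycl -fsycl-link '
             '--offload-compress -fsycl-targets=spir64_gen,spir64 '
             '-flink-huge-device-code -Xs "-device pvc"')

# Targets the link step expects but the compile step did not produce.
missing = [t for t in sycl_targets(dlink_cmd)
           if t not in sycl_targets(compile_cmd)]
print(missing)
```

Under these assumptions the only missing target is `spir64_gen`, the same one named in the first warning, which suggests checking whether the AOT flags reach the compile step as well as the link step.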

- f'-Xs "-device {\',\'.join(arch_list)}"']
+ f'-Xs "-device {",".join(arch_list)}"']

_SYCL_DLINK_FLAGS = [
*_COMMON_SYCL_FLAGS,