[AOTI] Fix #140546 and support AOTI package load for Intel GPU. #140664
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/140664.
Note: Links to docs will display an error until the docs builds have been completed.
❗ 1 Active SEV: There is 1 currently active SEV. If your PR is affected, please view it below.
❌ 1 New Failure: As of commit 2abadda with merge base f84e533, the following job has failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
```cpp
    const std::string& model_so_path,
    size_t num_models,
    const std::string& device_str,
    const std::string& cubin_dir) {
```
Suggested change:
```diff
-    const std::string& cubin_dir) {
+    const std::string& kernel_bin_dir) {
```
resolved
```diff
-  std::vector<at::Tensor> run(
+  virtual std::vector<at::Tensor> run(const std::vector<at::Tensor>& inputs);
```
This changes the API definition. According to the issue description, a null stream means the current stream for CUDA, and the same semantics should apply to Intel GPU. Since this PR fixes the virtual function routing, the runner instance can dispatch the `run` function to the right device container runner.
This relates to our definition of the `run` API. At first I thought it would be more generic to define it as `std::vector<at::Tensor> run(const std::vector<at::Tensor>& inputs, AOTInductorStreamHandle stream_handle = nullptr);`. But this means that in model_container_runner_xpu.h, model_container_runner_cuda.h, and model_container_runner_cpu.h we would need to define this override as well. Since `AOTIModelContainerRunnerCpu/Xpu/Cuda` are used in pybind, we get a compilation error:

```
torch/csrc/inductor/aoti_runner/pybind.cpp:21:11: required from here
/usr/include/c++/11/type_traits:1422:38: error: invalid use of incomplete type ‘struct AOTInductorStreamOpaque’
 1422 |     : public integral_constant<bool, __is_base_of(_Base, _Derived)>
      |                                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /home/xinanlin/xinanlin/pytorch/torch/csrc/inductor/aoti_runner/model_container_runner.h:5,
                 from torch/csrc/inductor/aoti_runner/model_container_runner_cpu.h:4,
                 from torch/csrc/inductor/aoti_runner/pybind.cpp:1:
torch/csrc/inductor/aoti_runtime/interface.h:16:8: note: forward declaration of ‘struct AOTInductorStreamOpaque’
```

The current design of this PR defines the interface of the cpu/xpu/cuda container runner as `std::vector<at::Tensor> run(const std::vector<at::Tensor>& inputs);`, which is the same as before this PR, so I think there is no issue with landing it. And if we prefer the `run` interface as `std::vector<at::Tensor> run(const std::vector<at::Tensor>& inputs, AOTInductorStreamHandle stream_handle = nullptr);`, we can create an adapter between the pybind interface and the AOTI runner container to resolve the above compilation issue in a new PR.
Agree with @EikanWang, this is an API-breaking change. I think if you declare `stream_handle` as `void*`, it should solve your compilation issue.
Hi @desertfire, I've updated the PR, could you please review it? BTW, could you please also review this series of PRs? They are all for AOTI XPU and are ready for review.
…U. (#140664)" This reverts commit 91d3054. Reverted #140664 on behalf of https://github.com/clee2000 due to breaks forward compatibility? D66937097 ([comment](#140269 (comment)))
… GPU." Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #140686 * __->__ #140664 * #140269 * #140268 * #135320 * #135318 * #139026 Fix #140546 cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy yf225 chenyang78 kadeng muchulee8 ColinPeppler amjames desertfire chauhang aakhundov [ghstack-poisoned]
…e load for Intel GPU." Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #140686 * __->__ #140664 * #140269 * #140268 * #135320 * #135318 * #139026 Fix #140546 cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy yf225 chenyang78 kadeng muchulee8 ColinPeppler amjames desertfire chauhang aakhundov [ghstack-poisoned]
Relanding, see #140269 (comment)
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: 2 mandatory check(s) failed. The first few are:
Dig deeper by viewing the failures on hud
The failed job xpu / linux-jammy-xpu-2025.0-py3.9 / test (default, 4, 4, linux.idc.xpu) is a known issue (#141861) and has been fixed on the main branch by #142348. Please ignore it. @desertfire, can we land this PR after CI finishes?
SGTM |
@pytorchbot merge -i
Merge started. Your change will be merged while ignoring the following 1 check: xpu / linux-jammy-xpu-2025.0-py3.9 / test (default, 4, 4, linux.idc.xpu). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
pytorch#140664) Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * pytorch#140686 * __->__ pytorch#140664 * pytorch#140269 * pytorch#140268 * pytorch#135320 * pytorch#135318 * pytorch#139026 Fix pytorch#140546 Pull Request resolved: pytorch#140664 Approved by: https://github.com/desertfire, https://github.com/EikanWang ghstack dependencies: pytorch#140268, pytorch#140269 Co-authored-by: Bin Bao <binbao@meta.com>
ghstack-source-id: c702bf7 Pull Request resolved: pytorch/pytorch#140664
Stack from ghstack (oldest at bottom):
Fix #140546
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang @aakhundov