xpu: support sycl with torch.utils.cpp_extension APIs #132945
Closed
13 commits
ed943d8 xpu: support sycl with torch.utils.cpp_extension.load (dvrogozh)
b924a4a xpu: support sycl with torch.utils.cpp_extension.load_inline (dvrogozh)
1545660 xpu: implement torch.utils.cpp_extension.SyclExtension (dvrogozh)
afc6eff xpu: define sycl flags in global scope (dvrogozh)
879466e xpu: add function helper for sycl host flags (dvrogozh)
c9d2d59 xpu: add _append_sycl_std_if_no_std_present helper (dvrogozh)
ff4cc65 xpu: differentiate archs per OS for Extension API (dvrogozh)
efbda39 xpu: add compression for sycl kernels in Extension API (dvrogozh)
5599561 xpu: use SYCL compiler instead of icpx in comments (dvrogozh)
3828cf2 xpu: make flags private (dvrogozh)
507936a xpu: support py_limited_api with SyclExtension (dvrogozh)
ea23429 xpu: use torch.xpu.get_arch_list to query list of archs (dvrogozh)
c43bf06 Update torch/utils/cpp_extension.py (malfet)
@@ -0,0 +1,63 @@

    #include <c10/xpu/XPUStream.h>
    #include <torch/extension.h>
    #include <sycl/sycl.hpp>

    void sigmoid_add_kernel(const float* x,
                            const float* y,
                            float* output,
                            const int size,
                            const sycl::nd_item<3> &item_ct1) {
      const int index = item_ct1.get_group(2) * item_ct1.get_local_range(2) +
                        item_ct1.get_local_id(2);
      if (index < size) {
        const float sigmoid_x = 1.0f / (1.0f + sycl::native::exp(-x[index]));
        const float sigmoid_y = 1.0f / (1.0f + sycl::native::exp(-y[index]));
        output[index] = sigmoid_x + sigmoid_y;
      }
    }

    class SigmoidAddKernel {
    public:
      void operator()(const sycl::nd_item<3> &item_ct1) const {
        sigmoid_add_kernel(x, y, output, size, item_ct1);
      }
      SigmoidAddKernel(const float* _x, const float* _y, float* _output, int _size):
        x(_x),
        y(_y),
        output(_output),
        size(_size)
      {}
    private:
      const float* x;
      const float* y;
      float* output;
      int size;
    };

    void sigmoid_add_xpu(const float* x, const float* y, float* output, int size) {
      SigmoidAddKernel krn(x, y, output, size);
      const int threads = 1024;
      const int blocks = (size + threads - 1) / threads;

      sycl::queue& queue = c10::xpu::getCurrentXPUStream().queue();
      queue.submit([&](sycl::handler &cgh) {
        cgh.parallel_for<SigmoidAddKernel>(
          sycl::nd_range<3>(
            sycl::range<3>(1, 1, blocks) * sycl::range<3>(1, 1, threads),
            sycl::range<3>(1, 1, threads)),
          krn);
      });
    }

    torch::Tensor sigmoid_add(torch::Tensor x, torch::Tensor y) {
      TORCH_CHECK(x.device().is_xpu(), "x must be a XPU tensor");
      TORCH_CHECK(y.device().is_xpu(), "y must be a XPU tensor");
      auto output = torch::zeros_like(x);
      sigmoid_add_xpu(
        x.data_ptr<float>(), y.data_ptr<float>(), output.data_ptr<float>(), output.numel());
      return output;
    }

    PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
      m.def("sigmoid_add", &sigmoid_add, "sigmoid(x) + sigmoid(y)");
    }

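The launch arithmetic in `sigmoid_add_xpu` (round the element count up to a whole number of 1024-wide work-groups, then guard each work-item with `index < size`) can be checked on the host without an XPU. The following is a minimal pure-Python sketch, not part of the PR; the names `launch_config` and `sigmoid_add_reference` are hypothetical, and `math.exp` stands in for `sycl::native::exp`, whose precision is implementation-defined, so device results may differ slightly.

```python
import math


def launch_config(size, threads=1024):
    """Mirror the host-side arithmetic: work-group count and global range."""
    blocks = (size + threads - 1) // threads  # ceiling division
    return blocks, blocks * threads


def sigmoid_add_reference(x, y, threads=1024):
    """Emulate the kernel sequentially, one loop iteration per work-item."""
    size = len(x)
    blocks, _ = launch_config(size, threads)
    output = [0.0] * size
    for group in range(blocks):
        for local_id in range(threads):
            # Same flattening as get_group(2) * get_local_range(2) + get_local_id(2)
            index = group * threads + local_id
            if index < size:  # same bounds guard as the kernel
                sigmoid_x = 1.0 / (1.0 + math.exp(-x[index]))
                sigmoid_y = 1.0 / (1.0 + math.exp(-y[index]))
                output[index] = sigmoid_x + sigmoid_y
    return output
```

For 1025 elements this yields two work-groups and a global range of 2048, so 1023 work-items fall outside the tensor and are skipped by the guard.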
`.sycl` is not documented by the SYCL spec or by the Intel SYCL compiler implementation. For now, I think it is not the proper time to deliver this usage to the community. We are following up on the feature with the compiler team. @EikanWang Please correct me. BTW, it is a good example to show the compiler team.

Actually, I used a documented feature to support files named with the `.sycl` extension: while this extension is not automatically recognized by the compiler, you can use the `-x <lang>` option to specify the type of the file being compiled. I used `-x c++ file.sycl`.

I agree that we should follow up with the DPC++ compiler team and ask for automatic support of the `.sycl` extension. I will file an issue for that tomorrow. But I believe we can proceed in the meantime with the approach I described above.

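To make the `-x c++` approach concrete, here is a minimal sketch of how a build helper might assemble the compiler command line for a `.sycl` source. It is an illustration only: the function name `sycl_compile_command`, the default `icpx` compiler name, and the object-file naming are assumptions, and the flags that `torch.utils.cpp_extension` actually passes are not shown in this thread.

```python
from pathlib import Path


def sycl_compile_command(source, compiler="icpx", extra_flags=()):
    """Build an argv list that compiles `source` as C++ with SYCL enabled.

    Hypothetical helper; not the command PyTorch generates.
    """
    src = Path(source)
    obj = src.with_suffix(".o")
    cmd = [compiler, "-fsycl", *extra_flags]
    if src.suffix == ".sycl":
        # The compiler does not recognize `.sycl`, so state the language
        # explicitly. `-x` affects the input files that follow it, so it
        # has to appear before the source path.
        cmd += ["-x", "c++"]
    cmd += ["-c", str(src), "-o", str(obj)]
    return cmd
```

Calling `sycl_compile_command("file.sycl")` produces a list that places `-x c++` ahead of `file.sycl`, while a `.cpp` input is passed through without the language override.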
Filed intel/llvm#15015 with a request for the `.sycl` extension.

We'd prefer to leave the SYCL compiler community the flexibility to provide the solution. If the SYCL compiler community decides to use a file extension to support this case, it is up to that community to decide what the file extension for SYCL source files should be.

Here is a summary of discussions with our compiler team and the compiler community. At the moment they oppose introducing the `.sycl` file extension into the compiler. They also encourage dealing with SYCL/C++ compilation differences at the build system level, using custom file extensions agreed on by the build system or other methods to logically separate sources. This discussion needs to happen here for PyTorch. A similar discussion is ongoing around SYCL support in CMake.

Overall, for the PyTorch cpp_extension feature we have 2 options to proceed:

Option 1: `.sycl` as a file extension specific to the PyTorch ecosystem, and further influence other communities to align on that.

Option 2:
- `sycl_sources = [ ... ]` variable to take SYCL sources in `torch.utils.cpp_extension.load` (this will be a new addition; CUDA does not have that)
- `torch.utils.cpp_extension.load` already has `cuda_sources` and this PR introduces `sycl_sources`
- `sources = [...]` and `sycl_sources = [ ... ]` variables on `class SyclExtension` (that will be a difference vs. how the CUDAExtension class is defined)

Currently the PR follows Option 1. Please let me know your opinions on the better option.
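The difference between the two options comes down to how the build layer learns which files need the SYCL compiler. A rough sketch, with hypothetical helper names (`split_by_extension`, `split_explicit`) that do not exist in PyTorch:

```python
from pathlib import Path

# Assumption for illustration: the set of suffixes treated as SYCL sources.
SYCL_SUFFIXES = {".sycl"}


def split_by_extension(sources):
    """Option 1: one `sources` list, routed by file suffix."""
    host, sycl = [], []
    for src in sources:
        target = sycl if Path(src).suffix in SYCL_SUFFIXES else host
        target.append(src)
    return host, sycl


def split_explicit(sources, sycl_sources=()):
    """Option 2: the caller names the SYCL sources in a separate list."""
    return list(sources), list(sycl_sources)
```

With Option 1 the caller passes a single list, as with CUDA's `.cu` files, and the suffix carries the meaning. With Option 2 any suffix works (for example a SYCL kernel kept in a `.cpp` file), at the cost of an extra argument that the CUDA path does not have.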