[ONNX] Migrate onnx decomps into PyTorch #146224
Conversation
🔗 Helpful Links: See artifacts and rendered test results at hud.pytorch.org/pr/146224
Note: Links to docs will display an error until the docs builds have completed.
❌ 1 New Failure: As of commit f625b4d with merge base dd2a943, the following job has failed.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
torch/onnx/_internal/exporter/_torchlib/ops/quantized_decomposed.py (outdated review thread, resolved)
Force-pushed from fb355b3 to d3904c9.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job was waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
I'm not sure why skipping the PR sanity check is acceptable here?
We only accept skipping it for codemod and other codegened changes with a provided repro command.
FYI @seemethere @malfet
```diff
@@ -1370,7 +1370,7 @@ def _export(
     # We do not want to fail because we should still allow users to create
     # custom symbolic functions for opset>17
     warnings.warn(
-        f"Exporting to ONNX opset version {opset_version} is not supported. "
+        f"Exporting to ONNX opset version {opset_version} is not officially supported. "
```
I thought it was resolved.
We still don't have symbolic implementations for the aten ops that use Reduce* operators, so setting opset>17 will cause models that use those ops to fail.
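To make the warning change above concrete, here is a minimal, self-contained sketch (not the actual `torch.onnx` code; the helper name and the opset constant are assumptions for illustration) of the guard: an opset above the supported maximum warns instead of erroring, so users can still register custom symbolic functions for opset>17.

```python
import warnings

_MAX_SUPPORTED_OPSET = 17  # assumed value for illustration

def warn_if_unsupported(opset_version: int) -> None:
    # Warn rather than fail, mirroring the behavior described in the diff:
    # users may still provide custom symbolic functions for opset > 17.
    if opset_version > _MAX_SUPPORTED_OPSET:
        warnings.warn(
            f"Exporting to ONNX opset version {opset_version} "
            "is not officially supported. "
        )

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warn_if_unsupported(18)  # emits the warning
    warn_if_unsupported(17)  # silent
assert len(caught) == 1
```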
```diff
@@ -241,10 +241,10 @@ def get_matching_overload(
     for overload in overloads:
         assigned_types: dict[str, ir.TypeProtocol] = {}
         fail_reason = ""
-        if not hasattr(overload, "signature"):
+        if overload.signature is None:
```
What if overload is an operator? (operator.mul, ...)
`overload` is of type `OnnxDecompMeta`. When it stores an operator, `overload.signature` is `None`.
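A small sketch of the shape described above (field names are illustrative, not the real `OnnxDecompMeta` definition) shows why the diff replaces the `hasattr` check: the attribute always exists on the dataclass, so `hasattr` filters nothing, while the `is None` check correctly identifies entries that wrap a plain Python operator.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable, Optional

# Hypothetical stand-in for OnnxDecompMeta: `signature` is always present
# as an attribute, and is None when the entry wraps a bare operator.
@dataclass
class OnnxDecompMeta:
    onnx_function: Callable[..., Any]
    signature: Optional[Any] = None

mul_meta = OnnxDecompMeta(onnx_function=operator.mul)
assert hasattr(mul_meta, "signature")  # the old check can never be False
assert mul_meta.signature is None      # the new check is the meaningful one
```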
@justinchuby a few thoughts:
Thanks, @malfet, @albanD. I updated the PR description to address your concerns above and from Slack. In short, the migrated logic has been part of what a user runs when they call
Thanks for sharing the details!
Thanks! To clarify:
- onnxscript core is still a dependency for building the ONNX graph, but since the decomp logic is no longer in onnxscript with this proposal, we can freely evolve it with PyTorch without BC constraints.
- If by "old logic" you mean the decomp component: the reference to the decomp component is removed completely. That's the biggest coupling we want to get rid of. Are there other parts of the integration you were referring to?
Sounds good. The changes are mostly simplifications of current code and have been reviewed by the torch.onnx team. For better visibility and reviewability I will do what you suggested: add the scaffolding first and add the decomps in a stack.
Thanks for the suggestions and feedback! I will create a stack that does the above and cc you and @malfet when ready.
Closed in favor of #147469
Migrate the ATen op decomp library for ONNX ("torchlib") from ONNX Script into PyTorch, with the necessary changes in the ONNX registry.
The migration

"torchlib" is what we call the decomp library from aten ops to onnx in the `onnxscript` project. (The name doesn't matter and can be changed.) It is the improved version of the "symbolic functions" in `torch/onnx`, implemented using `onnxscript`, a graph builder for onnx. Since PyTorch 2.1, it has been a dependency of `torch.onnx` via `onnxscript` and has been powering the `torch.onnx` exporter.

torchlib was hosted in `onnxscript` for rapid evolvement. However, it is time to migrate the logic into PyTorch because:

- It is used by `torch.onnx` and is the equivalent of the onnx "symbolic functions" for FX graphs.
- The migration decouples `torch.onnx` from logic in `onnxscript`, which is a good thing.

Test and infra impact
The tests leverage OpInfo. They run in an onnx shard only. On a 2-core machine, tests typically complete within 15 minutes.
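For readers unfamiliar with OpInfo, here is a generic, self-contained sketch of the pattern the tests rely on (the names `OpInfoLike`, `samples`, and `op_db` here are illustrative stand-ins, not PyTorch's actual OpInfo API): each database entry bundles an op with sample inputs, and one parametrized test iterates the whole database, so a new decomp only needs a registry entry rather than a hand-written test.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

# Illustrative stand-in for an OpInfo entry: an op plus its sample inputs.
@dataclass
class OpInfoLike:
    name: str
    op: Callable[..., Any]
    samples: list[tuple] = field(default_factory=list)

# A tiny "op database"; the real one covers the ATen op surface.
op_db = [
    OpInfoLike("add", lambda a, b: a + b, [(1, 2), (3, 4)]),
    OpInfoLike("mul", lambda a, b: a * b, [(2, 3)]),
]

# One loop exercises every op against all of its sample inputs.
results = {info.name: [info.op(*s) for s in info.samples] for info in op_db}
assert results == {"add": [3, 7], "mul": [6]}
```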
No new dependencies are introduced. Packaging and test activities should remain the same.
State of the migrated code

The migrated code is lifted from https://github.com/microsoft/onnxscript/tree/main/onnxscript/function_libs/torch_lib. It is reviewed by the same set of reviewers that own the `torch.onnx` component.

Fixes #139301
Next steps
Follow-up PRs will decouple the implementation from the ONNX Script type system to improve type checking, and bump the ONNX opset version used.