[ONNX] Support float4 #151069
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/151069
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 016340a with merge base 3a90fd4. This comment was automatically generated by Dr. CI and updates every 15 minutes.
Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.
Comments suppressed due to low confidence (2)
torch/onnx/_internal/exporter/_type_casting.py:8
- The FIXME indicates that the expected shape for unpacked float4x2 tensors is not fully understood. Please add detailed documentation or tests to clarify the intended behavior once determined.
# FIXME: Figure out what the shape really means
test/onnx/exporter/test_core.py:79
- [nitpick] Consider adding tests using multi-element tensors to better validate the unpacking and conversion behavior of the float4 implementation.
tensor = _core.TorchTensor(torch.tensor([1], dtype=torch.uint8).view(torch.float4_e2m1fn_x2))
Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.
Comments suppressed due to low confidence (2)
torch/onnx/_internal/exporter/_type_casting.py:10
- The unpack_float4x2_as_uint8 function doubles the number of elements from the input tensor, which may conflict with the expected byte size when the result is later viewed as a FLOAT4. Please verify that this doubling combined with the custom numpy dtype conversion produces the intended storage size.
result_size = tensor.numel() * 2
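For context, the doubling comes from each uint8 byte packing two 4-bit FLOAT4 values, so unpacking one byte must yield two elements. A minimal pure-Python sketch of that nibble split (the helper name `unpack_nibbles` and the low-nibble-first ordering are assumptions for illustration, not the PR's actual `unpack_float4x2_as_uint8`):

```python
def unpack_nibbles(packed: bytes) -> list[int]:
    """Split each byte into its low and high 4-bit halves.

    One packed byte yields two values, so the output has
    2 * len(packed) elements -- the doubling discussed above.
    """
    out = []
    for byte in packed:
        out.append(byte & 0x0F)         # low nibble (assumed first)
        out.append((byte >> 4) & 0x0F)  # high nibble
    return out

# A single packed byte 0x21 holds the 4-bit values 1 (low) and 2 (high).
print(unpack_nibbles(b"\x21"))  # [1, 2]
```

Whether the final view as a FLOAT4 numpy dtype re-collapses these two values into one storage byte is exactly the question raised above.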
test/onnx/exporter/test_core.py:82
- The test expects a one-byte representation (b"\x01") for a tensor converted from FLOAT4, yet the unpacking function generates an array with two elements per tensor element before the final view conversion. Confirm that the custom numpy dtype for FLOAT4 effectively consolidates the two unpacked values into a single byte.
self.assertEqual(tensor.tobytes(), b"\x01")
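The one-byte expectation is consistent if serialization re-packs each pair of unpacked 4-bit values into a single byte. A hedged sketch of that inverse operation (the helper `pack_nibbles` is hypothetical, not code from this PR):

```python
def pack_nibbles(values: list[int]) -> bytes:
    """Pack pairs of 4-bit values back into single bytes (low nibble first)."""
    assert len(values) % 2 == 0, "need an even number of nibbles"
    out = bytearray()
    for lo, hi in zip(values[0::2], values[1::2]):
        out.append((lo & 0x0F) | ((hi & 0x0F) << 4))
    return bytes(out)

# Unpacking byte 0x01 gives nibbles [1, 0]; packing them restores b"\x01",
# matching the one-byte serialization the test asserts.
print(pack_nibbles([1, 0]))  # b'\x01'
```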
Force-pushed from f218a47 to 6076f8e.
Does this mean that this PR allows the exporter to generate a model with IR version 10 and float4? Is that in the ONNX spec? It looks like this is a pre-release, but it will be available in the exporter once the PR is merged.
Admittedly, the model will not conform to the spec if its IR version is 10 and it uses float4; we need IR version 11 for that. But this change still allows users to get a model they can work with. In a follow-up I can add a warning mentioning the caveats, if you agree with the idea. This PR also has the side effect of addressing microsoft/onnxscript#2187; I will isolate that in a separate PR.
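The proposed follow-up warning could be sketched roughly as below (the function name, the `FLOAT4_MIN_IR_VERSION = 11` constant, and the wording are assumptions, not code from this PR):

```python
import warnings

# Assumption: float4 types become spec-conformant at ONNX IR version 11.
FLOAT4_MIN_IR_VERSION = 11

def warn_if_float4_unsupported(ir_version: int, uses_float4: bool) -> None:
    """Emit a caveat when a model using float4 targets an IR version below 11."""
    if uses_float4 and ir_version < FLOAT4_MIN_IR_VERSION:
        warnings.warn(
            f"Model uses float4 but targets IR version {ir_version}; "
            f"it will not conform to the ONNX spec until IR version "
            f"{FLOAT4_MIN_IR_VERSION}.",
            stacklevel=2,
        )
```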
Split the changes from #151069 to address microsoft/onnxscript#2187, where the output NumPy arrays do not have the expected ml_dtypes types. Pull Request resolved: #151259. Approved by: https://github.com/titaiwangms
@pytorchbot rebase
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here.
Rebase failed due to a command error. Raised by https://github.com/pytorch/pytorch/actions/runs/15051848411
torch.float4_e2m1fn_x2 was added to PyTorch in #148791 (comment) (added last dim with size 2). Fixes #150202.