Allow compressed displacement to be used when profitable #117288

tannergooding · 2025-07-03T19:20:55Z

This should save 1 byte of encoding space compared to using a 32-bit displacement, when applicable.

dotnet-policy-service · 2025-07-03T19:21:48Z

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Copilot

Pull Request Overview

This PR adds logic in emitter::TakesEvexPrefix to detect when a memory displacement can be compressed to 8 bits via EVEX and only emits the prefix when profitable, saving encoding space.

Compute stack or address displacement and check if it fits in a signed byte
Call TryEvexCompressDisp8Byte for larger displacements and return true when compression succeeds
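
The compression check summarized above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the JIT's code: `Disp8Result`, `FitsInSignedByte`, and `TryCompressDisp8` are hypothetical names, and `scale` stands in for the N of EVEX disp8*N, which is assumed here to be supplied by the caller.

```cpp
#include <cstdint>

// Hypothetical result type for this sketch (not a JIT type).
struct Disp8Result
{
    bool   ok;    // true when the displacement can be encoded in one byte
    int8_t value; // the byte to encode; only meaningful when ok is true
};

// True when v is representable as a signed 8-bit value.
bool FitsInSignedByte(int64_t v)
{
    return (v >= INT8_MIN) && (v <= INT8_MAX);
}

// EVEX disp8*N: the encoded byte is multiplied by N (here `scale`) when the
// instruction is decoded, so disp must be an exact multiple of N and the
// quotient disp / N must fit in a signed byte.
Disp8Result TryCompressDisp8(int64_t disp, int64_t scale)
{
    if ((scale <= 0) || ((disp % scale) != 0))
    {
        return {false, 0};
    }

    int64_t scaled = disp / scale;
    if (!FitsInSignedByte(scaled))
    {
        return {false, 0};
    }
    return {true, static_cast<int8_t>(scaled)};
}
```

With a scale of 32, a displacement of 256 compresses to the byte 8, while 260 (not a multiple of 32) and 8192 (quotient 256) would still need a 32-bit displacement.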

Comments suppressed due to low confidence (1)

src/coreclr/jit/emitxarch.cpp:1926

No tests were added to verify the new compressed displacement logic (TryEvexCompressDisp8Byte); consider adding unit tests covering both compressible and non-compressible displacement cases to ensure correct behavior.

            dsp = TryEvexCompressDisp8Byte(id, dsp, &dspInByte);

src/coreclr/jit/emitxarch.cpp

tannergooding · 2025-07-05T20:02:16Z

CC. @dotnet/jit-contrib This is generally ready for review and the diffs look overall good.

Linux barely has any asm diffs due to having no callee-saved SIMD registers and different passing semantics. This causes its spilling behaviors and general usage patterns to differ, so there aren't as many opportunities. However, it has significant TP improvements because the relevant checks are no longer happening when they can't benefit.

Windows, however, has significant asm diffs. TP has improved in cases where there is little to no SIMD usage or no ability to use compressed displacement, but there is a TP hit (namely in the SIMD-heavy tests/benchmarks) when there is. This shows that the feature is pay for play.

~~There are notably a couple of size regressions I'm looking at, since we shouldn't have any places that are now allocating "more" bytes than before. But this is likely a minor tweak to the current code.~~ Figured this out. emitInsSizeSV uses UNATIVE_OFFSET for the offs calculation, while emitOutputSV uses ssize_t. We need to preserve the cast to int in emitInsSizeSV to ensure the relevant sign extension occurs and we check the right value. This should probably be rewritten to use ssize_t, like other places in emitxarch do, particularly since it would allow simplifying some of the other logic, but that should be a follow-up PR.
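
The sign-extension issue can be illustrated with a small sketch. It assumes a 32-bit unsigned offset type standing in for UNATIVE_OFFSET and a 64-bit signed type standing in for ssize_t; `UnsignedOffset`, `WidenUnsigned`, `WidenViaInt`, and `FitsInSignedByte` are hypothetical names, not JIT functions.

```cpp
#include <cstdint>

// Stand-in for an unsigned 32-bit offset type (an assumption of this sketch).
using UnsignedOffset = uint32_t;

// True when v is representable as a signed 8-bit value.
bool FitsInSignedByte(int64_t v)
{
    return (v >= INT8_MIN) && (v <= INT8_MAX);
}

// Widening the unsigned value directly zero-extends it, so a negative
// offset turns into a large positive number.
int64_t WidenUnsigned(UnsignedOffset offs)
{
    return static_cast<int64_t>(offs);
}

// Casting to a signed 32-bit int first makes the widening sign-extend, which
// recovers the negative offset (two's complement; guaranteed since C++20 and
// the behavior of mainstream compilers before that).
int64_t WidenViaInt(UnsignedOffset offs)
{
    return static_cast<int64_t>(static_cast<int32_t>(offs));
}
```

For an offset of -16 stored in the unsigned type, `WidenUnsigned` yields 4294967280, which fails the signed-byte check, while `WidenViaInt` yields -16, which passes. A size estimate based on the first value and an output path based on the second would disagree about whether an 8-bit displacement is usable.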

tannergooding · 2025-07-06T02:13:03Z

Linux

ASM Diffs:

Overall (-148 bytes)
FullOpts (-148 bytes)

Throughput Impact:

Overall (-0.24% to -0.06%)
MinOpts (-0.59% to -0.35%)
FullOpts (-0.15% to -0.06%)

Windows

ASM Diffs:

Overall (-1,547,961 bytes)
MinOpts (-1,246,735 bytes)
FullOpts (-301,226 bytes)

Throughput Impact:

Overall (-0.17% to +0.03%)
MinOpts (-0.63% to +0.13%)
FullOpts (-0.17% to +0.02%)

For most cases, we see reductions in the actual instruction groups because we are now choosing to emit the EVEX encoding to get the 1-byte savings from using compressed displacement where profitable.
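
As a rough illustration of "where profitable", the sketch below compares only the prefix and displacement bytes of the two encodings. It relies on the architectural sizes (a VEX prefix is 2 or 3 bytes, an EVEX prefix is 4 bytes, disp8 is 1 byte, disp32 is 4 bytes) and assumes a nonzero displacement; the function names are hypothetical, and the opcode, ModRM, SIB, and immediate bytes are ignored because they are assumed equal in both encodings.

```cpp
#include <cstdint>

constexpr int kEvexPrefixBytes = 4;
constexpr int kDisp8Bytes      = 1;
constexpr int kDisp32Bytes     = 4;

// True when v is representable as a signed 8-bit value.
bool FitsInSignedByte(int64_t v)
{
    return (v >= INT8_MIN) && (v <= INT8_MAX);
}

// Prefix + displacement bytes for a VEX encoding, which has no disp8*N
// scaling. vexPrefixBytes is 2 or 3.
int VexPrefixAndDispBytes(int64_t disp, int vexPrefixBytes)
{
    return vexPrefixBytes + (FitsInSignedByte(disp) ? kDisp8Bytes : kDisp32Bytes);
}

// Prefix + displacement bytes for an EVEX encoding, where disp8 is scaled
// by N (`scale`, assumed positive).
int EvexPrefixAndDispBytes(int64_t disp, int64_t scale)
{
    bool compressible = ((disp % scale) == 0) && FitsInSignedByte(disp / scale);
    return kEvexPrefixBytes + (compressible ? kDisp8Bytes : kDisp32Bytes);
}

// EVEX only wins when VEX would need disp32 but EVEX can use disp8*N.
bool EvexIsSmaller(int64_t disp, int64_t scale, int vexPrefixBytes)
{
    return EvexPrefixAndDispBytes(disp, scale) < VexPrefixAndDispBytes(disp, vexPrefixBytes);
}
```

Under these assumptions, a displacement of 256 with a scale of 32 costs 6 bytes with a 2-byte VEX prefix and 5 bytes with EVEX, so EVEX is 1 byte smaller. A displacement of 64 already fits in a plain disp8, so switching to EVEX would only add bytes.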

In a few cases, however, we see no change to the Total bytes of code but rather a reduction in the allocated bytes for code instead. This is because we are more correctly predicting how many bytes are required which allows each method to allocate less.

For a few methods, the reduction in Total bytes of code also results in further savings because many jumps can become SHORT where they previously were not.
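
The knock-on effect on jumps can be sketched as below. The sizes are the standard x86 encodings for an unconditional jump (2 bytes for the rel8 form, 5 bytes for the rel32 form); the function names are hypothetical, and the distance is assumed to be measured from the end of the jump instruction.

```cpp
#include <cstdint>

constexpr int kShortJmpBytes = 2; // opcode + rel8
constexpr int kNearJmpBytes  = 5; // opcode + rel32

// The short form holds a signed 8-bit distance relative to the end of the
// jump instruction.
bool FitsShortJump(int64_t relFromEndOfJump)
{
    return (relFromEndOfJump >= -128) && (relFromEndOfJump <= 127);
}

// Size of an unconditional jump for a given distance.
int UnconditionalJumpBytes(int64_t relFromEndOfJump)
{
    return FitsShortJump(relFromEndOfJump) ? kShortJmpBytes : kNearJmpBytes;
}
```

For example, a forward jump over 130 bytes of code needs the 5-byte form. If the instructions being jumped over shrink by 4 bytes in total, the distance drops to 126 and the 2-byte form becomes usable, which saves a further 3 bytes.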

There are a couple of T0 methods where the total bytes of code has regressed (gone up). This is due to the quirks mentioned above in emitInsSizeSV. A separate PR should be done to clean that up and make it simpler and more in line with how emitOutputSV is actually consuming the prediction.

Windows x86 sees less improvement because it doesn't have FEATURE_FIXED_OUT_ARGS, so it needs to dynamically adjust the offset based on emitCurStackLvl. There are some quirks in how the SV paths handle this, mentioned above, and the general complexity here blocks the ability to decide up front whether compressed displacement can be used, which makes the feature not very pay for play. As such, we pessimize by not using compressed displacement in such scenarios, so there are a few more methods with regressions (but overall it is still by far an improvement).

EgorBo

LGTM. Worth running some outerloops?

tannergooding · 2025-07-07T14:37:54Z

/azp run runtime-coreclr jitstress-isas-x86, Fuzzlyn, Antigen, runtime-coreclr jitstress, runtime-coreclr jitstressregs

azure-pipelines · 2025-07-07T14:38:19Z

Azure Pipelines successfully started running 5 pipeline(s).

tannergooding · 2025-07-07T19:47:26Z

/azp run runtime-coreclr jitstress-isas-x86, Fuzzlyn, Antigen, runtime-coreclr jitstress, runtime-coreclr jitstressregs

azure-pipelines · 2025-07-07T19:47:54Z

Azure Pipelines successfully started running 5 pipeline(s).

Copilot AI review requested due to automatic review settings July 3, 2025 19:20

github-actions bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jul 3, 2025

dotnet-policy-service bot assigned tannergooding Jul 3, 2025

Copilot AI reviewed Jul 3, 2025

tannergooding force-pushed the preferDisp8 branch 2 times, most recently from 5fcafd8 to 5344cbd Compare July 4, 2025 05:31

tannergooding force-pushed the preferDisp8 branch 2 times, most recently from 4f30736 to a96769c Compare July 4, 2025 20:14

Allow compressed displacement to be used when profitable

26c77e6

tannergooding force-pushed the preferDisp8 branch from a96769c to 15d599c Compare July 4, 2025 23:56

build-analysis bot mentioned this pull request Jul 5, 2025

LibraryImportGenerator.Unit.Tests crashing on linux-x64 mono interpreter #100800

Open

tannergooding mentioned this pull request Jul 5, 2025

Improve the emitter support for APX instructions #117326

Open

tannergooding force-pushed the preferDisp8 branch from 15d599c to c017051 Compare July 5, 2025 01:26

This was referenced Jul 5, 2025

browser-wasm windows Debug AllSubsets_CoreCLR builds failing in emcc seemingly unrelated to any code issues #116647

Closed

Occasional failure in "browser-wasm windows Release LibraryTests: Build Product" #116671

Closed

tannergooding force-pushed the preferDisp8 branch 5 times, most recently from 25acfbf to e75ba7f Compare July 5, 2025 17:37

Try to reduce the cost for compressed displacement support

dfc08c9

tannergooding force-pushed the preferDisp8 branch from e75ba7f to dfc08c9 Compare July 5, 2025 18:34

tannergooding requested a review from EgorBo July 5, 2025 19:49

tannergooding force-pushed the preferDisp8 branch from 182f73e to 9d66a2b Compare July 5, 2025 21:26

Ensure that we check for compressed displacement using the signed value

c8df8ef

tannergooding force-pushed the preferDisp8 branch from 9d66a2b to c8df8ef Compare July 5, 2025 21:43

Don't try and do compressed displacement under !FEATURE_FIXED_OUT_ARGS scenarios

53ec34f

Don't assert compressed displacement was used when FEATURE_FIXED_OUT_ARGS isn't available

bccd0a4

EgorBo approved these changes Jul 7, 2025

Merge branch 'main' into preferDisp8

30f1e2a

Don't assert we're definitely using compressed displacement for emitOutputSV

02defed

tannergooding merged commit c2dbdc9 into dotnet:main Jul 7, 2025
155 of 174 checks passed

tannergooding deleted the preferDisp8 branch July 7, 2025 23:46

tannergooding mentioned this pull request Jul 8, 2025

Refactor emitInsSizeSVCalcDisp to more closely match the emitOutputSV checks #117430

Merged
