8000 Add new MIR constant propagation based on dataflow analysis by jachris · Pull Request #101168 · rust-lang/rust · GitHub
[go: up one dir, main page]

Skip to content

Add new MIR constant propagation based on dataflow analysis #101168

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 76 commits into from
Nov 15, 2022

Conversation

jachris
Copy link
Contributor
@jachris jachris commented Aug 29, 2022

The current constant propagation in rustc_mir_transform/src/const_prop.rs fails to handle many cases that would be expected from a constant propagation optimization. For example:

let x = if true { 0 } else { 0 };

This pull request adds a new constant propagation MIR optimization pass based on the existing dataflow analysis framework. Since most of the analysis is not unique to constant propagation, a generic framework has been extracted. It works on top of the existing framework and could be reused for other optimzations.

Closes #80038. Closes #81605.

Todo

Essential

  • Writes to inactive enum variants. Resolved by rejecting the registration of places with downcast projections for now. Could be improved by flooding other variants if mutable access to a variant is observed.
  • Handle StatementKind::CopyNonOverlapping. Resolved by flooding the destination.
  • Handle UnsafeCell / !Freeze correctly.
  • Overflow propagation of CheckedBinaryOp: Decided to not propagate if overflow flag is true (false will still be propagated)
  • More documentation in general.
  • Arguments for correctness, documentation of necessary assumptions.
  • Better performance, or alternatively, require -Zmir-opt-level=3 for now.

Extra

  • Add explicit unreachability, i.e. upgrading the lattice from $\mathbb{P} \to \mathbb{V}$ to $\set{\bot} \cup (\mathbb{P} \to \mathbb{V})$.
  • Use storage statements to improve precision.
  • Consider opening issue for duplicate diagnostics: Add new MIR constant propagation based on dataflow analysis #101168 (comment)
  • Flood moved-from places with $\bot$ (requires some changes for places with tracked projections).
  • Add downcast projections back in.
  • Algebraic simplifications (possibly with a shared API; done by old const prop).
  • Propagation through slices / arrays.
  • Find other optimizations that are done by old const_prop.rs, but not by this one.

@rustbot rustbot added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Aug 29, 2022
@rust-highfive
Copy link
Contributor

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @compiler-errors (or someone else) soon.

Please see the contribution instructions for more information.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 29, 2022
@jachris
Copy link
Contributor Author
jachris commented Aug 29, 2022

@rustbot label +A-mir-opt

@rustbot rustbot added the A-mir-opt Area: MIR optimizations label Aug 29, 2022
@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@jachris jachris force-pushed the dataflow-const-prop branch from 648fc1a to 77d6fad Compare August 29, 2022 22:06
@rust-log-analyzer

This comment has been minimized.

@compiler-errors
Copy link
Member

r? @oli-obk since you know about const prop

Copy link
Contributor
@JakobDegen JakobDegen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for doing this! This is impressive work, and looks to be exactly the direction that I think we should be going. For now, I've only focused on correctness in the review. Once we're confident with that, we can work on everything else.

Besides the one bug I pointed out below, there is a part of the logic here that I am very suspicious of: The place expressions that are tracked might in general be overlapping each other. This can happen either in two cases:

  1. Enums. Assuming that the layouts work out, the below MIR is DB and would be miscompiled by this pass, if I am reading the code correctly:

    (_2 as Foo).0 = 5;
    (_2 as Bar).0 = 5;
    _0 = (_2 as Foo).0;
    return

    It is not possible to generate this kind of MIR from Rust (I am working on something to be able to write it in tests at least), but it is nonetheless a part of the semantics of MIR.

  2. Pointer aliasing. This is the complicated one. The place values that we care about might in general be aliased by pointers, and we also const prop through pointers in the logic here. There is a lot of subtlety to this. For example, this code is "miscompiled:"

    #[inline(never)]
    fn foo() -> usize {
       let mut x = 0;
        let p = core::ptr::addr_of_mut!(x);
        x = 1;
        unsafe { *p = 2 };
        x
    }
    
    fn main() {
        dbg!(foo());
    }

    Miri does indeed report UB for this. Putting aside for a minute the issue of whether or not we're ok miscompiling this, there needs to be a comment in the code somewhere that states what assumptions we are making about the aliasing model that are broken by the above code.

Speaking somewhat more generally, the code that is currently responsible for handling writes needs to explain why its behavior is correct not just for the "obvious" target but for all other possibly overlapping places.

@jachris
Copy link
Contributor Author
jachris commented Aug 30, 2022

Besides the one bug I pointed out below, there is a part of the logic here that I am very suspicious of: The place expressions that are tracked might in general be overlapping each other. This can happen either in two cases:

  1. Enums. Assuming that the layouts work out, the below MIR is DB and would be miscompiled by this pass, if I am reading the code correctly: [...]

This is unexpected, but I guess it can make sense to allow such writes to inactive variants. If we don't want to deal with layouts, we could flood all other variants with $\top$ when writing through a downcast projection. And we perhaps need to do the same if a mutable reference/pointer to a field of a variant escapes.

  1. Pointer aliasing. This is the complicated one. The place values that we care about might in general be aliased by pointers, and we also const prop through pointers in the logic here. There is a lot of subtlety to this. For example, this code is "miscompiled:" [...]

Miri does indeed report UB for this. Putting aside for a minute the issue of whether or not we're ok miscompiling this, there needs to be a comment in the code somewhere that states what assumptions we are making about the aliasing model that are broken by the above code.

Speaking somewhat more generally, the code that is currently responsible for handling writes needs to explain why its behavior is correct not just for the "obvious" target but for all other possibly overlapping places.

The analysis relies on some properties of Stacked Borrows. I agree that the assumptions should be clearly documented. I'm also not quite convinced yet that the way that taking a mutable reference/pointer is handled is sound (it just floods the place with $\top$). There should probably be a comment for every step in the analysis (including writes, but also for taking references and so on).