-
-
Notifications
You must be signed in to change notification settings - Fork 2.9k
[mypyc] Simplify generated code for native attribute get #11978
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -88,8 +88,22 @@ def generate_native_function(fn: FuncIR, | |||
next_block = blocks[i + 1] | |||
body.emit_label(block) | |||
visitor.next_block = next_block | |||
for op in block.ops: | |||
ops = block.ops |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could probably use a comment of what we are keeping track of here now that the loop is a lot more complex
mypyc/codegen/emitfunc.py
Outdated
@@ -111,6 +125,10 @@ def __init__(self, | |||
self.literals = emitter.context.literals | |||
self.rare = False | |||
self.next_block: Optional[BasicBlock] = None | |||
# If not None, the following op is a branch (to allow op merging) | |||
self.next_branch: Optional[Branch] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that if we're going to go down the route of having the emitter visitor do peephole opts, we should implement it for the general case instead this special case.
One approach that follows this in relying on state in the visitor would be:
- Stash the current block's ops in a
block_ops
attribute or something - And the next loop index into
block_idx
Then the top level emit loop just usesvisitor.block_idx
as its loop counter and doesn't need any knowledge of when optimizations trigger, and the peephole code in the emitter just looks atself.block_ops[self.block_idx]
directly and then increments it if the optimization fires.
Another similar approach would be to do that but to have the visitor return Optional[int]
to indicate skipping instructions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I followed your idea and I think that it looks cleaner now.
The new output looks so much nicer. |
The Windows failures don't seem specific to this PR, since everything seems to be failing right now (#12460), so I will go ahead and merge this. |
Originally
c.b
generated code like this:Now we generate code like this:
The implementation merges consecutive
GetAttr
andBranch
ops.The main benefit is that this makes the generated code smaller and easier to read,
making it easier to spot possible improvements in the code.
Benchmark results seem to be roughly the same as before, since the original
code could be optimized well by C compilers. I tried richards, deltablue and hexiom
benchmarks using clang and gcc, and the performance delta is from -2% to 2%,
and the differences look pretty random.