-
-
Notifications
You must be signed in to change notification settings - Fork 32k
gh-106149: Simplify stack depth calculation. Replace asserts by exceptions. #107255
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think converting asserts to exceptions is generally good, cleaning up the unused code_flags
argument to OptimizeCodeUnit is clearly good, (these two could have been done independently for somewhat simpler review?) and I think it's also good to move the stackdepth check earlier (so as to eliminate the special case for async funcs), but I am not sure what the rationale is for moving the stackdepth check inside OptimizeCodeUnit
. It seems surprising to me, because stackdepth checking is not optimization and is not required for any other part of optimization (even in this PR it happens last), so it feels more like "eh, may as well stash this here for an ounce of convenience" than a logical organization.
It also requires smuggling the stackdepth through a new field in CfgBuilder
struct, which is not terrible, but it leads to making the PyCfg_Stackdepth
API oddly redundant and non-obviously destructive (see inline comment.) And I just don't see what is gained.
Python/flowgraph.c
Outdated
{ | ||
g->g_maxdepth = -1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is surprising that _PyCfg_Stackdepth
would both mutate the cfg builder and return the stack depth. I think it should do only one or the other, and if the choice is "mutate" then it may also warrant a different name (_PyCfg_SetStackDepth
).
I take your point about this not being related to optimization. The reason I wanted to move it there was so that it can be local to flowgraph.c. But then I found that I can't do that anyway because of the unit test function (_PyCompile_Assemble). So I might as well move the call back to compile.c. |
…evious stackdepth) (no it doesn't)" This reverts commit 1cea1e4.
I would still like to (in a future PR) move more of the code that manipulates a CFG to flowgraph.c. We could have a function that that does more than just optimize (so it calls _PyCfg_OptimizeCodeUnit and then does the rest of the stuff up to converting to an instr list). I'll think about it. |
Yeah, sorry. I discovered these issues while working on the main part. Let me know if it's still useful to split them out. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
int maxdepth = 0; | ||
basicblock **sp = stack; | ||
if (code_flags & (CO_GENERATOR | CO_COROUTINE | CO_ASYNC_GENERATOR)) { | ||
stackdepth_push(&sp, entryblock, 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused about this. Why is it 1 if the RETURN_GENERATOR+POP_TOP add up to 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it's somehow related to the fact that stack_effect() returns 0 for RETURN_GENERATOR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added some prints to the old and new versions, to see how the stackdepth calculation proceeds. So here is the output for def f(): yield 12
.
Old:
1 [0] opcode = 75 depth = 1 effect = 0 new_depth = 1 maxdepth = 1
1 [1] opcode = 1 depth = 1 effect = -1 new_depth = 0 maxdepth = 1
1 [2] opcode = 257 depth = 0 effect = 0 new_depth = 0 maxdepth = 1
2 [2] opcode = 257 depth = 0 effect = 2 target_depth = 2 maxdepth = 2
1 [3] opcode = 151 depth = 0 effect = 0 new_depth = 0 maxdepth = 2
1 [4] opcode = 100 depth = 0 effect = 1 new_depth = 1 maxdepth = 2
1 [5] opcode = 150 depth = 1 effect = 0 new_depth = 1 maxdepth = 2
1 [6] opcode = 151 depth = 1 effect = 0 new_depth = 1 maxdepth = 2
1 [7] opcode = 1 depth = 1 effect = -1 new_depth = 0 maxdepth = 2
1 [8] opcode = 121 depth = 0 effect = 0 new_depth = 0 maxdepth = 2
1 [0] opcode = 173 depth = 2 effect = 0 new_depth = 2 maxdepth = 2
1 [1] opcode = 119 depth = 2 effect = -1 new_depth = 1 maxdepth = 2
1 [0] opcode = 151 depth = 0 effect = 0 new_depth = 0 maxdepth = 0
1 [1] opcode = 100 depth = 0 effect = 1 new_depth = 1 maxdepth = 1
1 [2] opcode = 24 depth = 1 effect = 0 new_depth = 1 maxdepth = 1
1 [3] opcode = 90 depth = 1 effect = -1 new_depth = 0 maxdepth = 1
1 [4] opcode = 121 depth = 0 effect = 0 new_depth = 0 maxdepth = 1
New:
1 [0] opcode = 257 depth = 0 effect = 0 new_depth = 0 maxdepth = 0
2 [0] opcode = 257 depth = 0 effect = 2 target_depth = 2 maxdepth = 2
1 [1] opcode = 151 depth = 0 effect = 0 new_depth = 0 maxdepth = 2
1 [2] opcode = 100 depth = 0 effect = 1 new_depth = 1 maxdepth = 2
1 [3] opcode = 150 depth = 1 effect = 0 new_depth = 1 maxdepth = 2
1 [4] opcode = 151 depth = 1 effect = 0 new_depth = 1 maxdepth = 2
1 [5] opcode = 1 depth = 1 effect = -1 new_depth = 0 maxdepth = 2
1 [6] opcode = 121 depth = 0 effect = 0 new_depth = 0 maxdepth = 2
1 [0] opcode = 173 depth = 2 effect = 0 new_depth = 2 maxdepth = 2
1 [1] opcode = 119 depth = 2 effect = -1 new_depth = 1 maxdepth = 2
1 [0] opcode = 151 depth = 0 effect = 0 new_depth = 0 maxdepth = 0
1 [1] opcode = 100 depth = 0 effect = 1 new_depth = 1 maxdepth = 1
1 [2] opcode = 24 depth = 1 effect = 0 new_depth = 1 maxdepth = 1
1 [3] opcode = 90 depth = 1 effect = -1 new_depth = 0 maxdepth = 1
1 [4] opcode = 121 depth = 0 effect = 0 new_depth = 0 maxdepth = 1
So the old code looks at those two instructions which are now added after the stackdepth calculation. And indeed stackdepth(RETURN_GENERATOR) returns 0, so the additional 1 in the initialisation compensated for that. So the old code is a messy hack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wondered the same thing! Thanks for tracking this down. Separately, is there any reason not to fix num_pushed metadata for RETURN_GENERATOR
? I guess with this PR it now hardly matters what that metadata says for it...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that the stack effect is derived from the bytecodes.c file, from the signature of the bytecode. If we add stuff to the signature the code generator will do other stuff with it.
* It should always be true, because at least one expression is | ||
* required to turn a function into a generator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this explanation is accurate? This function is a generator:
async def f():
return
Here is the disassembly of it:
>>> dis.dis(f)
1 0 RETURN_GENERATOR
2 POP_TOP
4 RESUME 0
2 6 RETURN_CONST 0 (None)
>> 8 CALL_INTRINSIC_1 3 (INTRINSIC_STOPITERATION_ERROR)
10 RERAISE 1
ExceptionTable:
4 to 6 -> 8 [0] lasti
I think if RETURN_CONST
didn't exist, it would be enough to say "all functions must return a value, which always must go on the stack before it is returned." But with RETURN_CONST
that's not true either. It really does seem like the StopIteration
handler is what we are relying on here.
Move the stackdepth calculation to the optimizer, where it occurs before the RETURN_GENERATOR instruction is inserted (so it doesn't need to worry about its stack_effect value being wrong in the metadata). This makes fewer function need to check the code_flags and special-case generators.
Also replaces assertions by exceptions, because these error cases can now show up as unit test inputs. Adds tests to make them fail, and updates tests that would otherwise fail now because they hit the assertions following the stackdepth change.