gh-106149: Simplify stack depth calculation. Replace asserts by exceptions. #107255

Merged: 6 commits, Jul 26, 2023

Conversation

iritkatriel (Member) commented Jul 25, 2023

Move the stackdepth calculation to the optimizer, where it occurs before the RETURN_GENERATOR instruction is inserted (so it doesn't need to worry about that instruction's stack_effect value being wrong in the metadata). This means fewer functions need to check the code_flags and special-case generators.

Also replaces assertions with exceptions, because these error cases can now show up as unit test inputs. Adds tests that exercise the failure cases, and updates tests that would otherwise fail now because they hit the assertions following the stackdepth change.
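
As an illustrative aside (not part of the PR), both quantities under discussion can be observed from Python itself, assuming a recent CPython (3.11+) where dis.opmap contains RETURN_GENERATOR:

```python
import dis

# Compile a tiny generator; the compiler's stackdepth calculation ends up
# in the resulting code object's co_stacksize.
module = compile("def f():\n    yield 12\n", "<example>", "exec")
gen_code = module.co_consts[0]  # assumes f's code object is the first constant
print(gen_code.co_stacksize)

# The metadata-reported stack effect of RETURN_GENERATOR is 0, even though the
# instruction conceptually leaves the new generator object on the stack.
print(dis.stack_effect(dis.opmap["RETURN_GENERATOR"]))
```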

carljm (Member) left a comment


I think converting asserts to exceptions is generally good, and cleaning up the unused code_flags argument to OptimizeCodeUnit is clearly good (these two could have been done independently for a somewhat simpler review?). I also think it's good to move the stackdepth check earlier (so as to eliminate the special case for async funcs), but I am not sure what the rationale is for moving the stackdepth check inside OptimizeCodeUnit. It seems surprising to me, because stackdepth checking is not optimization and is not required for any other part of optimization (even in this PR it happens last), so it feels more like "eh, may as well stash this here for an ounce of convenience" than a logical organization.

It also requires smuggling the stackdepth through a new field in CfgBuilder struct, which is not terrible, but it leads to making the PyCfg_Stackdepth API oddly redundant and non-obviously destructive (see inline comment.) And I just don't see what is gained.

{
    g->g_maxdepth = -1;
Member
I think it is surprising that _PyCfg_Stackdepth would both mutate the cfg builder and return the stack depth. I think it should do only one or the other, and if the choice is "mutate" then it may also warrant a different name (_PyCfg_SetStackDepth).

iritkatriel (Member Author)

I take your point about this not being related to optimization. The reason I wanted to move it there was so that it can be local to flowgraph.c. But then I found that I can't do that anyway because of the unit test function (_PyCompile_Assemble). So I might as well move the call back to compile.c.

iritkatriel (Member Author)

I would still like to (in a future PR) move more of the code that manipulates a CFG to flowgraph.c. We could have a function that does more than just optimize (so it calls _PyCfg_OptimizeCodeUnit and then does the rest of the work up to converting to an instr list). I'll think about it.

iritkatriel (Member Author)

> (these two could have been done independently for somewhat simpler review?)

Yeah, sorry. I discovered these issues while working on the main part. Let me know if it's still useful to split them out.

carljm (Member) left a comment

Looks good!

int maxdepth = 0;
basicblock **sp = stack;
if (code_flags & (CO_GENERATOR | CO_COROUTINE | CO_ASYNC_GENERATOR)) {
    stackdepth_push(&sp, entryblock, 1);
iritkatriel (Member Author)

I'm confused about this. Why is it 1 if the RETURN_GENERATOR+POP_TOP add up to 0?

iritkatriel (Member Author)

I wonder if it's somehow related to the fact that stack_effect() returns 0 for RETURN_GENERATOR?

iritkatriel (Member Author)

I added some prints to the old and new versions, to see how the stackdepth calculation proceeds. So here is the output for def f(): yield 12.

Old:

1 [0]  opcode = 75  depth = 1 effect = 0  new_depth = 1  maxdepth = 1
1 [1]  opcode = 1  depth = 1 effect = -1  new_depth = 0  maxdepth = 1
1 [2]  opcode = 257  depth = 0 effect = 0  new_depth = 0  maxdepth = 1
2 [2]  opcode = 257  depth = 0 effect = 2  target_depth = 2  maxdepth = 2
1 [3]  opcode = 151  depth = 0 effect = 0  new_depth = 0  maxdepth = 2
1 [4]  opcode = 100  depth = 0 effect = 1  new_depth = 1  maxdepth = 2
1 [5]  opcode = 150  depth = 1 effect = 0  new_depth = 1  maxdepth = 2
1 [6]  opcode = 151  depth = 1 effect = 0  new_depth = 1  maxdepth = 2
1 [7]  opcode = 1  depth = 1 effect = -1  new_depth = 0  maxdepth = 2
1 [8]  opcode = 121  depth = 0 effect = 0  new_depth = 0  maxdepth = 2
1 [0]  opcode = 173  depth = 2 effect = 0  new_depth = 2  maxdepth = 2
1 [1]  opcode = 119  depth = 2 effect = -1  new_depth = 1  maxdepth = 2
1 [0]  opcode = 151  depth = 0 effect = 0  new_depth = 0  maxdepth = 0
1 [1]  opcode = 100  depth = 0 effect = 1  new_depth = 1  maxdepth = 1
1 [2]  opcode = 24  depth = 1 effect = 0  new_depth = 1  maxdepth = 1
1 [3]  opcode = 90  depth = 1 effect = -1  new_depth = 0  maxdepth = 1
1 [4]  opcode = 121  depth = 0 effect = 0  new_depth = 0  maxdepth = 1

New:

1 [0]  opcode = 257  depth = 0 effect = 0  new_depth = 0  maxdepth = 0
2 [0]  opcode = 257  depth = 0 effect = 2  target_depth = 2  maxdepth = 2
1 [1]  opcode = 151  depth = 0 effect = 0  new_depth = 0  maxdepth = 2
1 [2]  opcode = 100  depth = 0 effect = 1  new_depth = 1  maxdepth = 2
1 [3]  opcode = 150  depth = 1 effect = 0  new_depth = 1  maxdepth = 2
1 [4]  opcode = 151  depth = 1 effect = 0  new_depth = 1  maxdepth = 2
1 [5]  opcode = 1  depth = 1 effect = -1  new_depth = 0  maxdepth = 2
1 [6]  opcode = 121  depth = 0 effect = 0  new_depth = 0  maxdepth = 2
1 [0]  opcode = 173  depth = 2 effect = 0  new_depth = 2  maxdepth = 2
1 [1]  opcode = 119  depth = 2 effect = -1  new_depth = 1  maxdepth = 2
1 [0]  opcode = 151  depth = 0 effect = 0  new_depth = 0  maxdepth = 0
1 [1]  opcode = 100  depth = 0 effect = 1  new_depth = 1  maxdepth = 1
1 [2]  opcode = 24  depth = 1 effect = 0  new_depth = 1  maxdepth = 1
1 [3]  opcode = 90  depth = 1 effect = -1  new_depth = 0  maxdepth = 1
1 [4]  opcode = 121  depth = 0 effect = 0  new_depth = 0  maxdepth = 1

So the old code walked those two instructions, which are now added after the stackdepth calculation. And indeed stack_effect(RETURN_GENERATOR) returns 0, so the additional 1 in the initialisation compensated for that. So the old code is a messy hack.
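
As a quick cross-check (an illustrative sketch, not code from the PR), the RETURN_GENERATOR/POP_TOP prelude that the old calculation compensated for is visible directly in the disassembly on 3.11+:

```python
import dis

def f():
    yield 12

# The generator prelude (RETURN_GENERATOR, POP_TOP, RESUME) appears before the
# body. The metadata effect of RETURN_GENERATOR is 0, which is why the old
# calculation started the entry block at depth 1 instead of 0.
dis.dis(f)
```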

Member

I wondered the same thing! Thanks for tracking this down. Separately, is there any reason not to fix num_pushed metadata for RETURN_GENERATOR? I guess with this PR it now hardly matters what that metadata says for it...

iritkatriel (Member Author)

The problem is that the stack effect is derived from the bytecodes.c file, from the signature of the bytecode. If we add stuff to the signature the code generator will do other stuff with it.

iritkatriel merged commit b0202a4 into python:main on Jul 26, 2023
Comment on lines +7738 to +7739
* It should always be true, because at least one expression is
* required to turn a function into a generator.
Member

I'm not sure this explanation is accurate? This function is a generator:

async def f():
    return

Here is the disassembly of it:

>>> dis.dis(f)
  1           0 RETURN_GENERATOR
              2 POP_TOP
              4 RESUME                   0

  2           6 RETURN_CONST             0 (None)
        >>    8 CALL_INTRINSIC_1         3 (INTRINSIC_STOPITERATION_ERROR)
             10 RERAISE                  1
ExceptionTable:
  4 to 6 -> 8 [0] lasti

I think if RETURN_CONST didn't exist, it would be enough to say "all functions must return a value, which always must go on the stack before it is returned." But with RETURN_CONST that's not true either. It really does seem like the StopIteration handler is what we are relying on here.
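
To make the RETURN_CONST point concrete, here is an illustrative comparison (assuming CPython 3.12, where RETURN_CONST exists; neither function is from the PR):

```python
import dis

async def returns_const():
    return          # compiled to RETURN_CONST None: nothing is pushed first

async def returns_value(x):
    return x + 1    # x and 1 are pushed, BINARY_OP runs, RETURN_VALUE pops the result

dis.dis(returns_const)   # the return itself needs no stack slots
dis.dis(returns_value)   # the addition needs stack depth >= 2 in the body
```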

jtcave pushed a commit to jtcave/cpython that referenced this pull request Jul 27, 2023