10000 GH-130415: Narrow `str` to `""` based on boolean tests by fluhus · Pull Request #130476 · python/cpython · GitHub
[go: up one dir, main page]

Skip to content

GH-130415: Narrow str to "" based on boolean tests #130476

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Mar 4, 2025
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
Add optimization path to _TO_BOOL_STR
  • Loading branch information
fluhus committed Mar 3, 2025
commit fcf411621544952cb4224e5eb81b4f41de6f97cd
13 changes: 13 additions & 0 deletions Python/optimizer_bytecodes.c
Original file line number Diff line number Diff line change
Expand Up @@ -429,6 +429,19 @@ dummy_func(void) {
res = sym_new_type(ctx, &PyBool_Type);
sym_set_type(value, &PyUnicode_Type);
}
if (!sym_is_const(value)) {
assert(sym_matches_type(value, &PyUnicode_Type));
int next_opcode = (this_instr + 1)->opcode;
assert(next_opcode == _CHECK_VALIDITY_AND_SET_IP);
next_opcode = (this_instr + 2)->opcode;
// If the next uop is a guard, we can narrow value. However, we
// *can't* narrow res, since that would cause the guard to be
// removed and the narrowed value to be invalid:
if (next_opcode == _GUARD_IS_FALSE_POP) {
sym_set_const(value, Py_GetConstant(Py_CONSTANT_EMPTY_STR));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is strictly incorrect. We don't know that value is "" until after the _GUARD_IS_FALSE_POP.
The reason that matters is that when we start attaching type information to side exits, as we probably will in 3.15, then this could lead us to infer that value is "" on both branches. Which would be wrong.

There are two possible fixes for this.

  • Combine TO_BOOL_STR and _GUARD_IS_FALSE_POP/_GUARD_IS_TRUE_POP into a single (super)instruction, then optimize that.
  • Annotate the bool value resulting from the TO_BOOL with its input, then in _GUARD_IS_FALSE_POP convert the input value to TO_BOOL.

I prefer the second option, although it may be more work, as it is more flexible and can be extended more easily.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, @Fidget-Spinner and I suggested something like the latter on the issue (new symbols like JitBoolOf(JitOptSymbol *source, bool inverted) and JitEqualTo(JitOptSymbol *lhs, JitOptSymbol *rhs, bool inverted)). That's probably the direction we're headed in longer term.

However, I don't think we should let perfect be the enemy of good here. We have nice, working optimizations in these PRs; just because we might sink info onto side exits in the future probably shouldn't prevent us from making changes like this now for 3.14, which are perfectly correct for the current optimizer (which doesn't sink anything).

I'm inclined to land these changes and other similar ones for ==/!= now, and make the symbolic representation of derived boolean values more complex later as an improvement (it will also be able to handle more uncommon cases like x = y == 42; if x: ...). I'm really worried that if we try to "future-proof" optimizations based on what we could do six months from now, it will prevent actual improvements in the near term.

But I'll defer to you here. If having value be narrowed one uop too early in the instruction stream is enough to block this PR, I can work with these new contributors on the more complex solution. But as-is, this has no bugs and works as intended. We don't sink value info onto side exits, so it's correct.

res = sym_new_type(ctx, &PyUnicode_Type);
}
}
}

op(_UNARY_NOT, (value -- res)) {
Expand Down
14 changes: 14 additions & 0 deletions Python/optimizer_cases.c.h

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

0