Better uop coverage in the JIT optimizer · Issue #131798 · python/cpython · GitHub

Better uop coverage in the JIT optimizer #131798


Open
brandtbucher opened this issue Mar 27, 2025 · 22 comments
Labels: interpreter-core (Objects, Python, Grammar, and Parser dirs), performance (Performance or resource usage), topic-JIT, type-feature (A feature request or enhancement)

Comments

@brandtbucher
Member
brandtbucher commented Mar 27, 2025

Out of 263 total uops, 155 of these are ignored by the tier two optimizer. These represent over half of all uops by dynamic execution count.

This issue will serve as a checklist for auditing these missing uops and adding them where they make sense. At first glance, there's quite a bit of potential here, especially around the ability to narrow known output types (like _CONTAINS_OP_SET) and the ability to narrow and remove guards on input types (like _BINARY_OP_SUBSCR_LIST_INT). As I go through them, I'll cross out anything that doesn't seem to make sense to add.
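To make the two ideas above concrete, here is a minimal Python sketch of a toy optimizer pass. It is not CPython's optimizer: the `Sym` class and `optimize` function are hypothetical stand-ins, and only two uops from the lists in this issue are modeled. It shows a repeated `_GUARD_NOS_INT` being dropped once the input type is known, and the output of `_CONTAINS_OP_SET` being recorded as a `bool`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sym:
    """What the toy optimizer knows about one stack slot (hypothetical)."""
    known_type: Optional[type] = None  # None means "unknown"


def optimize(trace, stack):
    """Drop guards that are already proven; record types that uops produce.

    trace: list of uop names. stack: list of Sym, top of stack last.
    Only two uops are modeled; everything else passes through untouched.
    """
    out = []
    for uop in trace:
        if uop == "_GUARD_NOS_INT":
            nos = stack[-2]  # "next on stack": the slot below the top
            if nos.known_type is int:
                continue  # redundant guard: remove it from the trace
            nos.known_type = int  # guard passed, so the input is narrowed
        elif uop == "_CONTAINS_OP_SET":
            # Consumes two inputs and pushes a result that is always a bool.
            del stack[-2:]
            stack.append(Sym(bool))
        out.append(uop)
    return out, stack


new_trace, final_stack = optimize(
    ["_GUARD_NOS_INT", "_GUARD_NOS_INT", "_CONTAINS_OP_SET"],
    [Sym(), Sym()],
)
print(new_trace)                   # ['_GUARD_NOS_INT', '_CONTAINS_OP_SET']
print(final_stack[-1].known_type)  # <class 'bool'>
```

A uop that the optimizer ignores can do neither of these things, which is why coverage matters even for uops that cannot themselves be removed.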

First, here are the 53 missing uops that each represent at least 0.1% of all uops executed:

  • _SET_IP (12.1%)
  • _CHECK_VALIDITY (10.1%)
  • _CHECK_VALIDITY_AND_SET_IP (6.5%)
  • _CHECK_PERIODIC (3.1%)
  • _MAKE_WARM (2.8%)
  • _START_EXECUTOR (1.7%)
  • _GUARD_NOS_INT (1.5%)
  • _BINARY_OP_SUBSCR_LIST_INT (1.0%)
  • _CHECK_FUNCTION (1.0%)
  • _CHECK_MANAGED_OBJECT_HAS_VALUES (0.7%)
  • _ITER_CHECK_LIST (0.7%)
  • _CONTAINS_OP_SET (0.6%)
  • _FOR_ITER_TIER_TWO (0.6%)
  • _GUARD_NOT_EXHAUSTED_LIST (0.6%)
  • _ITER_NEXT_LIST_TIER_TWO (0.6%)
  • _SAVE_RETURN_OFFSET (0.6%)
  • _CALL_LEN (0.5%)
  • _CALL_LIST_APPEND (0.5%)
  • _POP_TOP (0.5%)
  • _RESUME_CHECK (0.5%)
  • _BINARY_OP_SUBSCR_STR_INT (0.4%)
  • _GUARD_DORV_VALUES_INST_ATTR_FROM_DICT (0.4%)
  • _GUARD_KEYS_VERSION (0.4%)
  • _BINARY_OP_SUBSCR_DICT (0.3%)
  • _CALL_BUILTIN_FAST (0.3%)
  • _CHECK_STACK_SPACE_OPERAND (0.3%)
  • _GET_ITER (0.3%)
  • _STORE_SUBSCR (0.3%)
  • _GUARD_NOT_EXHAUSTED_RANGE (0.2%)
  • _BINARY_SLICE (0.2%)
  • _BUILD_LIST (0.2%)
  • _CALL_BUILTIN_O (0.2%)
  • _CALL_NON_PY_GENERAL (0.2%)
  • _CHECK_IS_NOT_PY_CALLABLE (0.2%)
  • _GUARD_NOS_FLOAT (0.2%)
  • _ITER_CHECK_RANGE (0.2%)
  • _ITER_CHECK_TUPLE (0.2%)
  • _LOAD_DEREF (0.2%)
  • _STORE_SUBSCR_LIST_INT (0.2%)
  • _BINARY_OP_EXTEND (0.1%)
  • _CALL_ISINSTANCE (0.1%)
  • _CALL_METHOD_DESCRIPTOR_FAST (0.1%)
  • _CALL_METHOD_DESCRIPTOR_FAST_WITH_KEYWORDS (0.1%)
  • _CALL_METHOD_DESCRIPTOR_NOARGS (0.1%)
  • _CALL_TYPE_1 (0.1%)
  • _CHECK_ATTR_CLASS (0.1%)
  • _CONTAINS_OP_DICT (0.1%)
  • _GUARD_BINARY_OP_EXTEND (0.1%)
  • _GUARD_NOT_EXHAUSTED_TUPLE (0.1%)
  • _ITER_NEXT_TUPLE (0.1%)
  • _LIST_APPEND (0.1%)
  • _STORE_ATTR_SLOT (0.1%)
  • _STORE_SUBSCR_DICT (0.1%)
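As a quick sanity check on the numbers above, the following Python sketch tallies the listed shares. The bucket counts are transcribed from the list; since each share is rounded to one decimal place, the total is only approximate.

```python
# Share (in tenths of a percent) -> number of uops listed above at that share.
# Integers keep the arithmetic exact.
BUCKETS = {121: 1, 101: 1, 65: 1, 31: 1, 28: 1, 17: 1, 15: 1,
           10: 2, 7: 2, 6: 5, 5: 4, 4: 3, 3: 5, 2: 11, 1: 14}

listed = sum(BUCKETS.values())
dynamic_tenths = sum(share * n for share, n in BUCKETS.items())
static_share = 155 / 263  # missing uops out of all uops, by count

print(listed)                        # 53
print(dynamic_tenths / 10)           # 52.5
print(round(100 * static_share, 1))  # 58.9
```

The 53 listed uops alone add up to roughly 52.5% of dynamic execution, consistent with the "over half" figure, while the 155 missing uops are about 58.9% of the 263 uops by count.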

And here are the 102 missing uops that each represent less than 0.1%. These are less important, but may still net us some wins on individual benchmarks:

  • _BINARY_OP_SUBSCR_CHECK_FUNC
  • _BINARY_OP_SUBSCR_TUPLE_INT
  • _BUILD_MAP
  • _BUILD_SET
  • _BUILD_SLICE
  • _BUILD_STRING
  • _CALL_BUILTIN_CLASS
  • _CALL_BUILTIN_FAST_WITH_KEYWORDS
  • _CALL_INTRINSIC_1
  • _CALL_INTRINSIC_2
  • _CALL_KW_NON_PY
  • _CALL_METHOD_DESCRIPTOR_O
  • _CALL_STR_1
  • _CALL_TUPLE_1
  • _CHECK_ATTR_METHOD_LAZY_DICT
  • _CHECK_EG_MATCH
  • _CHECK_EXC_MATCH
  • _CHECK_FUNCTION_VERSION_INLINE
  • _CHECK_FUNCTION_VERSION_KW
  • _CHECK_IS_NOT_PY_CALLABLE_KW
  • _CHECK_METHOD_VERSION
  • _CHECK_METHOD_VERSION_KW
  • _CHECK_PERIODIC_IF_NOT_YIELD_FROM
  • _CONVERT_VALUE
  • _COPY_FREE_VARS
  • _DELETE_ATTR
  • _DELETE_DEREF
  • _DELETE_FAST
  • _DELETE_GLOBAL
  • _DELETE_NAME
  • _DELETE_SUBSCR
  • _DEOPT
  • _DICT_MERGE
  • _DICT_UPDATE
  • _END_FOR
  • _END_SEND
  • _ERROR_POP_N
  • _EXIT_INIT_CHECK
  • _EXPAND_METHOD
  • _EXPAND_METHOD_KW
  • _FATAL_ERROR
  • _FORMAT_SIMPLE
  • _FORMAT_WITH_SPEC
  • _GET_AITER
  • _GET_ANEXT
  • _GET_AWAITABLE
  • _GET_LEN
  • _GET_YIELD_FROM_ITER
  • _GUARD_DORV_NO_DICT
  • _GUARD_GLOBALS_VERSION
  • _GUARD_TOS_FLOAT
  • _GUARD_TOS_INT
  • _GUARD_TYPE_VERSION_AND_LOCK
  • _IMPORT_FROM
  • _IMPORT_NAME
  • _IS_NONE
  • _LIST_EXTEND
  • _LOAD_ATTR_NONDESCRIPTOR_NO_DICT
  • _LOAD_ATTR_NONDESCRIPTOR_WITH_VALUES
  • _LOAD_BUILD_CLASS
  • _LOAD_COMMON_CONSTANT
  • _LOAD_FAST_LOAD_FAST
  • _LOAD_FROM_DICT_OR_DEREF
  • _LOAD_GLOBAL
  • _LOAD_GLOBAL_BUILTINS
  • _LOAD_GLOBAL_MODULE
  • _LOAD_LOCALS
  • _LOAD_NAME
  • _LOAD_SUPER_ATTR_ATTR
  • _LOAD_SUPER_ATTR_METHOD
  • _MAKE_CALLARGS_A_TUPLE
  • _MAKE_CELL
  • _MAKE_FUNCTION
  • _MAP_ADD
  • _MATCH_CLASS
  • _MATCH_KEYS
  • _MATCH_MAPPING
  • _MATCH_SEQUENCE
  • _MAYBE_EXPAND_METHOD_KW
  • _NOP
  • _POP_EXCEPT
  • _POP_TWO_LOAD_CONST_INLINE_BORROW
  • _PUSH_EXC_INFO
  • _PUSH_NULL_CONDITIONAL
  • _SETUP_ANNOTATIONS
  • _SET_ADD
  • _SET_FUNCTION_ATTRIBUTE
  • _SET_UPDATE
  • _STORE_ATTR
  • _STORE_ATTR_INSTANCE_VALUE
  • _STORE_ATTR_WITH_HINT
  • _STORE_DEREF
  • _STORE_FAST_LOAD_FAST
  • _STORE_FAST_STORE_FAST
  • _STORE_GLOBAL
  • _STORE_NAME
  • _STORE_SLICE
  • _TIER2_RESUME_CHECK
  • _UNARY_INVERT
  • _UNARY_NEGATIVE
  • _UNPACK_SEQUENCE_LIST
  • _WITH_EXCEPT_START


@brandtbucher
Member Author
brandtbucher commented Apr 1, 2025

@diegorusso is going to add _CALL_LEN.

@brandtbucher
Member Author
brandtbucher commented Apr 3, 2025

@fluhus is going to add _BINARY_SLICE.

@brandtbucher
Member Author

@Klaus117 is going to improve _TO_BOOL_INT.

@brandtbucher
Member Author

@Zheaoli is going to add _CONTAINS_OP_DICT.

Zheaoli added a commit to Zheaoli/cpython that referenced this issue Apr 8, 2025
…CONTAINS_OP_DICT

Signed-off-by: Manjusaka <me@manjusaka.me>
brandtbucher pushed a commit that referenced this issue Apr 8, 2025
@Zheaoli
Contributor
Zheaoli commented Apr 8, 2025

I think I can work on _BINARY_OP_SUBSCR_LIST_INT and _BINARY_OP_SUBSCR_DICT

@brandtbucher
Member Author

> I think I can work on _BINARY_OP_SUBSCR_LIST_INT and _BINARY_OP_SUBSCR_DICT

Sorry, I already have a branch that does the guards for these (and a couple of others) that I was going to put up in a minute! I'll tag you for review though.

@brandtbucher
Member Author

@tomasr8 is going to add _CALL_STR_1, _CALL_TUPLE_1, and _CALL_TYPE_1.

@Zheaoli, want to take _BUILD_LIST, _BUILD_MAP, _BUILD_SET, _BUILD_SLICE, and _BUILD_STRING? For all but _BUILD_SLICE and _BUILD_STRING, I think we can only set the type of the output (sym_new_type). For _BUILD_SLICE and _BUILD_STRING, we may be able to have a constant output if the items are constant (sym_is_const/sym_get_const/sym_new_const). Maybe one PR for the first three, and separate PRs for _BUILD_SLICE and _BUILD_STRING?
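The difference between "type-only" and "constant" outputs can be sketched in Python as follows. This is a toy model, not the optimizer's code: the helper names are borrowed from the comment above, but the real helpers are C functions with different signatures, and the `Sym` class, `build_list`, and `build_string` are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Optional

_NO_CONST = object()  # sentinel: "no constant value known"


@dataclass(frozen=True)
class Sym:
    """Toy symbolic value: a known type and, optionally, a known constant."""
    typ: Optional[type] = None
    const: Any = _NO_CONST


# Toy stand-ins named after the helpers mentioned in the comment above.
def sym_new_type(typ):
    return Sym(typ=typ)


def sym_new_const(value):
    return Sym(typ=type(value), const=value)


def sym_is_const(sym):
    return sym.const is not _NO_CONST


def sym_get_const(sym):
    return sym.const


def build_list(items):
    # A fresh mutable list is created on every execution, so only the
    # type of the output can be known ahead of time.
    return sym_new_type(list)


def build_string(pieces):
    # If every piece is a known constant, the result is a known constant.
    if all(sym_is_const(p) for p in pieces):
        return sym_new_const("".join(sym_get_const(p) for p in pieces))
    return sym_new_type(str)


print(build_list([sym_new_const(1)]))                          # type only
print(build_string([sym_new_const("a"), sym_new_const("b")]))  # constant "ab"
print(build_string([sym_new_const("a"), sym_new_type(str)]))   # type only
```

The same reasoning would apply to the slice case: a slice whose start, stop, and step are all constant could be a constant, while lists, maps, and sets are new mutable objects each time.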

@brandtbucher
Member Author

@Zheaoli is going to add _GET_LEN.

@brandtbucher
Member Author

@tomasr8 is going to add _CALL_ISINSTANCE.

@brandtbucher
Member Author

@diegorusso is going to add _CALL_LIST_APPEND.

Zheaoli added a commit to Zheaoli/cpython that referenced this issue May 8, 2025
Signed-off-by: Manjusaka <me@manjusaka.me>
@Zheaoli
Contributor
Zheaoli commented May 20, 2025

@brandtbucher I'm trying to cover more uops in the optimizer, but I'm not sure how to prioritize the remaining ones. Would you mind giving me some suggestions?

Zheaoli added a commit to Zheaoli/cpython that referenced this issue May 21, 2025
`_LOAD_CONST_INLINE_BORROW`

Signed-off-by: Manjusaka <me@manjusaka.me>

add news

Signed-off-by: Manjusaka <me@manjusaka.me>

add tests

Signed-off-by: Manjusaka <me@manjusaka.me>

fix review idea

Signed-off-by: Manjusaka <me@manjusaka.me>
@brandtbucher
Member Author

Sure! I think we can turn _CHECK_METHOD_VERSION into _CHECK_FUNCTION_VERSION_INLINE. Check out the optimizer case for _CHECK_FUNCTION_VERSION to see how it's done. This would work the same way, except you would check for &PyMethod_Type, and use the function from ((PyMethodObject *)callable)->im_func as the operand1.

Let me know if you have any questions.
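A rough Python model of the rewrite described above might look like this. It is only a sketch under stated assumptions: `rewrite_check_method_version` and `get_version` are hypothetical names, the uop is represented as a plain tuple, and the Python-level `__func__` attribute stands in for `im_func`. The actual change would be written in the optimizer's C-based case for the uop.

```python
import types


def rewrite_check_method_version(callable_const, expected_version, get_version):
    """Return a (uop, operand0, operand1) triple for the toy trace.

    callable_const: the callable if the optimizer knows it, else None.
    get_version: hypothetical lookup from a function to its version number.
    """
    if isinstance(callable_const, types.MethodType):
        func = callable_const.__func__  # Python-level view of im_func
        if get_version(func) == expected_version:
            # The bound method is known, so check the underlying function
            # directly and carry it along as the second operand.
            return ("_CHECK_FUNCTION_VERSION_INLINE", expected_version, func)
    # Nothing is known (or the version differs): keep the original uop.
    return ("_CHECK_METHOD_VERSION", expected_version, None)


class C:
    def m(self):
        return 42


versions = {C.m: 7}  # made-up version number, for illustration only
bound = C().m

print(rewrite_check_method_version(bound, 7, versions.get)[0])
print(rewrite_check_method_version(None, 7, versions.get)[0])
```

With a known bound method and a matching version, the first call yields `_CHECK_FUNCTION_VERSION_INLINE`; with no known callable, the second call leaves `_CHECK_METHOD_VERSION` in place.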

@Zheaoli
Contributor
Zheaoli commented May 22, 2025

> Sure! I think we can turn _CHECK_METHOD_VERSION into _CHECK_FUNCTION_VERSION_INLINE. Check out the optimizer case for _CHECK_FUNCTION_VERSION to see how it's done. This would work the same way, except you would check for &PyMethod_Type, and use the function from ((PyMethodObject *)callable)->im_func as the operand1.
>
> Let me know if you have any questions.

Thanks for the suggestion! I might need a little bit of time, but I think I can draft a PR this week.

6 participants