diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS
index 70d39f0dc19..66bdde3b7f3 100644
--- a/.github/CODEOWNERS
+++ b/.github/CODEOWNERS
@@ -524,6 +524,7 @@ pep-0665.rst @brettcannon
 # pep-0666.txt
 pep-0667.rst @markshannon
 pep-0668.rst @dstufft
+pep-0669.rst @markshannon
 pep-0670.rst @vstinner @erlend-aasland
 pep-0671.rst @rosuav
 pep-0672.rst @encukou
diff --git a/pep-0669.rst b/pep-0669.rst
new file mode 100644
index 00000000000..45dc9955824
--- /dev/null
+++ b/pep-0669.rst
@@ -0,0 +1,411 @@
+PEP: 669
+Title: Low Impact Monitoring for CPython
+Author: Mark Shannon
+Status: Draft
+Type: Standards Track
+Content-Type: text/x-rst
+Created: 18-Aug-2021
+Post-History: 7-Dec-2021
+
+
+Abstract
+========
+
+Using a profiler or debugger in CPython can have a severe impact on
+performance. Slowdowns by an order of magnitude are common.
+
+This PEP proposes an API for monitoring of Python programs running
+on CPython that will enable monitoring at low cost.
+
+Although this PEP does not specify an implementation, it is expected that
+it will be implemented using the quickening step of PEP 659 [1]_.
+
+Motivation
+==========
+
+Developers should not have to pay an unreasonable cost to use debuggers,
+profilers and other similar tools.
+
+C++ and Java developers expect to be able to run a program at full speed
+(or very close to it) under a debugger.
+Python developers should expect that too.
+
+Rationale
+=========
+
+The quickening mechanism of PEP 659 provides a way to dynamically
+modify executing Python bytecode. These modifications have little cost beyond
+the parts of the code that are modified and a relatively low cost to those
+parts that are modified. We can leverage this to provide an efficient
+mechanism for monitoring that was not possible in 3.10 or earlier.
+
+By using quickening, we expect that code run under a debugger on 3.11
+should easily outperform code run without a debugger on 3.10.
+Profiling will still slow down execution, but by much less than in 3.10.
+
+
+Specification
+=============
+
+Monitoring of Python programs is done by registering callback functions
+for events and by activating a set of events.
+
+Activating events and registering callback functions are independent of each other.
+
+Events
+------
+
+As a code object executes, various events occur that might be of interest
+to tools. By activating events and by registering callback functions,
+tools can respond to these events in any way that suits them.
+Events can be set globally, or for individual code objects.
+
+For 3.11, CPython will support the following events:
+
+* PY_CALL: Call of a Python function (occurs immediately after the call, the callee's frame will be on the stack).
+* PY_RESUME: Resumption of a Python function (for generator and coroutine functions), except for throw() calls.
+* PY_THROW: A Python function is resumed by a throw() call.
+* PY_RETURN: Return from a Python function (occurs immediately before the return, the callee's frame will be on the stack).
+* PY_YIELD: Yield from a Python function (occurs immediately before the yield, the callee's frame will be on the stack).
+* PY_UNWIND: Exit from a Python function during exception unwinding.
+* C_CALL: Call of a builtin function (before the call in this case).
+* C_RETURN: Return from a builtin function (after the return in this case).
+* RAISE: An exception is raised.
+* EXCEPTION_HANDLED: An exception is handled.
+* LINE: An instruction is about to be executed that has a different line number from the preceding instruction.
+* INSTRUCTION: A VM instruction is about to be executed.
+* JUMP: An unconditional jump in the control flow graph is reached.
+* BRANCH: A conditional branch is about to be taken (or not).
+* MARKER: A marker is hit.
+
+More events may be added in the future.
+
+All event codes are integer powers of two and can be bitwise or-ed together to
+activate multiple events.
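As a rough illustration of how power-of-two event codes combine into a single ``int``, the sketch below uses a hypothetical ``Event`` flag type. The PEP does not say where the event constants live or what their numeric values are, so the class name, the chosen members and every value shown are assumptions made for this example only.

```python
import enum


class Event(enum.IntFlag):
    """Hypothetical stand-ins for the PEP's event codes (values are assumed)."""
    PY_CALL = 1 << 0
    PY_RETURN = 1 << 3
    RAISE = 1 << 8
    LINE = 1 << 10


def is_active(event_set: int, event: int) -> bool:
    """Return True if `event` is part of the or-ed `event_set`."""
    return (event_set & event) != 0


# Or-ing codes together builds the kind of int that the proposed
# sys.set_monitoring_events() would accept.
debugger_events = Event.PY_CALL | Event.RAISE
```

Because each code occupies its own bit, testing for membership is a single bitwise ``&``, and adding an event to an existing set never disturbs the others.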
+
+Setting events globally
+-----------------------
+
+Events can be controlled globally by modifying the set of events being monitored:
+
+* ``sys.get_monitoring_events()->int``
+  Returns the ``int`` resulting from bitwise-oring all the active events.
+
+* ``sys.set_monitoring_events(event_set: int)``
+  Activates all events which are set in ``event_set``.
+
+No events are active by default.
+
+Per code object events
+----------------------
+
+Events can also be controlled on a per code object basis:
+
+* ``sys.get_local_monitoring_events(code: CodeType)->int``
+  Returns the ``int`` resulting from bitwise-oring all the local events for ``code``.
+
+* ``sys.set_local_monitoring_events(code: CodeType, event_set: int)``
+  Activates all the local events for ``code`` which are set in ``event_set``.
+
+Local events add to global events, but do not mask them.
+In other words, all global events will trigger for a code object, regardless of the local events.
+
+
+Register callback functions
+---------------------------
+
+To register a callable for events, call::
+
+    sys.register_monitoring_callback(event, func)
+
+Functions can be unregistered by calling
+``sys.register_monitoring_callback(event, None)``.
+
+Callback functions can be registered and unregistered at any time.
+
+Registering a callback function will generate a ``sys.audit`` event.
+
+Callback function arguments
+'''''''''''''''''''''''''''
+
+When an active event occurs, the registered callback function is called.
+Different events will provide the callback function with different arguments, as follows:
+
+* All events starting with ``PY_``:
+
+    ``func(code: CodeType, instruction_offset: int)``
+
+* ``C_CALL`` and ``C_RETURN``:
+
+    ``func(code: CodeType, instruction_offset: int, callable: object)``
+
+* ``RAISE`` and ``EXCEPTION_HANDLED``:
+
+    ``func(code: CodeType, instruction_offset: int, exception: BaseException)``
+
+* ``LINE``:
+
+    ``func(code: CodeType, line_number: int)``
+
+* ``JUMP`` and ``BRANCH``:
+
+    ``func(code: CodeType, instruction_offset: int, destination_offset: int)``
+
+    Note that the ``destination_offset`` is where the code will next execute.
+    For an untaken branch this will be the offset of the instruction following
+    the branch.
+
+* ``INSTRUCTION``:
+
+    ``func(code: CodeType, instruction_offset: int)``
+
+* ``MARKER``:
+
+    ``func(code: CodeType, instruction_offset: int, marker_id: int)``
+
+Inserting and removing markers
+''''''''''''''''''''''''''''''
+
+Two new functions are added to the ``sys`` module to support markers.
+
+* ``sys.insert_marker(code: CodeType, offset: int, marker_id=0: range(256))``
+* ``sys.remove_marker(code: CodeType, offset: int)``
+
+The ``marker_id`` has no meaning to the VM,
+and is used only as an argument to the callback function.
+The ``marker_id`` must be in the range 0 to 255 (inclusive).
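To make the ``MARKER`` callback shape and the ``marker_id`` range concrete, here is a minimal sketch of what a tool-side handler might look like. Nothing here calls the proposed ``sys`` functions, which exist only in this PEP: the VM's call into the callback is simulated by hand, and the names ``hits``, ``check_marker_id``, ``on_marker`` and ``example`` are hypothetical.

```python
from types import CodeType

# marker_id -> list of (code name, offset) hits recorded by the callback.
hits: dict[int, list[tuple[str, int]]] = {}


def check_marker_id(marker_id: int) -> int:
    """Validate the 0..255 range that the PEP requires for marker ids."""
    if not 0 <= marker_id <= 255:
        raise ValueError("marker_id must be in range(256)")
    return marker_id


def on_marker(code: CodeType, instruction_offset: int, marker_id: int) -> None:
    """Callback written against the MARKER signature listed above."""
    hits.setdefault(marker_id, []).append((code.co_name, instruction_offset))


def example():
    return 1


# Under the proposal the VM would make this call when the marker is hit;
# here it is simulated directly so the sketch runs on any Python 3.9+.
on_marker(example.__code__, 0, check_marker_id(7))
```

Since the VM attaches no meaning to ``marker_id``, a tool is free to use it as a small tag, for example to tell breakpoints apart from temporary stepping markers.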
+
+List of new functions
+'''''''''''''''''''''
+
+* ``sys.get_monitoring_events()->int``
+* ``sys.set_monitoring_events(event_set: int)``
+* ``sys.get_local_monitoring_events(code: CodeType)->int``
+* ``sys.set_local_monitoring_events(code: CodeType, event_set: int)``
+* ``sys.register_monitoring_callback(event: int, func: Callable)``
+* ``sys.insert_marker(code: CodeType, offset: int, marker_id=0: range(256))``
+* ``sys.remove_marker(code: CodeType, offset: int)``
+
+Backwards Compatibility
+=======================
+
+This PEP is fully backwards compatible, in the sense that old code
+will work if the features of this PEP are unused.
+
+However, if the new API is used, it will effectively disable ``sys.settrace``,
+``sys.setprofile`` and PEP 523 frame evaluation.
+
+If PEP 523 is in use, or ``sys.settrace`` or ``sys.setprofile`` has been
+set, then calling ``sys.set_monitoring_events()`` or
+``sys.set_local_monitoring_events()`` will raise an exception.
+
+Likewise, if ``sys.set_monitoring_events()`` or
+``sys.set_local_monitoring_events()`` has been called, then using PEP 523
+or calling ``sys.settrace`` or ``sys.setprofile`` will raise an exception.
+
+This PEP is incompatible with ``sys.settrace`` and ``sys.setprofile``
+because the implementation of ``sys.settrace`` and ``sys.setprofile``
+will use the same underlying mechanism as this PEP. It would be too slow
+to support both the new and old monitoring mechanisms at the same time,
+and they would interfere in awkward ways if both were active.
+
+This PEP is incompatible with PEP 523, because PEP 523 prevents the VM from
+modifying the code objects of executing code, which is a necessary feature.
+
+We may seek to remove ``sys.settrace`` and PEP 523 in the future once the APIs
+provided by this PEP have been widely adopted, but that is for another PEP.
+
+Performance
+-----------
+
+If no events are active, this PEP should have a negligible impact on
+performance.
+
+If a small set of events is active, e.g. for a debugger, then the overhead
+of callbacks will be orders of magnitude less than for ``sys.settrace`` and
+much cheaper than using PEP 523.
+
+For heavily instrumented code, e.g. using ``LINE``, performance should be
+better than ``sys.settrace``, but not by much, as performance will be
+dominated by the time spent in callbacks.
+
+For optimizing virtual machines, such as future versions of CPython
+(and ``PyPy`` should they choose to support this API), changing the set of
+globally active events in the midst of a long-running program could be quite
+expensive, possibly taking hundreds of milliseconds as it triggers
+de-optimizations. Once such de-optimization has occurred, performance should
+recover as the VM can re-optimize the instrumented code.
+
+Security Implications
+=====================
+
+Allowing modification of running code has some security implications,
+but no more than the ability to generate and call new code.
+
+All the new functions listed above will trigger audit hooks.
+
+Implementation
+==============
+
+This section outlines the proposed implementation for CPython 3.11. The actual
+implementation for later versions of CPython and other Python implementations
+may differ considerably.
+
+The proposed implementation of this PEP will be built on top of the quickening
+step of PEP 659 [1]_. Activating some events will cause all code objects to
+be quickened before they are executed.
+
+For example, if the ``LINE`` event is turned on, then all instructions that
+are at the start of a line will be replaced with a ``LINE_EVENT`` instruction.
+
+Note that this will interfere with specialization, which will result in some
+performance degradation in addition to the overhead of calling the
+registered callable.
+
+When the set of active events changes, the VM will immediately update
+all code objects present on the call stack of any thread.
+It will also put traps in place
+to ensure that all code objects are correctly instrumented when
+called. Consequently, changing the set of active events should be done as
+infrequently as possible, as it could be quite an expensive operation.
+
+Other events, such as ``RAISE``, can be turned on or off cheaply,
+as they do not rely on code instrumentation, but on runtime checks when the
+underlying event occurs.
+
+The exact set of events that require instrumentation is an implementation detail,
+but for the current design, the following events will require instrumentation:
+
+* PY_CALL
+* PY_RESUME
+* PY_RETURN
+* PY_YIELD
+* C_CALL
+* C_RETURN
+* LINE
+* INSTRUCTION
+* JUMP
+* BRANCH
+
+Implementing tools
+==================
+
+It is the philosophy of this PEP that it should be possible for third-party monitoring
+tools to achieve high performance, not that it should be easy for them to do so.
+
+Converting events into data that is meaningful to the users is
+the responsibility of the tool.
+
+All events have a cost, and tools should attempt to use the set of events
+that triggers the least often and still provides the necessary information.
+
+Debuggers
+---------
+
+Inserting breakpoints
+'''''''''''''''''''''
+
+Breakpoints can be inserted by using markers. For example::
+
+    sys.insert_marker(code, offset)
+
+This inserts a marker at ``offset`` in ``code``,
+which can be used as a breakpoint.
+
+To insert a breakpoint at a given line, the matching instruction offsets
+should be found from ``code.co_lines()``.
+
+Breakpoints can be removed by removing the marker::
+
+    sys.remove_marker(code, offset)
+
+Stepping
+''''''''
+
+Debuggers usually offer the ability to step execution by a
+single instruction or line.
+
+This can be implemented by inserting a new marker at the required
+offset(s) of the code to be stepped to,
+and by removing the current marker.
+
+It is the job of the debugger to compute the relevant offset(s).
+
+Attaching
+'''''''''
+
+Debuggers can use ``PY_CALL`` and similar events to be informed when
+a code object is first encountered, so that any necessary breakpoints
+can be inserted.
+
+
+Coverage Tools
+--------------
+
+Coverage tools need to track which parts of the control flow graph have been
+executed. To do this, they need to register for the ``PY_`` events,
+plus ``JUMP`` and ``BRANCH``.
+
+This information can then be converted back into a line-based report
+after execution has completed.
+
+Profilers
+---------
+
+Simple profilers need to gather information about calls.
+To do this, profilers should register for the following events:
+
+* PY_CALL
+* PY_RESUME
+* PY_THROW
+* PY_RETURN
+* PY_YIELD
+* PY_UNWIND
+* C_CALL
+* C_RETURN
+
+
+Line-based profilers
+''''''''''''''''''''
+
+Line-based profilers can use the ``LINE`` and ``JUMP`` events.
+Implementers of profilers should be aware that instrumenting ``LINE``
+and ``JUMP`` events will have a large impact on performance.
+
+.. note::
+
+   Instrumenting profilers have significant overhead and will distort
+   the results of profiling. Unless you need exact call counts,
+   consider using a statistical profiler.
+
+
+Rejected ideas
+==============
+
+A draft version of this PEP proposed making the user responsible
+for inserting the monitoring instructions, rather than having the VM do it.
+However, that puts too much of a burden on the tools, and would make
+attaching a debugger nearly impossible.
+
+References
+==========
+
+.. [1] Quickening in PEP 659
+   https://www.python.org/dev/peps/pep-0659/#quickening
+
+
+
+Copyright
+=========
+
+This document is placed in the public domain or under the
+CC0-1.0-Universal license, whichever is more permissive.
+
+
+..
+   Local Variables:
+   mode: indented-text
+   indent-tabs-mode: nil
+   sentence-end-double-space: t
+   fill-column: 70
+   coding: utf-8
+   End: