gh-96143: Improve perf profiler docs (#96445) · python/cpython@723ebe7

Commit 723ebe7

gh-96143: Improve perf profiler docs (#96445)
1 parent 22863df commit 723ebe7

6 files changed: +116 -48 lines changed


Doc/howto/perf_profiling.rst

Lines changed: 38 additions & 31 deletions
@@ -8,10 +8,11 @@ Python support for the Linux ``perf`` profiler

 :author: Pablo Galindo

-The Linux ``perf`` profiler is a very powerful tool that allows you to profile and
-obtain information about the performance of your application. ``perf`` also has
-a very vibrant ecosystem of tools that aid with the analysis of the data that it
-produces.
+`The Linux perf profiler <https://perf.wiki.kernel.org>`_
+is a very powerful tool that allows you to profile and obtain
+information about the performance of your application.
+``perf`` also has a very vibrant ecosystem of tools
+that aid with the analysis of the data that it produces.

 The main problem with using the ``perf`` profiler with Python applications is that
 ``perf`` only allows to get information about native symbols, this is, the names of
@@ -25,7 +26,7 @@ fly before the execution of every Python function and it will teach ``perf`` the
 relationship between this piece of code and the associated Python function using
 `perf map files`_.

-.. warning::
+.. note::

     Support for the ``perf`` profiler is only currently available for Linux on
     selected architectures. Check the output of the configure build step or
@@ -51,11 +52,11 @@ For example, consider the following script:
     if __name__ == "__main__":
         baz(1000000)

-We can run perf to sample CPU stack traces at 9999 Hertz:
+We can run ``perf`` to sample CPU stack traces at 9999 Hertz::

     $ perf record -F 9999 -g -o perf.data python my_script.py

-Then we can use perf report to analyze the data:
+Then we can use ``perf`` report to analyze the data:

 .. code-block:: shell-session

@@ -101,7 +102,7 @@ As you can see here, the Python functions are not shown in the output, only ``_P
 functions use the same C function to evaluate bytecode so we cannot know which Python function corresponds to which
 bytecode-evaluating function.

-Instead, if we run the same experiment with perf support activated we get:
+Instead, if we run the same experiment with ``perf`` support enabled we get:

 .. code-block:: shell-session

@@ -147,52 +148,58 @@ Instead, if we run the same experiment with perf support activated we get:



-Enabling perf profiling mode
-----------------------------
+How to enable ``perf`` profiling support
+----------------------------------------

-There are two main ways to activate the perf profiling mode. If you want it to be
-active since the start of the Python interpreter, you can use the ``-Xperf`` option:
+``perf`` profiling support can either be enabled from the start using
+the environment variable :envvar:`PYTHONPERFSUPPORT` or the
+:option:`-X perf <-X>` option,
+or dynamically using :func:`sys.activate_stack_trampoline` and
+:func:`sys.deactivate_stack_trampoline`.

-    $ python -Xperf my_script.py
+The :mod:`!sys` functions take precedence over the :option:`!-X` option,
+the :option:`!-X` option takes precedence over the environment variable.

-You can also set the :envvar:`PYTHONPERFSUPPORT` to a nonzero value to actiavate perf
-profiling mode globally.
+Example, using the environment variable::

-There is also support for dynamically activating and deactivating the perf
-profiling mode by using the APIs in the :mod:`sys` module:
+    $ PYTHONPERFSUPPORT=1
+    $ python script.py
+    $ perf report -g -i perf.data

-.. code-block:: python
-
-    import sys
-    sys.activate_stack_trampoline("perf")
+Example, using the :option:`!-X` option::

-    # Run some code with Perf profiling active
+    $ python -X perf script.py
+    $ perf report -g -i perf.data

-    sys.deactivate_stack_trampoline()
+Example, using the :mod:`sys` APIs in file :file:`example.py`:

-    # Perf profiling is not active anymore
+.. code-block:: python

-These APIs can be handy if you want to activate/deactivate profiling mode in
-response to a signal or other communication mechanism with your process.
+    import sys

+    sys.activate_stack_trampoline("perf")
+    do_profiled_stuff()
+    sys.deactivate_stack_trampoline()

+    non_profiled_stuff()

-Now we can analyze the data with ``perf report``:
+...then::

-    $ perf report -g -i perf.data
+    $ python ./example.py
+    $ perf report -g -i perf.data


 How to obtain the best results
--------------------------------
+------------------------------

 For the best results, Python should be compiled with
 ``CFLAGS="-fno-omit-frame-pointer -mno-omit-leaf-frame-pointer"`` as this allows
 profilers to unwind using only the frame pointer and not on DWARF debug
-information. This is because as the code that is interposed to allow perf
+information. This is because as the code that is interposed to allow ``perf``
 support is dynamically generated it doesn't have any DWARF debugging information
 available.

-You can check if you system has been compiled with this flag by running:
+You can check if your system has been compiled with this flag by running::

     $ python -m sysconfig | grep 'no-omit-frame-pointer'

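
The ``sysconfig`` check that closes this file's diff can also be done from Python. The following is a minimal sketch and not part of the commit: the helper name ``built_with_frame_pointers`` is hypothetical, and it assumes the flag would be recorded in the ``CFLAGS`` or ``PY_CFLAGS`` configuration variables, which may be unset on some platforms.

```python
import sysconfig

# Flag named in the howto as recommended for frame-pointer unwinding.
FRAME_POINTER_FLAG = "-fno-omit-frame-pointer"


def built_with_frame_pointers(flag=FRAME_POINTER_FLAG):
    """Return True if *flag* appears in the interpreter's recorded C flags.

    Hypothetical helper: it only inspects the CFLAGS and PY_CFLAGS
    configuration variables, which can be missing (None) on some builds.
    """
    for name in ("CFLAGS", "PY_CFLAGS"):
        value = sysconfig.get_config_var(name)
        # Only string values can carry compiler flags.
        if isinstance(value, str) and flag in value.split():
            return True
    return False


if __name__ == "__main__":
    print("frame pointers kept:", built_with_frame_pointers())
```

A ``False`` result only means the flag was not found in those two variables; it does not prove that the build omits frame pointers.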

Doc/library/sys.rst

Lines changed: 32 additions & 0 deletions
@@ -1555,6 +1555,38 @@ always available.
    This function has been added on a provisional basis (see :pep:`411`
    for details.) Use it only for debugging purposes.

+.. function:: activate_stack_trampoline(backend, /)
+
+   Activate the stack profiler trampoline *backend*.
+   The only supported backend is ``"perf"``.
+
+   .. availability:: Linux.
+
+   .. versionadded:: 3.12
+
+   .. seealso::
+
+      * :ref:`perf_profiling`
+      * https://perf.wiki.kernel.org
+
+.. function:: deactivate_stack_trampoline()
+
+   Deactivate the current stack profiler trampoline backend.
+
+   If no stack profiler is activated, this function has no effect.
+
+   .. availability:: Linux.
+
+   .. versionadded:: 3.12
+
+.. function:: is_stack_trampoline_active()
+
+   Return ``True`` if a stack profiler trampoline is active.
+
+   .. availability:: Linux.
+
+   .. versionadded:: 3.12
+
 .. function:: _enablelegacywindowsfsencoding()

    Changes the :term:`filesystem encoding and error handler` to 'mbcs' and
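
The three ``sys`` functions documented in this hunk pair naturally with a context manager, so that deactivation always follows activation. The sketch below is illustrative only and not part of the commit: ``perf_trampoline`` and ``busy_work`` are hypothetical names, and it assumes that interpreters without the API (before 3.12) or builds without trampoline support should simply run the block unprofiled. On supported builds, activation is expected to make the interpreter write a perf map file as a side effect.

```python
import contextlib
import sys


@contextlib.contextmanager
def perf_trampoline(backend="perf"):
    """Activate the stack trampoline for the block, if this build allows it.

    Yields True when this context manager activated the trampoline and
    False otherwise. Hypothetical helper, not part of CPython.
    """
    activate = getattr(sys, "activate_stack_trampoline", None)
    deactivate = getattr(sys, "deactivate_stack_trampoline", None)
    activated = False
    if activate is not None and deactivate is not None:
        try:
            activate(backend)
            activated = True
        except Exception:
            # Assumption: an unsupported build or platform reports failure
            # by raising; in that case the block runs without profiling.
            activated = False
    try:
        yield activated
    finally:
        if activated:
            deactivate()


def busy_work(n):
    """Stand-in workload: sum of squares below n."""
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    with perf_trampoline() as active:
        print("trampoline active:", active)
        busy_work(100_000)
```

Running the script under ``perf record`` would then attribute samples to Python function names only for code executed inside the ``with`` block, assuming the build supports the trampoline.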

Doc/using/cmdline.rst

Lines changed: 10 additions & 7 deletions
@@ -538,12 +538,11 @@ Miscellaneous options
      development (running from the source tree) then the default is "off".
      Note that the "importlib_bootstrap" and "importlib_bootstrap_external"
      frozen modules are always used, even if this flag is set to "off".
-   * ``-X perf`` to activate compatibility mode with the ``perf`` profiler.
-     When this option is activated, the Linux ``perf`` profiler will be able to
+   * ``-X perf`` enables support for the Linux ``perf`` profiler.
+     When this option is provided, the ``perf`` profiler will be able to
      report Python calls. This option is only available on some platforms and
      will do nothing if is not supported on the current system. The default value
-     is "off". See also :envvar:`PYTHONPERFSUPPORT` and :ref:`perf_profiling`
-     for more information.
+     is "off". See also :envvar:`PYTHONPERFSUPPORT` and :ref:`perf_profiling`.

    It also allows passing arbitrary values and retrieving them through the
    :data:`sys._xoptions` dictionary.
@@ -1048,9 +1047,13 @@ conflict.

 .. envvar:: PYTHONPERFSUPPORT

-   If this variable is set to a nonzero value, it activates compatibility mode
-   with the ``perf`` profiler so Python calls can be detected by it. See the
-   :ref:`perf_profiling` section for more information.
+   If this variable is set to a nonzero value, it enables support for
+   the Linux ``perf`` profiler so Python calls can be detected by it.
+
+   If set to ``0``, disable Linux ``perf`` profiler support.
+
+   See also the :option:`-X perf <-X>` command-line option
+   and :ref:`perf_profiling`.

    .. versionadded:: 3.12

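
One way to see what ``-X perf`` and ``PYTHONPERFSUPPORT`` do on a given build is to start a child interpreter and ask it whether the trampoline is active. This is a rough sketch and not part of the commit: ``trampoline_state`` is a hypothetical helper, and the environment variable is placed directly in the child's environment, since a bare ``PYTHONPERFSUPPORT=1`` line in a shell would need ``export`` to reach the child process.

```python
import os
import subprocess
import sys

# Child program: report whether the trampoline is active, tolerating
# interpreters that lack sys.is_stack_trampoline_active (before 3.12).
CHILD_CODE = (
    "import sys; "
    "print(getattr(sys, 'is_stack_trampoline_active', lambda: False)())"
)


def trampoline_state(x_option=False, env_value=None):
    """Run a child interpreter and return its reported trampoline state.

    Returns True or False as printed by the child, or None if the child
    exited with an error. Hypothetical helper for illustration.
    """
    env = dict(os.environ)
    # Start from a clean slate so only the requested setting applies.
    env.pop("PYTHONPERFSUPPORT", None)
    if env_value is not None:
        env["PYTHONPERFSUPPORT"] = env_value
    cmd = [sys.executable]
    if x_option:
        cmd += ["-X", "perf"]
    cmd += ["-c", CHILD_CODE]
    result = subprocess.run(cmd, env=env, capture_output=True, text=True)
    if result.returncode != 0:
        return None
    return result.stdout.strip() == "True"


if __name__ == "__main__":
    print("default:            ", trampoline_state())
    print("-X perf:            ", trampoline_state(x_option=True))
    print("PYTHONPERFSUPPORT=1:", trampoline_state(env_value="1"))
```

On platforms where the option is not supported, the documentation above says it does nothing, so all three lines may report ``False``.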

Doc/whatsnew/3.12.rst

Lines changed: 22 additions & 0 deletions
@@ -74,6 +74,15 @@ Important deprecations, removals or restrictions:
 New Features
 ============

+* Add :ref:`perf_profiling` through the new
+  environment variable :envvar:`PYTHONPERFSUPPORT`,
+  the new command-line option :option:`-X perf <-X>`,
+  as well as the new :func:`sys.activate_stack_trampoline`,
+  :func:`sys.deactivate_stack_trampoline`,
+  and :func:`sys.is_stack_trampoline_active` APIs.
+  (Design by Pablo Galindo. Contributed by Pablo Galindo and Christian Heimes
+  with contributions from Gregory P. Smith [Google] and Mark Shannon
+  in :gh:`96123`.)


 Other Language Changes
@@ -194,6 +203,19 @@ tempfile
 The :class:`tempfile.NamedTemporaryFile` function has a new optional parameter
 *delete_on_close* (Contributed by Evgeny Zorin in :gh:`58451`.)

+sys
+---
+
+* Add :func:`sys.activate_stack_trampoline` and
+  :func:`sys.deactivate_stack_trampoline` for activating and deactivating
+  stack profiler trampolines,
+  and :func:`sys.is_stack_trampoline_active` for querying if stack profiler
+  trampolines are active.
+  (Contributed by Pablo Galindo and Christian Heimes
+  with contributions from Gregory P. Smith [Google] and Mark Shannon
+  in :gh:`96123`.)
+
+
 Optimizations
 =============


Python/clinic/sysmodule.c.h

Lines changed: 6 additions & 4 deletions
Some generated files are not rendered by default.

Python/sysmodule.c

Lines changed: 8 additions & 6 deletions
@@ -2127,12 +2127,12 @@ sys.activate_stack_trampoline
     backend: str
     /

-Activate the perf profiler trampoline.
+Activate stack profiler trampoline *backend*.
 [clinic start generated code]*/

 static PyObject *
 sys_activate_stack_trampoline_impl(PyObject *module, const char *backend)
-/*[clinic end generated code: output=5783cdeb51874b43 input=b09020e3a17c78c5]*/
+/*[clinic end generated code: output=5783cdeb51874b43 input=a12df928758a82b4]*/
 {
 #ifdef PY_HAVE_PERF_TRAMPOLINE
     if (strcmp(backend, "perf") == 0) {
@@ -2163,12 +2163,14 @@ sys_activate_stack_trampoline_impl(PyObject *module, const char *backend)
 /*[clinic input]
 sys.deactivate_stack_trampoline

-Dectivate the perf profiler trampoline.
+Deactivate the current stack profiler trampoline backend.
+
+If no stack profiler is activated, this function has no effect.
 [clinic start generated code]*/

 static PyObject *
 sys_deactivate_stack_trampoline_impl(PyObject *module)
-/*[clinic end generated code: output=b50da25465df0ef1 input=491f4fc1ed615736]*/
+/*[clinic end generated code: output=b50da25465df0ef1 input=9f629a6be9fe7fc8]*/
 {
     if (_PyPerfTrampoline_Init(0) < 0) {
         return NULL;
@@ -2179,12 +2181,12 @@ sys_deactivate_stack_trampoline_impl(PyObject *module)
 /*[clinic input]
 sys.is_stack_trampoline_active

-Returns *True* if the perf profiler trampoline is active.
+Return *True* if a stack profiler trampoline is active.
 [clinic start generated code]*/

 static PyObject *
 sys_is_stack_trampoline_active_impl(PyObject *module)
-/*[clinic end generated code: output=ab2746de0ad9d293 input=061fa5776ac9dd59]*/
+/*[clinic end generated code: output=ab2746de0ad9d293 input=29616b7bf6a0b703]*/
 {
 #ifdef PY_HAVE_PERF_TRAMPOLINE
     if (_PyIsPerfTrampolineActive()) {
