@@ -162,12 +162,12 @@ the :option:`!-X` option takes precedence over the environment variable.
 
 Example, using the environment variable::
 
-   $ PYTHONPERFSUPPORT=1 python script.py
+   $ PYTHONPERFSUPPORT=1 perf record -F 9999 -g -o perf.data python script.py
    $ perf report -g -i perf.data
 
 Example, using the :option:`!-X` option::
 
-   $ python -X perf script.py
+   $ perf record -F 9999 -g -o perf.data python -X perf script.py
    $ perf report -g -i perf.data
 
 Example, using the :mod:`sys` APIs in file :file:`example.py`:
@@ -184,7 +184,7 @@ Example, using the :mod:`sys` APIs in file :file:`example.py`:
 
 ...then::
 
-   $ python ./example.py
+   $ perf record -F 9999 -g -o perf.data python ./example.py
    $ perf report -g -i perf.data
 
 
@@ -210,31 +210,57 @@ of ``perf``.
 How to work without frame pointers
 ----------------------------------
 
-If you are working with a Python interpreter that has been compiled without frame pointers
-you can still use the ``perf`` profiler but the overhead will be a bit higher because Python
-needs to generate unwinding information for every Python function call on the fly. Additionally,
-``perf`` will take more time to process the data because it will need to use the DWARF debugging
-information to unwind the stack and this is a slow process.
+If you are working with a Python interpreter that has been compiled without
+frame pointers, you can still use the ``perf`` profiler, but the overhead will be
+a bit higher because Python needs to generate unwinding information for every
+Python function call on the fly. Additionally, ``perf`` will take more time to
+process the data because it will need to use the DWARF debugging information to
+unwind the stack and this is a slow process.
 
-To enable this mode, you can use the environment variable :envvar:`PYTHON_PERF_JIT_SUPPORT` or the
-:option:`-X perf_jit <-X>` option, which will enable the JIT mode for the ``perf`` profiler.
+To enable this mode, you can use the environment variable
+:envvar:`PYTHON_PERF_JIT_SUPPORT` or the :option:`-X perf_jit <-X>` option,
+which will enable the JIT mode for the ``perf`` profiler.
 
-When using the perf JIT mode, you need an extra step before you can run ``perf report``. You need to
-call the ``perf inject`` command to inject the JIT information into the ``perf.data`` file.
+.. note::
+
+   Due to a bug in the ``perf`` tool, only ``perf`` versions higher than v6.8
+   will work with the JIT mode. The fix was also backported to the v6.7.2
+   version of the tool.
+
+   Note that when checking the version of the ``perf`` tool (which can be done
+   by running ``perf version``) you must take into account that some distros
+   add some custom version numbers including a ``-`` character. This means
+   that ``perf 6.7-3`` is not necessarily ``perf 6.7.3``.
+
+When using the perf JIT mode, you need an extra step before you can run ``perf
+report``. You need to call the ``perf inject`` command to inject the JIT
+information into the ``perf.data`` file.::
 
    $ perf record -F 9999 -g --call-graph dwarf -o perf.data python -Xperf_jit my_script.py
-   $ perf inject -i perf.data --jit
-   $ perf report -g -i perf.data
+   $ perf inject -i perf.data --jit --output perf.jit.data
+   $ perf report -g -i perf.jit.data
 
 or using the environment variable::
 
    $ PYTHON_PERF_JIT_SUPPORT=1 perf record -F 9999 -g --call-graph dwarf -o perf.data python my_script.py
-   $ perf inject -i perf.data --jit
-   $ perf report -g -i perf.data
-
-Notice that when using ``--call-graph dwarf`` the ``perf`` tool will take snapshots of the stack of
-the process being profiled and save the information in the ``perf.data`` file. By default the size of
-the stack dump is 8192 bytes but the user can change the size by passing the size after comma like
-``--call-graph dwarf,4096``. The size of the stack dump is important because if the size is too small
-``perf`` will not be able to unwind the stack and the output will be incomplete.
+   $ perf inject -i perf.data --jit --output perf.jit.data
+   $ perf report -g -i perf.jit.data
+
+``perf inject --jit`` command will read ``perf.data``,
+automatically pick up the perf dump file that Python creates (in
+``/tmp/perf-$PID.dump``), and then create ``perf.jit.data`` which merges all the
+JIT information together. It should also create a lot of ``jitted-XXXX-N.so``
+files in the current directory which are ELF images for all the JIT trampolines
+that were created by Python.
+
+.. warning::
+   Notice that when using ``--call-graph dwarf`` the ``perf`` tool will take
+   snapshots of the stack of the process being profiled and save the
+   information in the ``perf.data`` file. By default the size of the stack dump
+   is 8192 bytes but the user can change the size by passing the size after
+   comma like ``--call-graph dwarf,4096``. The size of the stack dump is
+   important because if the size is too small ``perf`` will not be able to
+   unwind the stack and the output will be incomplete. On the other hand, if
+   the size is too big, then ``perf`` won't be able to sample the process as
+   frequently as it would like as the overhead will be higher.
 
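
Reviewer note: the three-step JIT workflow added in this hunk (record, inject, report) can be sketched as a small Python helper that only assembles the command lines. This is a minimal sketch, not part of the commit. The function name ``build_jit_profile_commands`` and its parameters are hypothetical; the ``perf`` flags are copied from the examples in the diff, with ``-Xperf_jit`` written in its ``-X perf_jit`` form.

```python
import shlex


def build_jit_profile_commands(script, python="python", stack_dump_bytes=None,
                               raw="perf.data", merged="perf.jit.data"):
    """Return the argv lists for record, inject and report, in that order.

    Hypothetical helper: it builds the commands but does not run them.
    """
    # ``--call-graph dwarf`` uses perf's default stack dump size; passing a
    # size after a comma (for example ``dwarf,4096``) overrides it.
    call_graph = "dwarf" if stack_dump_bytes is None else f"dwarf,{stack_dump_bytes}"
    record = ["perf", "record", "-F", "9999", "-g", "--call-graph", call_graph,
              "-o", raw, python, "-X", "perf_jit", script]
    # ``perf inject --jit`` merges the JIT information into a new file.
    inject = ["perf", "inject", "-i", raw, "--jit", "--output", merged]
    # The report must read the merged file, not the raw recording.
    report = ["perf", "report", "-g", "-i", merged]
    return [record, inject, report]


if __name__ == "__main__":
    for argv in build_jit_profile_commands("my_script.py", stack_dump_bytes=4096):
        print(shlex.join(argv))
```

Keeping the raw and merged file names as separate parameters mirrors the point of this change: ``perf report`` has to be pointed at the output of ``perf inject``.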
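
Reviewer note: the version caveat in the new ``.. note::`` block (a distro string such as ``6.7-3`` is not necessarily ``6.7.3``) can be illustrated with a short parser for ``perf version`` output. This is a sketch under stated assumptions, not something the commit provides: the function names and the three result strings are hypothetical, and treating v6.8 itself as working is an assumption, since the note only says "higher than v6.8".

```python
import re


def parse_perf_version(text):
    """Return ((major, minor, patch), has_distro_suffix) from `perf version` output."""
    match = re.search(r"(\d+(?:\.\d+)*)(-\S+)?", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    parts = tuple(int(p) for p in match.group(1).split("."))
    # Pad to three components so that "6.7" compares as (6, 7, 0).
    release = (parts + (0, 0, 0))[:3]
    return release, match.group(2) is not None


def jit_mode_status(text):
    """Classify a version string as "ok", "too_old" or "unclear" (hypothetical labels)."""
    release, has_suffix = parse_perf_version(text)
    if release[:2] >= (6, 8):   # assumption: 6.8 and later carry the fix
        return "ok"
    if release[:2] == (6, 7):
        if release >= (6, 7, 2):  # the note says the fix was backported to v6.7.2
            return "ok"
        # A suffix like "-3" is a distro revision, not a patch level, so the
        # version string alone cannot tell whether the fix is present.
        return "unclear" if has_suffix else "too_old"
    return "too_old"
```

The "unclear" result is deliberate: when a distro suffix is present on a 6.7 build, the packaging changelog is the place to check, rather than the version number.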