-
Notifications
You must be signed in to change notification settings - Fork 24.2k
Fix static py::object
dangling pointer with py::gil_safe_call_once_and_store
#130341
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/130341
Note: Links to docs will display an error until the docs builds have been completed. ✅ No FailuresAs of commit 9572f70 with merge base b1a00b7 ( This comment was automatically generated by Dr. CI and updates every 15 minutes. |
py::object
memory leak with py::gil_safe_call_once_and_store
py::object
dangling pointer with py::gil_safe_call_once_and_store
199a841
to
49d2f73
Compare
Yep, I encountered this issue while working on dynamo 3.13 enablement. Is this behavior documented anywhere in pybind? It would be great if pybind could support code like static py::object obj = py::module_::import("foo").attr("bar"); |
@pytorchbot merge |
Merge startedYour change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
Merge failedReason: 1 jobs have failed, first few of them are: trunk / libtorch-linux-focal-cuda12.4-py3.7-gcc9-debug / build Details for Dev Infra teamRaised by workflow job |
@pytorchbot merge -f "ci hanging for rocm runners" |
Merge startedYour change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
internal needs a third-party/pybind update (#130064 got reverted for a similar reason) |
375caef
to
24abc7c
Compare
24abc7c
to
43c7a4f
Compare
d176d9d
to
f00660c
Compare
…_and_store` (pytorch#130341) Fix static `py::object`s with `py::gil_safe_call_once_and_store`. The following code will leak a `py::object` which will call its destructor when shutdown the program. The destructor will call `Py_DECREF(obj.m_ptr)` which may raise a segmentation fault. ```c++ void func() { static py::object obj = py::module_::import("foo").attr("bar"); ... } ``` The correct code is to use raw pointers rather than the instance. ```c++ void func() { static py::object* obj_ptr = new py::object{py::module_::import("foo").attr("bar")}; py::object obj = *obj_ptr; ... } ``` This PR uses the `py::gil_safe_call_once_and_store` function from `pybind11`, which can run arbitrary initialization code only once under the Python GIL thread safely. ```c++ void func() { PYBIND11_CONSTINIT static py::gil_safe_call_once_and_store<py::object> storage; py::object obj = storage .call_once_and_store_result( []() -> py::object { return py::module_::import("foo").attr("bar"); } ) .get_stored(); ... } ``` Pull Request resolved: pytorch#130341 Approved by: https://github.com/ezyang
Sorry for the delay on this one. The internal pybind upgrade necessary to merge this has been a slog :/ |
3dbb655
to
80cd4cd
Compare
80cd4cd
to
9572f70
Compare
…_and_store` (#130341) Fix static `py::object`s with `py::gil_safe_call_once_and_store`. The following code will leak a `py::object` which will call its destructor when shutdown the program. The destructor will call `Py_DECREF(obj.m_ptr)` which may raise a segmentation fault. ```c++ void func() { static py::object obj = py::module_::import("foo").attr("bar"); ... } ``` The correct code is to use raw pointers rather than the instance. ```c++ void func() { static py::object* obj_ptr = new py::object{py::module_::import("foo").attr("bar")}; py::object obj = *obj_ptr; ... } ``` This PR uses the `py::gil_safe_call_once_and_store` function from `pybind11`, which can run arbitrary initialization code only once under the Python GIL thread safely. ```c++ void func() { PYBIND11_CONSTINIT static py::gil_safe_call_once_and_store<py::object> storage; py::object obj = storage .call_once_and_store_result( []() -> py::object { return py::module_::import("foo").attr("bar"); } ) .get_stored(); ... } ``` Pull Request resolved: #130341 Approved by: https://github.com/ezyang
FYI #136370 is a rework of this PR to avoid the internal pybind issue |
This is a modified version of #130341 that preserve support for older pybind version. Pull Request resolved: #136370 Approved by: https://github.com/Skylion007, https://github.com/malfet
…_and_store` (p 8000 ytorch#130341) Fix static `py::object`s with `py::gil_safe_call_once_and_store`. The following code will leak a `py::object` which will call its destructor when shutdown the program. The destructor will call `Py_DECREF(obj.m_ptr)` which may raise a segmentation fault. ```c++ void func() { static py::object obj = py::module_::import("foo").attr("bar"); ... } ``` The correct code is to use raw pointers rather than the instance. ```c++ void func() { static py::object* obj_ptr = new py::object{py::module_::import("foo").attr("bar")}; py::object obj = *obj_ptr; ... } ``` This PR uses the `py::gil_safe_call_once_and_store` function from `pybind11`, which can run arbitrary initialization code only once under the Python GIL thread safely. ```c++ void func() { PYBIND11_CONSTINIT static py::gil_safe_call_once_and_store<py::object> storage; py::object obj = storage .call_once_and_store_result( []() -> py::object { return py::module_::import("foo").attr("bar"); } ) .get_stored(); ... } ``` Pull Request resolved: pytorch#130341 Approved by: https://github.com/ezyang
…atches (#141085) Reland - #139560 As mentioned in #130341, using `static py::object` can lead to segfaults. I suspect this is the reason for the import system error seen internally (https://www.internalfb.com/sevmanager/view/469592). In this PR, I am removing the `static` part. This is fine and also the right thing to do because this will catch if user changes the flag in the same process for compiling two different functions. Unfortunately, there is no easy way to trigger this segfault, so I can't write a test. Pull Request resolved: #141085 Approved by: https://github.com/jansel Co-authored-by: William Wen <williamwen@meta.com>
…atches (#141085) Reland - #139560 As mentioned in #130341, using `static py::object` can lead to segfaults. I suspect this is the reason for the import system error seen internally (https://www.internalfb.com/sevmanager/view/469592). In this PR, I am removing the `static` part. This is fine and also the right thing to do because this will catch if user changes the flag in the same process for compiling two different functions. Unfortunately, there is no easy way to trigger this segfault, so I can't write a test. Pull Request resolved: #141085 Approved by: https://github.com/jansel Co-authored-by: William Wen <williamwen@meta.com>
…atches (#141085) Reland - #139560 As mentioned in #130341, using `static py::object` can lead to segfaults. I suspect this is the reason for the import system error seen internally (https://www.internalfb.com/sevmanager/view/469592). In this PR, I am removing the `static` part. This is fine and also the right thing to do because this will catch if user changes the flag in the same process for compiling two different functions. Unfortunately, there is no easy way to trigger this segfault, so I can't write a test. Pull Request resolved: #141085 Approved by: https://github.com/jansel Co-authored-by: William Wen <williamwen@meta.com>
…atches (pytorch#141085) Reland - pytorch#139560 As mentioned in pytorch#130341, using `static py::object` can lead to segfaults. I suspect this is the reason for the import system error seen internally (https://www.internalfb.com/sevmanager/view/469592). In this PR, I am removing the `static` part. This is fine and also the right thing to do because this will catch if user changes the flag in the same process for compiling two different functions. Unfortunately, there is no easy way to trigger this segfault, so I can't write a test. Pull Request resolved: pytorch#141085 Approved by: https://github.com/jansel Co-authored-by: William Wen <williamwen@meta.com>
…atches (#141085) Reland - #139560 As mentioned in #130341, using `static py::object` can lead to segfaults. I suspect this is the reason for the import system error seen internally (https://www.internalfb.com/sevmanager/view/469592). In this PR, I am removing the `static` part. This is fine and also the right thing to do because this will catch if user changes the flag in the same process for compiling two different functions. Unfortunately, there is no easy way to trigger this segfault, so I can't write a test. Pull Request resolved: #141085 Approved by: https://github.com/jansel Co-authored-by: William Wen <williamwen@meta.com>
Fix static
py::object
s withpy::gil_safe_call_once_and_store
.The following code will leak a
py::object
which will call its destructor when shutdown the program. The destructor will callPy_DECREF(obj.m_ptr)
which may raise a segmentation fault.The correct code is to use raw pointers rather than the instance.
This PR uses the
py::gil_safe_call_once_and_store
function frompybind11
, which can run arbitrary initialization code only once under the Python GIL thread safely.cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang