☂️ Many ecosystem libraries started to fail with std::bad_alloc after Nov 1st
#140590
Comments
Latest build failure with torchrec: https://github.com/pytorch/torchrec/actions/runs/11897072317/job/33150462304?pr=2567#step:14:5224
If it's limited to the JIT script, one needs to check that parts of the libstdc++ runtime are not linked separately for libtorch_cpu and libtorch_python.
This is the error:
Per @davidberard98, this issue is probably related: #140423
https://github.com/pytorch-labs/tritonbench/actions/runs/12055925374/job/33617344431
@malfet could you please provide more details on what exactly we should check? The ldd command will probably help here.
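One way to carry out the check suggested above is to compare which libstdc++ each shared library resolves to according to `ldd`. The following is a minimal sketch, not a confirmed diagnostic for this issue: the helper names `parse_ldd` and `libstdcxx_paths` are hypothetical, and it assumes a Linux host with `ldd` on the PATH. Note that `ldd` only lists dynamic dependencies, so a statically linked copy of libstdc++ would not show up here.

```python
import subprocess


def parse_ldd(output):
    """Map each library name in `ldd` output to the path it resolves to."""
    resolved = {}
    for line in output.splitlines():
        if "=>" not in line:
            continue  # lines without "=>" (e.g. the dynamic loader) are skipped
        name, _, rest = line.partition("=>")
        rest = rest.strip()
        if rest.startswith("not found"):
            resolved[name.strip()] = "not found"
            continue
        parts = rest.split()
        # The load address "(0x...)" follows the path; ignore it if no path is given.
        path = parts[0] if parts and not parts[0].startswith("(") else ""
        resolved[name.strip()] = path
    return resolved


def libstdcxx_paths(lib_paths):
    """Run `ldd` on each library and report the libstdc++ path it resolves to.

    Hypothetical helper: returns {library path: libstdc++ path or None}.
    """
    result = {}
    for lib in lib_paths:
        out = subprocess.run(
            ["ldd", str(lib)], capture_output=True, text=True, check=True
        ).stdout
        deps = parse_ldd(out)
        result[str(lib)] = next(
            (path for name, path in deps.items() if name.startswith("libstdc++")),
            None,
        )
    return result
```

With PyTorch installed, `libstdcxx_paths` could be called on the `libtorch_cpu.so` and `libtorch_python.so` files under the `lib` directory next to `torch.__file__` (an assumption about the wheel layout). If the two entries differ, or one is `None`, that would be worth a closer look.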
Could this PR have caused the failure: #127936? Here is the PR where I reverted some of the changes that landed in the Nov 2 nightly: #141782. After reverting #127936, I can no longer reproduce the std::bad_alloc with the repro I use, which executes this workflow: https://github.com/pytorch/vision/blob/main/.github/workflows/docs.yml
Looks like the issue is back: https://github.com/pytorch/ao/actions/runs/12123456248/job/33798989902 after we migrated to linux_job_v2: pytorch/ao#1302
…d-sort (#127936) (#141901) Looks like the original PR caused: #140590 Please see comment: #140590 (comment) Pull Request resolved: #141901 Approved by: https://github.com/andrewor14, https://github.com/malfet
The revert landed in the Dec 3 nightly; closing this issue.
Hi @atalman, since this failure was only caught by CI in Torch libraries, do you have any suggestions for adding acceptance tests to PyTorch Core CI to gate this kind of issue?
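As one possible shape for such a gate, the sketch below runs a smoke command in a child process and fails if `std::bad_alloc` appears in its stderr or the process exits with a non-zero code. This is only an illustration under stated assumptions: `run_smoke` is a hypothetical helper, not an existing PyTorch CI utility, and the actual smoke command (for example, a small script exercising a downstream library) would have to be chosen by the maintainers.

```python
import subprocess
import sys


def run_smoke(cmd, timeout=600):
    """Run `cmd` in a child process and return (ok, reason).

    Hypothetical gate: a crash in the child cannot take down the CI driver,
    and a std::bad_alloc message is reported explicitly.
    """
    proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    if "std::bad_alloc" in proc.stderr:
        return False, "std::bad_alloc in stderr"
    if proc.returncode != 0:
        return False, f"exit code {proc.returncode}"
    return True, "ok"
```

A CI step could call `run_smoke([sys.executable, "smoke_test.py"])`, where `smoke_test.py` is a placeholder name, and fail the job when `ok` is false.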
🐛 Describe the bug
See:
Versions
CI
cc @ezyang @gchanan @zou3519 @kadeng @msaroufim