-
Notifications
You must be signed in to change notification settings - Fork 25.9k
Fix recently added AllocatorConfig static initializer code. #159607
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/159607
Note: Links to docs will display an error until the docs builds have been completed. ❌ 1 Cancelled Job, 2 Unrelated FailuresAs of commit 614c969 with merge base 490cb3f ( CANCELLED JOB - The following job was cancelled. Please retry:
BROKEN TRUNK - The following job failed but were present on the merge base:👉 Rebase onto the `viable/strict` branch to avoid these failures
UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
|
@pytorchbot label "topic: not user facing" |
|
@ScottTodd Thanks for your help! I’ve modified your code and opened a new PR (#159629) and df9befb3, stacking it to help streamline the landing and validation process. Of course, if you'd prefer to land your original PR, I’m happy to close mine and continue landing the rest of the stack after yours is approved. |
Now that the commits are reverted, your PR stack looks easier to manage than this standalone PR, so I'll close this one. Our downstream builds are passing again after the reverts, e.g. https://github.com/ROCm/TheRock/actions/runs/16675827094. I'll also double check that the PR stack builds successfully, and we're working towards getting these downstream builds moved closer to upstream - see #159520. I'm still not quite sure why only our downstream Windows ROCm builds were affected by the initialization issues in this code... maybe something to do with how we compile using clang-cl while other Windows builds use MSVC? I'm also fairly new to contributing to PyTorch so I'm still learning my way around the pull request workflows here. I really appreciate your fast responses, thanks! |
Confirmed. My local build of the |
|
@ScottTodd Thanks for your update! |
# Motivation As @ScottTodd identified in this [comment](#150312 (comment)), using STL containers like `std::string` and `std::unordered_set` at static init time can cause static initialization order issues. This PR is based on and modified from his original PR: #159607. I’m stacking this PR here to help facilitate the landing and validation process. Co-authored-by: @ScottTodd Pull Request resolved: #159629 Approved by: https://github.com/ScottTodd, https://github.com/albanD
# Motivation As @ScottTodd identified in this [comment](pytorch#150312 (comment)), using STL containers like `std::string` and `std::unordered_set` at static init time can cause static initialization order issues. This PR is based on and modified from his original PR: pytorch#159607. I’m stacking this PR here to help facilitate the landing and validation process. Co-authored-by: @ScottTodd Pull Request resolved: pytorch#159629 Approved by: https://github.com/ScottTodd, https://github.com/albanD
This fixes DLL load errors detected downstream at ROCm/TheRock#1155 and reported at:
Static initialization at class scope using a data structure from the STL is not safe (especially in Windows DLLs):
I'm also fine with other solutions or reverting the changes that introduced these bugs, sending this as one option to consider.