[NCCL] Don't override `waitUntilInitialized`'s setting of `comm->init… · pytorch/pytorch@0b1b609 · GitHub

Commit 0b1b609

pytorchbot and eqy authored
[NCCL] Don't override waitUntilInitialized's setting of comm->initialized_ (#137210)
[NCCL] Don't override `waitUntilInitialized`'s setting of `comm->initialized_` (#136155)

#133630 sets `initialized_` to `true`, which causes previous wait codepaths to skip necessary waits; see also #136151.

CC @shuqiangzhang @wconstab

Pull Request resolved: #136155
Approved by: https://github.com/fduwjj, https://github.com/kwen2501, https://github.com/c-p-i-o, https://github.com/shuqiangzhang

(cherry picked from commit e3aa5e2)

Co-authored-by: eqy <eddiey@nvidia.com>
1 parent 0b45af9 commit 0b1b609

1 file changed: +3 -1 lines changed

torch/csrc/distributed/c10d/NCCLUtils.cpp

Lines changed: 3 additions & 1 deletion
@@ -84,7 +84,9 @@ std::shared_ptr<NCCLComm> NCCLComm::split(
       std::nullopt);
   ++source->ncclCommSplitCounter_;
   comm->rank_ = rank;
-  comm->initialized_ = true;
+  if (!nccl_use_nonblocking()) {
+    comm->initialized_ = true;
+  }
   return comm;
 }
 #endif
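
The commit message says that setting `initialized_` to `true` unconditionally makes later wait codepaths skip necessary waits. The following is a minimal, self-contained sketch of that failure mode, not PyTorch code: `MockComm`, `backendReady`, `waits`, `splitBeforePatch`, and `splitAfterPatch` are hypothetical names invented for illustration, and the mock `waitUntilInitialized` only assumes that the real wait path returns early when the flag is already set.

```cpp
// Hypothetical stand-in for NCCLComm; not the real PyTorch class.
struct MockComm {
  bool initialized_ = false;  // mirrors the flag touched by the patch
  bool backendReady = false;  // assumed: whether communicator setup really finished
  int waits = 0;              // how many times a caller actually had to wait
};

// Mock wait path (assumption): returns early if the flag is already set,
// otherwise "waits" for setup to finish and then sets the flag itself.
void waitUntilInitialized(MockComm& comm) {
  if (comm.initialized_) {
    return;  // fast path: the caller believes there is nothing to wait for
  }
  ++comm.waits;
  comm.backendReady = true;  // pretend the asynchronous setup completed
  comm.initialized_ = true;
}

// Behavior before the patch: the flag is set regardless of mode.
MockComm splitBeforePatch(bool nonblocking) {
  MockComm comm;
  comm.backendReady = !nonblocking;  // blocking split is done on return
  comm.initialized_ = true;
  return comm;
}

// Behavior after the patch: in nonblocking mode the flag is left for
// the wait path to set.
MockComm splitAfterPatch(bool nonblocking) {
  MockComm comm;
  comm.backendReady = !nonblocking;
  if (!nonblocking) {
    comm.initialized_ = true;
  }
  return comm;
}
```

In this mock, calling `waitUntilInitialized` on the result of `splitBeforePatch(true)` returns immediately while `backendReady` is still `false`, whereas the result of `splitAfterPatch(true)` goes through the wait. In blocking mode both variants behave the same.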

0 commit comments
