[c10d] PGNCCL refactor part 2: Simplify ProcessGroupNCCL into single-device style by kwen2501 · Pull Request #119421 · pytorch/pytorch

[c10d] PGNCCL refactor part 2: Simplify ProcessGroupNCCL into single-device style #119421

Closed
wants to merge 14 commits into from
Lint
kwen2501 committed Feb 8, 2024
commit 4de6892651737a8450c07026c237d8dfc072ad0b
4 changes: 2 additions & 2 deletions torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp
@@ -3254,8 +3254,8 @@ c10::intrusive_ptr<Work> ProcessGroupNCCL::reduce_scatter(
output.storage().data_ptr(), stream);
}
const auto ncclDataType = getNcclDataType(input.scalar_type());
- const auto ncclReduceOp = getNcclReduceOp(
- opts.reduceOp, input, ncclDataType, comm);
+ const auto ncclReduceOp =
+ getNcclReduceOp(opts.reduceOp, input, ncclDataType, comm);
return ncclReduceScatter(
input.data_ptr(),
output.data_ptr(),