[PTD BE DAY]Burn Down Distributed Disabled Tests!! #132845
Labels
oncall: distributed
triaged
Hey, folks! We have 59 flaky distributed tests, which I have grouped into a few categories below. The flaky tests range from today all the way back to Jan 11, 2019. Let's see if we can burn them down or deprecate the tests that are no longer useful.
If you would like to take an issue, please check the box on this page and assign the issue to yourself. Thanks!
C10D
NCCL
GLOO
TCPStore
MultiProcessing
MultiThreadedTestCase
RPC
DeviceMesh and DTensor
DeviceMesh
DTensor
For some of the op tests below, @awgu did a bit of digging, and there might be some issues in our hashing/caching. More context in #132114.
DDP, FSDP, PiPPy
FSDP1
FSDP2
DDP
pipeline
Other
DCP
Functional Collectives
ShardedTensor
cc @XilunWu @H-Huang @awgu @kwen2501 @wanchaol @fegin @fduwjj @wconstab @d4l3k @c-p-i-o