[PP] Fix disabled flaky tests (#154856) · ROCm/pytorch@d71a41b

Commit d71a41b

H-Huang authored and iupaikov-amd committed
[PP] Fix disabled flaky tests (pytorch#154856)
Fixes pytorch#154373, pytorch#154391, pytorch#154408, pytorch#154443, pytorch#154481

Because MultiProcContinousTest now executes the tests with 8 GPUs instead of 2 (pytorch#153653), our PP tests comparing gradients have become flakier due to the longer pipeline. The gradients are still close, but we need to relax the tolerance.

Pull Request resolved: pytorch#154856
Approved by: https://github.com/Skylion007
1 parent 803d7b6 commit d71a41b

File tree: 1 file changed (+1, -1 lines)

test/distributed/pipelining/test_schedule_multiproc.py

Lines changed: 1 addition & 1 deletion
@@ -513,7 +513,7 @@ def test_grad_with_manual_interleaved(self, ScheduleClass, use_new_runtime):
         for name, p in stage_module.named_parameters():
             ref_p = ref_submod.get_parameter(name)
             try:
-                torch.testing.assert_close(p.grad, ref_p.grad, rtol=1e-5, atol=4e-5)
+                torch.testing.assert_close(p.grad, ref_p.grad, rtol=1e-5, atol=1e-3)
             except AssertionError:
                 print(f"Gradient test failed for {name}: {p.grad} vs {ref_p.grad}")
                 raise
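
For context on why relaxing atol helps, here is a minimal standalone sketch (not part of the commit; the tensor sizes and noise scale are illustrative assumptions): torch.testing.assert_close accepts two tensors when |actual - expected| <= atol + rtol * |expected| holds elementwise, so a larger atol absorbs the extra floating-point drift that accumulates across the longer 8-GPU pipeline while still catching gradients that genuinely diverge.

# Illustrative sketch only (not from the commit): how the rtol/atol pair in
# torch.testing.assert_close behaves for gradients that carry small
# accumulated floating-point drift, as in a longer pipeline.
import torch

ref_grad = torch.randn(1024)
# Hypothetical drift of roughly 1e-4, standing in for the error that builds
# up over more pipeline stages and microbatches.
pp_grad = ref_grad + 1e-4 * torch.randn_like(ref_grad)

# The old tolerance (atol=4e-5) can reject drift of this size...
try:
    torch.testing.assert_close(pp_grad, ref_grad, rtol=1e-5, atol=4e-5)
except AssertionError as err:
    print("old tolerance failed:", err)

# ...while the relaxed tolerance from this commit still passes, because the
# check is |actual - expected| <= atol + rtol * |expected| elementwise.
torch.testing.assert_close(pp_grad, ref_grad, rtol=1e-5, atol=1e-3)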
