DISABLED test_forward_mode_AD_linalg_det_singular_cuda_complex128 (__main__.TestFwdGradientsCUDA) #93045

Closed
huydhn opened this issue Jan 26, 2023 · 9 comments
Labels
  • module: autograd: Related to torch.autograd and the autograd engine in general
  • module: flaky-tests: Problem is a flaky test in CI
  • module: rocm: AMD GPU support for PyTorch
  • skipped: Denotes a (flaky) test currently skipped in CI
  • triaged: This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Comments

huydhn (Contributor) commented Jan 26, 2023

Platforms: rocm

This test was disabled because it is failing on master (recent examples).

cc @ezyang @albanD @zou3519 @gqchen @pearu @nikitaved @soulitzer @lezcano @Varal7 @jeffdaily @sunway513 @jithunnair-amd @pruthvistony @ROCmSupport
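
For context, here is a minimal sketch, assuming a CUDA-enabled PyTorch build, of the kind of check this test performs: a forward-mode-AD gradcheck of torch.linalg.det on a singular complex128 input. The specific matrix below is an illustrative stand-in, not the OpInfo's actual det_singular sample input.

```python
# Minimal sketch of what this test exercises, assuming a CUDA-enabled build.
# The singular matrix below is an illustrative stand-in for the OpInfo's
# det_singular sample inputs, not the exact tensors used in CI.
import torch

def forward_ad_det_check(device: str = "cuda") -> bool:
    # Rank-deficient 2x2 complex matrix: the second row is twice the first,
    # so det(a) == 0 and the matrix is singular.
    a = torch.tensor(
        [[1.0 + 1.0j, 2.0 + 0.0j],
         [2.0 + 2.0j, 4.0 + 0.0j]],
        dtype=torch.complex128, device=device, requires_grad=True,
    )
    # check_forward_ad=True asks gradcheck to validate forward-mode AD (JVPs)
    # against numerically estimated Jacobian-vector products.
    return torch.autograd.gradcheck(torch.linalg.det, (a,), check_forward_ad=True)

if __name__ == "__main__" and torch.cuda.is_available():
    print("forward-mode gradcheck passed:", forward_ad_det_check())
```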

pytorch-bot (bot) commented Jan 26, 2023
Hello there! From the DISABLED prefix in this issue title, it looks like you are attempting to disable a test in PyTorch CI. The information I have parsed is below:
  • Test name: test_forward_mode_AD_linalg_det_singular_cuda_complex128 (__main__.TestFwdGradientsCUDA)
  • Platforms for which to skip the test: rocm

Within ~15 minutes, test_forward_mode_AD_linalg_det_singular_cuda_complex128 (__main__.TestFwdGradientsCUDA) will be disabled in PyTorch CI for these platforms: rocm. Please verify that your test name looks correct, e.g., test_cuda_assert_async (__main__.TestCuda).

To modify the platforms list, please include a line in the issue body, like below. The default action will disable the test for all platforms if no platforms list is specified.

Platforms: case-insensitive, list, of, platforms

We currently support the following platforms: asan, dynamo, linux, mac, macos, rocm, win, windows.
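
For illustration only, a small sketch of how such a Platforms: line could be parsed from an issue body into a set of platforms; this is a hypothetical reimplementation of the idea, not pytorch-bot's actual parser.

```python
# Hypothetical sketch of parsing the "Platforms:" line from a disable-test
# issue body; pytorch-bot's real parser may differ in its details.
SUPPORTED_PLATFORMS = {"asan", "dynamo", "linux", "mac", "macos", "rocm", "win", "windows"}

def parse_platforms(issue_body: str) -> set[str]:
    for line in issue_body.splitlines():
        if line.strip().lower().startswith("platforms:"):
            requested = {
                p.strip().lower()
                for p in line.split(":", 1)[1].split(",")
                if p.strip()
            }
            return requested & SUPPORTED_PLATFORMS
    # No "Platforms:" line present: disable the test on every supported platform.
    return set(SUPPORTED_PLATFORMS)

print(parse_platforms("Platforms: rocm"))  # -> {'rocm'}
```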

@pytorch-bot bot added the skipped label Jan 26, 2023
@huydhn added the module: rocm, module: autograd, and module: flaky-tests labels Jan 26, 2023
huydhn (Contributor, Author) commented Jan 26, 2023

This probably has the same root cause as #93044; the test started to become flaky on ROCm.

@albanD added the triaged label Jan 27, 2023
pytorch-bot (bot) commented Feb 15, 2023

Resolving the issue because the test is not flaky anymore after 400 reruns without any failures and the issue hasn't been updated in 14 days. Please reopen the issue to re-disable the test if you think this is a false positive.

pytorch-bot (bot) commented Mar 2, 2023

Resolving the issue because the test is not flaky anymore after 400 reruns without any failures and the issue hasn't been updated in 14 days. Please reopen the issue to re-disable the test if you think this is a false positive.

@pytorch-bot pytorch-bot bot closed this as completed Mar 2, 2023
@clee2000 clee2000 reopened this Mar 2, 2023
pytorchmergebot pushed a commit that referenced this issue Mar 14, 2023
Related issues: #93044 and #93045.

* No access to a runner to debug the ROCm flakiness
* Haven't seen any update on the two issues above
* The tests become flaky again whenever these issues are closed (re-enabling them in CI)

### Testing

The tests are skipped: https://ossci-raw-job-status.s3.amazonaws.com/log/11976899251

```
2023-03-14T03:39:02.1336514Z test_ops_gradients.py::TestBwdGradientsCUDA::test_fn_grad_linalg_det_singular_cuda_complex128 SKIPPED (Flaky on ROCm #93044) [ 27%]
...
2023-03-14T03:41:46.4234072Z test_ops_fwd_gradients.py::TestFwdGradientsCUDA::test_forward_mode_AD_linalg_det_singular_cuda_complex128 SKIPPED (Flaky on ROCm #93045) [ 44%]
```

Pull Request resolved: #96707
Approved by: https://github.com/clee2000
pytorch-bot (bot) commented Mar 16, 2023

Resolving the issue because the test is not flaky anymore after 200 reruns without any failures and the issue hasn't been updated in 14 days. Please reopen the issue to re-disable the test if you think this is a false positive.

@pytorch-bot pytorch-bot bot closed this as completed Mar 16, 2023
jithunnair-amd (Collaborator) commented

Reopening this issue because we're seeing this test fail in local testing with ROCm 5.4. Plan to keep the test disabled until we root-cause it.
cc @jaglinux

huydhn (Contributor, Author) commented Mar 20, 2023
huydhn commented Mar 20, 2023

> Reopening this issue because we're seeing this test fail in local testing with ROCm 5.4. Plan to keep the test disabled until we root-cause it. cc @jaglinux

I have recently disabled this test fully for ROCm in #96707. Once you have that change, the failure won't show up. The root cause is still there though, so I agree that it still needs to be root-caused.
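
As a rough illustration of skipping a test on ROCm builds (the actual mechanism used in #96707 may differ, e.g. a per-OpInfo skip annotation), here is a minimal sketch using PyTorch's skipIfRocm test decorator; the test class and method below are made up for the example.

```python
# Rough illustration only: the actual change in #96707 may use a different
# mechanism (e.g. a per-OpInfo skip). skipIfRocm is a decorator provided by
# torch.testing._internal.common_utils; the test class and method below are
# made up for this example.
import torch
from torch.testing._internal.common_utils import TestCase, run_tests, skipIfRocm

class ExampleDetTest(TestCase):
    @skipIfRocm  # skipped on ROCm builds, where this computation was flaky
    def test_det_of_singular_matrix(self):
        a = torch.zeros(3, 3, dtype=torch.complex128)  # trivially singular
        self.assertEqual(torch.linalg.det(a), torch.zeros((), dtype=torch.complex128))

if __name__ == "__main__":
    run_tests()
```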

lezcano (Collaborator) commented Mar 21, 2023

I'd be happy to simply remove these tests, as they have proven to be too flaky. If anyone wants to put up a PR removing the det_singular OpInfo, I'm happy to approve it.
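
A sketch, under the assumption that the flaky cases come from the "singular" variant of the linalg.det OpInfo, of how one could locate that entry in the operator database that parametrizes these gradient tests; an actual removal PR would delete the entry from torch/testing/_internal/common_methods_invocations.py rather than filter the database at runtime.

```python
# Illustrative sketch only: locate the det_singular OpInfo entry in the
# operator database that parametrizes these gradient tests. The assumption
# here is that the entry is named "linalg.det" with variant_test_name
# "singular"; an actual removal PR would delete it from
# torch/testing/_internal/common_methods_invocations.py rather than filter
# the database at runtime.
from torch.testing._internal.common_methods_invocations import op_db

singular_entries = [
    op for op in op_db
    if op.name == "linalg.det" and op.variant_test_name == "singular"
]
print(f"found {len(singular_entries)} matching det_singular OpInfo entries")
```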

cyyever pushed a commit to cyyever/pytorch_private that referenced this issue Mar 23, 2023

cyyever pushed a commit to cyyever/pytorch_private that referenced this issue Mar 27, 2023
arindamroy-eng added a commit to arindamroy-eng/pytorch-1 that referenced this issue May 31, 2023
According to the comments in pytorch#93045 (comment), the test has been persistently flaky.
Hence this PR removes the test itself.

Resolves (pytorch#93045).

Signed-off-by: Arindam Roy <rarindam@gmail.com>
pytorchmergebot pushed a commit to arindamroy-eng/pytorch-1 that referenced this issue Jun 12, 2023

pytorchmergebot pushed a commit to arindamroy-eng/pytorch-1 that referenced this issue Jun 20, 2023
soulitzer added a commit that referenced this issue Jan 23, 2025
Fixes #93045 #93044

Based on some previous attempts:
- #102581
- #109249


[ghstack-poisoned]
soulitzer (Contributor) commented

Closing since it should be addressed by #96707

soulitzer added a commit that referenced this issue Jan 23, 2025
Fixes #93045 #93044

From previous discussion #93045 (comment) the resolution is that we're okay with removing this.

Some older attempts:
- #102581
- #109249


cc H-Huang awgu kwen2501 wanchaol fegin fduwjj wz337 wconstab d4l3k c-p-i-o voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy yf225 chenyang78 kadeng muchulee8 ColinPeppler amjames desertfire chauhang aakhundov

[ghstack-poisoned]
pytorchmergebot pushed a commit that referenced this issue Jan 25, 2025
Fixes #93045 #93044

From previous discussion #93045 (comment) the resolution is that we're okay with removing this.

Some older attempts:
- #102581
- #109249

Pull Request resolved: #145533
Approved by: https://github.com/lezcano, https://github.com/malfet
ghstack dependencies: #145520, #145531
nWEIdia pushed a commit to nWEIdia/pytorch that referenced this issue Jan 27, 2025