[inductor] Fix MKL issue with test_indirect_device_assert #108172

jansel · 2023-08-29T17:56:24Z

Stack from ghstack (oldest at bottom):

-> [inductor] Fix MKL issue with test_indirect_device_assert #108172

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @ngimel @yf225 @chenyang78 @kadeng @muchulee8 @aakhundov @anijain2305

[ghstack-poisoned]

pytorch-bot · 2023-08-29T18:04:17Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/108172


✅ No Failures

As of commit 2f9df0f with merge base fe1f26a:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.


peterbell10 · 2023-08-29T21:41:13Z

test/inductor/test_torchinductor.py

@@ -7543,6 +7544,7 @@ def fn(x: torch.Tensor) -> torch.Tensor:
                    fn_opt(inps)

        @skipIfRocm
+        @skip_if_consumer_card


The test works for me on a 2060. What issues are you seeing on a 3090?

Subprocess is failing with:

b'Error: mkl-service + Intel(R) MKL: MKL_THREADING_LAYER=INTEL is incompatible with libgomp.so.1 library.\n\tTry to import numpy first or set the threading layer accordingly. Set MKL_SERVICE_FORCE_INTEL to force it.\n'

Actually, this might not be related to consumer cards; that was a guess based on the test names and on it failing only on my desktop, not in CI. I think I can fix it with MKL_THREADING_LAYER=GNU.

That looks like #37377 which should be fixed by updating mkl-service. Your workaround seems fine though.
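For reference, a minimal sketch of the workaround discussed above, assuming the failing test launches a child Python process. The helper name `run_with_gnu_threading` is hypothetical and is not taken from this PR's diff; the only detail drawn from the thread is setting `MKL_THREADING_LAYER=GNU` in the subprocess environment.

```python
import os
import subprocess
import sys


def run_with_gnu_threading(code: str) -> subprocess.CompletedProcess:
    """Run `code` in a child interpreter with MKL_THREADING_LAYER=GNU.

    Hypothetical helper for illustration only; not the code from this PR.
    """
    # Copy the parent environment so PATH, PYTHONPATH, etc. are preserved,
    # then override only the MKL threading layer for the child process.
    env = {**os.environ, "MKL_THREADING_LAYER": "GNU"}
    return subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
    )
```

Calling `run_with_gnu_threading("import os; print(os.environ['MKL_THREADING_LAYER'])")` is one way to confirm that the child process sees the overridden value. Passing the variable through `env` rather than mutating `os.environ` keeps the change scoped to the subprocess.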


jansel · 2023-08-30T17:32:32Z

@pytorchbot merge

pytorchmergebot · 2023-08-30T17:37:08Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


pytorchmergebot · 2023-08-30T17:37:13Z

Merge failed

Reason: 1 jobs have failed, first few of them are: inductor / cuda12.1-py3.10-gcc9-sm86 / test (inductor_distributed, 1, 1, linux.g5.12xlarge.nvidia.gpu)


jansel · 2023-08-30T17:45:46Z

@pytorchbot rebase

pytorchmergebot · 2023-08-30T18:58:43Z

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here


pytorchmergebot · 2023-08-30T18:58:59Z

Successfully rebased gh/jansel/170/orig onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via ghstack checkout https://github.com/pytorch/pytorch/pull/108172)

ghstack-source-id: b91f2e7
Pull Request resolved: #108172

jansel · 2023-08-30T18:59:52Z

@pytorchbot merge

pytorchmergebot · 2023-08-30T19:05:14Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


[inductor] Skip test_indirect_device_assert on RTX3090

733fb55

[ghstack-poisoned]

This was referenced Aug 29, 2023

[inductor] Fix inputs with existing offsets #108168

Closed

[inductor] Replace empty_strided with empty in simple cases #108169

Closed

github-actions bot added module: inductor module: dynamo ciflow/inductor labels Aug 29, 2023

jansel requested a review from peterbell10 August 29, 2023 18:20

peterbell10 reviewed Aug 29, 2023


jansel changed the title ~~[inductor] Skip test_indirect_device_assert on RTX3090~~ [inductor] Fix MKL issue with test_indirect_device_assert Aug 29, 2023

pytorch-bot bot added the topic: not user facing topic category label Aug 29, 2023

jansel requested a review from peterbell10 August 29, 2023 22:18

peterbell10 approved these changes Aug 29, 2023


pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Aug 30, 2023

pytorchmergebot added the merging label Aug 30, 2023

pytorchmergebot removed the merging label Aug 30, 2023

pytorchmergebot pushed a commit that referenced this pull request Aug 30, 2023

[inductor] Fix MKL issue with test_indirect_device_assert

b389927

ghstack-source-id: b91f2e7
Pull Request resolved: #108172

pytorchmergebot added the merging label Aug 30, 2023

pytorchmergebot added Merged and removed merging labels Aug 30, 2023

pytorchmergebot closed this in 8a089f6 Aug 30, 2023

facebook-github-bot deleted the gh/jansel/170/head branch September 3, 2023 14:24
