8000 [ROCm] Fixes and improvements to CUDA->HIP flag conversion for CPP extensions by naromero77amd · Pull Request #149245 · pytorch/pytorch · GitHub
[go: up one dir, main page]

Skip to content

[ROCm] Fixes and improvements to CUDA->HIP flag conversion for CPP extensions #149245

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

naromero77amd
Copy link
Collaborator
@naromero77amd naromero77amd commented Mar 15, 2025

Fixes ROCm/hip#3764.

Fixes and improvements to CUDA->HIP flag conversion for CPP extensions

  • Log flag conversion for debugging purposes.
  • Fix cases where it should not touch the -I flags or cases where CUDA appears more than once by replacing only the first instance.
  • Fix case where nvcc key may not exist
  • Fix case where hipify should ignore flag values and only touch the flag itself

cc @jeffdaily @sunway513 @jithunnair-amd @pruthvistony @ROCmSupport @dllehr-amd @jataylo @hongxiayang

Copy link
pytorch-bot bot commented Mar 15, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/149245

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit f6e1b52 with merge base 790f93d (image):

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added module: rocm AMD GPU support for Pytorch ciflow/rocm Trigger "default" config CI on ROCm labels Mar 15, 2025
@naromero77amd naromero77amd requested a review from jeffdaily March 15, 2025 01:59
@naromero77amd naromero77amd added the topic: not user facing topic category label Mar 15, 2025
@zou3519 zou3519 added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Mar 17, 2025
@pytorch-bot pytorch-bot bot removed the ciflow/rocm Trigger "default" config CI on ROCm label Mar 17, 2025
Copy link
linux-foundation-easycla bot commented Mar 17, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: Qubitium / name: Qubitium-ModelCloud (f6e1b52)
  • ✅ login: naromero77amd / name: Nichols A. Romero (62adf35)

@jeffdaily
Copy link
Collaborator

@Qubitium please sign the CLA so we can merge this PR.

@Qubitium
Copy link
Contributor
Qubitium commented Mar 17, 2025

@jeffdaily @naromero77amd I was not able to count on ROCm/torch CI (broken) to run to check for CI regressions. Please make sure to compile full torch with the code and check/validate. Thanks.

@Qubitium
Copy link
Contributor

@naromero77amd @jeffdaily I think it would be good to update the PR notes and change

A user convenience.

To something more accurate for PR history. Because this message makes it sound like it's user's fault. And devs would say this is actually a missing piece of the hipify process and it should've been doing all along. Hipify, the process, overrides everything cuda, including compilers, so it makes total sense from developer's point of view that everything includes compiler flags.

@naromero77amd naromero77amd changed the title [ROCm] simple hipify of nvcc flags for CUDAExtension [ROCm] Fixes and improvements to hipify for CUDAExtension Mar 17, 2025
@pytorch-bot pytorch-bot bot added the ciflow/rocm Trigger "default" config CI on ROCm label Mar 17, 2025
@naromero77amd naromero77amd changed the title [ROCm] Fixes and improvements to hipify for CUDAExtension [ROCm] Fixes and improvements to HIP->CUDA for CPP Extension Mar 17, 2025
@naromero77amd
Copy link
Collaborator Author

@Qubitium Updated description. As a co-author you should also be able to make edits to the title and/or description.

@naromero77amd @jeffdaily I think it would be good to update the PR notes and change

A user convenience.

To something more accurate for PR history. Because this message makes it sound like it's user's fault. And devs would say this is actually a missing piece of the hipify process and it should've been doing all along. Hipify, the process, overrides everything cuda, including compilers, so it makes total sense from developer's point of view that everything includes compiler flags.

@naromero77amd naromero77amd changed the title [ROCm] Fixes and improvements to HIP->CUDA for CPP Extension [ROCm] Fixes and improvements to CUDA->HIP flag conversion for CPP Extension Mar 17, 2025
@naromero77amd naromero77amd changed the title [ROCm] Fixes and improvements to CUDA->HIP flag conversion for CPP Extension [ROCm] Fixes and improvements to CUDA->HIP flag conversion for CPP extensions Mar 17, 2025
@naromero77amd
Copy link
Collaborator Author

@pytorchbot rebase

@pytorchmergebot
Copy link
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

naromero77amd and others added 2 commits March 17, 2025 21:41
I made improvements to the exising PR to that fixes CUDA to HIP flag
conversion.

* Log: We should log this behavior since this can break hipcc down the
line for all we know
* Log: Make it transparent to user what hipify actions are doing so devs
are not left in the dark.
* Fix cases where it should not touch the `-I` flags or case where CUDA
appear more than once by replacing only the first instance.
* Fix case where `nvcc` key may not exist 
* Fix case where hipify should ignore flag values and only touch the
flag itself

@naromero77amd 

cc @jeffdaily @sunway513 @jithunnair-amd @pruthvistony @ROCmSupport
@dllehr-amd @jataylo @hongxiayang

Passes the following unit test

```py
import unittest
import sys
from unittest.mock import Mock

class TestHipifyCompileFlags(unittest.TestCase):
    def test_hipify_compile_flags(self):

        class DummyClass:
            # Simple hipify, replace the first occurrence of CUDA with HIP
            # in flags starting with "-" and containing "CUDA", but exclude -I flags
            def _hipify_compile_flags(self, extension):
                if isinstance(extension.extra_compile_args, dict) and 'nvcc' in extension.extra_compile_args:
                    modified_flags = []
                    for flag in extension.extra_compile_args['nvcc']:
                        if flag.startswith("-") and "CUDA" in flag and not flag.startswith("-I"):
                            # check/split flag into flag and value
                            parts = flag.split("=", 1)
                            if len(parts) == 2:
                                flag_part, value_part = parts
                                # replace fist instance of "CUDA" with "HIP" only in the flag and not flag value
                                modified_flag_part = flag_part.replace("CUDA", "HIP", 1)
                                modified_flag = f"{modified_flag_part}={value_part}"
                            else:
                                # replace fist instance of "CUDA" with "HIP" in flag
                                modified_flag = flag.replace("CUDA", "HIP", 1)
                            modified_flags.append(modified_flag)
                            print(f'Modified flag: {flag} -> {modified_flag}', file=sys.stderr)
                        else:
                            modified_flags.append(flag)
                    extension.extra_compile_args['nvcc'] = modified_flags

        dummy = DummyClass()
        extension = Mock()

        # (tested flag, expected flag)
        test_cases = [
            ('-CUDA', '-HIP'),
            ('--CUDA', '--HIP'),
            ('--CUDA_CUDA', '--HIP_CUDA'),  # only first CUDA instance
            ('-U__CUDA', '-U__HIP'),
            ('-D__CUDA', '-D__HIP'),
            ('-I/usr/local/CUDA', '-I/usr/local/CUDA'),  # skip includes
            ('-Dmyflag=CUDA', '-Dmyflag=CUDA'),  # should not be touched
        ]

        tested_flags = [test_case[0] for test_case in test_cases]
        expected_flags = [test_case[1] for test_case in test_cases]

        # test `nvcc` key exists
        extension.extra_compile_args = {'nvcc': tested_flags}
        dummy._hipify_compile_flags(extension)
        self.assertEqual(extension.extra_compile_args['nvcc'], expected_flags)

        # test `nvcc` key does not exists
        test_cases = {'abcd': ["-DTest", "--TEST"]}
        extension.extra_compile_args = test_cases
        dummy._hipify_compile_flags(extension)
        # Assert the result
        self.assertEqual(extension.extra_compile_args['abcd'], test_cases['abcd'])

if __name__ == '__main__':
    unittest.main()
```
@pytorchmergebot
Copy link
Collaborator

Successfully rebased rocm_hipify_flags onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout rocm_hipify_flags && git pull --rebase)

@pytorch-bot pytorch-bot bot removed the ciflow/rocm Trigger "default" config CI on ROCm label Mar 17, 2025
@jeffdaily jeffdaily added the ciflow/rocm Trigger "default" config CI on ROCm label Mar 18, 2025
@naromero77amd
Copy link
Collaborator Author

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Mar 18, 2025
@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@naromero77amd naromero77amd deleted the rocm_hipify_flags branch March 18, 2025 19:44
@jeffdaily
Copy link
Collaborator

@pytorchbot cherry-pick --onto release/2.7 -c critical

pytorchbot pushed a commit that referenced this pull request Mar 18, 2025
…tensions (#149245)

Fixes ROCm/hip#3764.

Fixes and improvements to CUDA->HIP flag conversion for CPP extensions

- Log flag conversion for debugging purposes.
- Fix cases where it should not touch the -I flags or cases where CUDA appears more than once by replacing only the first instance.
- Fix case where nvcc key may not exist
- Fix case where hipify should ignore flag values and only touch the flag itself

Pull Request resolved: #149245
Approved by: https://github.com/jeffdaily

Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
(cherry picked from commit c0566e0)
@pytorchbot
Copy link
Collaborator

Cherry picking #149245

The cherry pick PR is at #149432 and it is recommended to link a critical cherry pick PR with an issue. The following tracker issues are updated:

Details for Dev Infra team Raised by workflow job

malfet pushed a commit that referenced this pull request Mar 27, 2025
…tensions (#149432)

[ROCm] Fixes and improvements to CUDA->HIP flag conversion for CPP extensions (#149245)

Fixes ROCm/hip#3764.

Fixes and improvements to CUDA->HIP flag conversion for CPP extensions

- Log flag conversion for debugging purposes.
- Fix cases where it should not touch the -I flags or cases where CUDA appears more than once by replacing only the first instance.
- Fix case where nvcc key may not exist
- Fix case where hipify should ignore flag values and only touch the flag itself

Pull Request resolved: #149245
Approved by: https://github.com/jeffdaily

Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
(cherry picked from commit c0566e0)

Co-authored-by: Nichols A. Romero <nick.romero@amd.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ciflow/rocm Trigger "default" config CI on ROCm ciflow/trunk Trigger trunk jobs on your pull request Merged module: rocm AMD GPU support for Pytorch open source topic: not user facing topic category triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Issue] Incorrect mapping nvcc flags to hipcc flags
6 participants
0