-
-
Notifications
You must be signed in to change notification settings - Fork 56.2k
Enable multicore CUDA compilation #24382
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Specify the number of threads to be used for compilation to the CUDA compiler NVCC. Option introduced in NVCC 11.2.
Parallel builds with GNU Make run multiple instances of NVCC, therefor there is no need to use the internal parallelism of NVCC.
msbuild compiles the CUDA files sequentially. Enable the internal parallelism of NVCC if msbuild is used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
@cudawarped Could you take a look again. |
@asmorkalov I've had a quick look and it seems that there is a significant improvement in build time when using this modification with Visual Studio. Below is a table of the build time (mm:ss) for cudaimgproc and its dependancies on Windows and Ubuntu for two binary and one ptx output. The build time for VS when setting the
|
Enable multicore CUDA compilation opencv#24382 CUDA source files are compiled single threaded. The option `--threads` was introduced in NVCC 11.2. The option specifies the number of threads to be used for compilation (see [NVIDIA NVCC Documentation](https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#threads-number-t)). With CMake 3.12 the environment variable `CMAKE_BUILD_PARALLEL_LEVEL` was introduced (see [CMake Documentation](https://cmake.org/cmake/help/latest/envvar/CMAKE_BUILD_PARALLEL_LEVEL.html)). This variable is used to set the NVCC `--threads` option. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake
Enable multicore CUDA compilation opencv#24382 CUDA source files are compiled single threaded. The option `--threads` was introduced in NVCC 11.2. The option specifies the number of threads to be used for compilation (see [NVIDIA NVCC Documentation](https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#threads-number-t)). With CMake 3.12 the environment variable `CMAKE_BUILD_PARALLEL_LEVEL` was introduced (see [CMake Documentation](https://cmake.org/cmake/help/latest/envvar/CMAKE_BUILD_PARALLEL_LEVEL.html)). This variable is used to set the NVCC `--threads` option. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake
Enable multicore CUDA compilation opencv#24382 CUDA source files are compiled single threaded. The option `--threads` was introduced in NVCC 11.2. The option specifies the number of threads to be used for compilation (see [NVIDIA NVCC Documentation](https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#threads-number-t)). With CMake 3.12 the environment variable `CMAKE_BUILD_PARALLEL_LEVEL` was introduced (see [CMake Documentation](https://cmake.org/cmake/help/latest/envvar/CMAKE_BUILD_PARALLEL_LEVEL.html)). This variable is used to set the NVCC `--threads` option. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [ ] The feature is well documented and sample code can be built with the project CMake
CUDA source files are compiled single threaded. The option
--threads
was introduced in NVCC 11.2. The option specifies the number of threads to be used for compilation (see NVIDIA NVCC Documentation).With CMake 3.12 the environment variable
CMAKE_BUILD_PARALLEL_LEVEL
was introduced (see CMake Documentation). This variable is used to set the NVCC--threads
option.Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-p 8000 ull-request
Patch to opencv_extra has the same branch name.