[CD] Fix slim-wheel nvjit-link import problem (#141063)
When another CUDA toolkit (say CUDA-12.3) is installed and `LD_LIBRARY_PATH` points to it, `import torch` fails with:
```
ImportError: /usr/local/lib/python3.10/dist-packages/torch/lib/../../nvidia/cusparse/lib/libcusparse.so.12: undefined symbol: __nvJitLinkComplete_12_4, version libnvJitLink.so.12
```
This cannot be worked around by tweaking rpath, because it also depends on the library load order, which is not guaranteed by any linker. Instead, solve it by preloading `nvjitlink` right after the global deps are loaded, by running something along the lines of the following:
```python
if version.cuda in ["12.4", "12.6"]:
    with open("/proc/self/maps") as f:
        _maps = f.read()
    # libtorch_global_deps.so always depends on cudart; check if it's installed via wheel
    if "nvidia/cuda_runtime/lib/libcudart.so" in _maps:
        # If all of the above conditions are met, preload nvjitlink
        _preload_cuda_deps("nvjitlink", "libnvJitLink.so.*[0-9]")
```
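For readers who want to try the idea outside of `torch/__init__.py`, here is a minimal, self-contained sketch of the same approach. It is not the code from this PR: the helper names `wheel_cudart_loaded`, `find_wheel_lib`, and `preload_wheel_lib` are hypothetical, and the `nvidia/<folder>/lib/` layout under each search path is assumed from the paths shown in the error message and the snippet above.

```python
import ctypes
import glob
import os
import sys
from typing import List, Optional


def wheel_cudart_loaded(maps_text: str) -> bool:
    """Return True if the process maps mention a wheel-installed libcudart."""
    return "nvidia/cuda_runtime/lib/libcudart.so" in maps_text


def find_wheel_lib(
    lib_folder: str, lib_pattern: str, search_paths: List[str]
) -> Optional[str]:
    """Return the first file matching nvidia/<lib_folder>/lib/<lib_pattern>."""
    for base in search_paths:
        pattern = os.path.join(base, "nvidia", lib_folder, "lib", lib_pattern)
        candidates = sorted(glob.glob(pattern))
        if candidates:
            return candidates[0]
    return None


def preload_wheel_lib(
    lib_folder: str, lib_pattern: str, search_paths: Optional[List[str]] = None
) -> Optional[str]:
    """Load the wheel's copy of a library by full path, if one is found.

    Loading by full path is meant to get the wheel's copy into the process
    before a library that depends on it triggers a search through
    LD_LIBRARY_PATH.
    """
    paths = sys.path if search_paths is None else search_paths
    lib_path = find_wheel_lib(lib_folder, lib_pattern, paths)
    if lib_path is not None:
        ctypes.CDLL(lib_path)
    return lib_path


if __name__ == "__main__" and sys.platform.startswith("linux"):
    # /proc/self/maps only exists on Linux.
    with open("/proc/self/maps") as f:
        if wheel_cudart_loaded(f.read()):
            preload_wheel_lib("nvjitlink", "libnvJitLink.so.*[0-9]")
```

The check against `/proc/self/maps` keeps the preload from firing when cudart comes from a system toolkit rather than a wheel, in which case the wheel's `nvjitlink` may not be the right one to load.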
Fixes #140797
Pull Request resolved: #141063
Approved by: https://github.com/kit1980
Co-authored-by: Sergii Dymchenko <sdym@meta.com>