torch-2.0.0-rc1 and torch-1.13.1 can not be installed on Ubuntu 20.04 · Issue #91067 · pytorch/pytorch · GitHub

torch-2.0.0-rc1 and torch-1.13.1 can not be installed on Ubuntu 20.04 #91067


Closed

malfet opened this issue Dec 17, 2022 · 11 comments
Assignees
Labels
high priority module: binaries Anything related to official binaries that we release to users triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
Milestone

Comments

@malfet
Contributor
malfet commented Dec 17, 2022

🐛 Describe the bug

  1. Allocate a c5a.4xlarge instance, for example by running:

import boto3

ec2 = boto3.resource("ec2")
rc = ec2.create_instances(
    ImageId="ami-031843d9eaa76ad7a",
    InstanceType="c5a.4xlarge",
    SecurityGroups=["ssh-allworld"],
    KeyName="nshulga-key",
    MinCount=1,
    MaxCount=1,
    BlockDeviceMappings=[{
        "DeviceName": "/dev/sda1",
        "Ebs": {"DeleteOnTermination": True, "VolumeSize": 150, "VolumeType": "standard"},
    }],
)

  2. SSH into the instance and run python3 -mpip install torch
  3. Run python3 -c "import torch"

The above fails with:

$ python3 -c "import torch"
Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/__init__.py", line 172, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/usr/lib/python3.8/ctypes/__init__.py", line 373, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /home/ubuntu/.local/lib/python3.8/site-packages/torch/lib/../../nvidia/cublas/lib/libcublas.so.11: undefined symbol: cublasLtHSHMatmulAlgoInit, version libcublasLt.so.11

Versions

1.13.1, 1.13.0

cc @ezyang @gchanan @zou3519 @seemethere
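A quick way to see which libcublasLt.so.11 the dynamic linker actually picked is to dlopen the pip-installed libcublas the same way torch/__init__.py does and then scan /proc/self/maps (a Linux-only diagnostic sketch; the lib_path below is copied from the traceback above and will differ per machine):

```python
import ctypes

def mapped_libraries(substring):
    """Return paths of shared objects currently mapped into this process
    whose file name contains `substring` (Linux-only: reads /proc/self/maps)."""
    paths = set()
    with open("/proc/self/maps") as f:
        for line in f:
            parts = line.split()
            # File-backed mappings have a 6th field: the path of the object.
            if len(parts) >= 6 and substring in parts[-1]:
                paths.add(parts[-1])
    return sorted(paths)

# Path taken from the traceback above; adjust for your environment.
lib_path = "/home/ubuntu/.local/lib/python3.8/site-packages/nvidia/cublas/lib/libcublas.so.11"
try:
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
    print("libcublasLt resolved from:", mapped_libraries("cublasLt"))
except OSError as e:
    print("dlopen failed:", e)
```

On the failing instance this should reveal whether the mapped libcublasLt comes from site-packages or from /usr/local/cuda/lib64.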

@malfet malfet added high priority module: binaries Anything related to official binaries that we release to users labels Dec 17, 2022
@atalman
Contributor
atalman commented Dec 19, 2022

We have the following validation for this binary: https://github.com/pytorch/builder/actions/runs/3722990678/jobs/6314231329

It works when running inside a conda environment using the following steps:

conda create -n conda-test-mpip python=3.8
conda activate conda-test-mpip
pip3 install torch
python3 -c "import torch"

I was able to reproduce this failure with the steps provided.
Could this be a conflict with the CUDA version already preinstalled on that AMI?

echo $LD_LIBRARY_PATH
/opt/amazon/efa/lib:/opt/amazon/openmpi/lib:/usr/local/cuda/efa/lib:/usr/local/cuda/lib:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64:/usr/local/cuda/targets/x86_64-linux/lib:/usr/local/lib:/usr/lib:
ubuntu@ip-172-31-49-163:~$ ls /usr/local/cuda/lib64
libOpenCL.so               libcudart_static.a            libcusolver.so.10            libnppial_static.a        libnppim.so.11.1.0.245   libnvToolsExt.so.1
libOpenCL.so.1             libcufft.so                   libcusolver.so.10.6.0.245    libnppicc.so              libnppim_static.a        libnvToolsExt.so.1.0.0
libOpenCL.so.1.0           libcufft.so.10                libcusolverMg.so             libnppicc.so.11           libnppist.so             libnvblas.so
libOpenCL.so.1.0.0         libcufft.so.10.2.1.245        libcusolverMg.so.10          libnppicc.so.11.1.0.245   libnppist.so.11          libnvblas.so.11
libaccinj64.so             libcufft_static.a             libcusolverMg.so.10.6.0.245  libnppicc_static.a        libnppist.so.11.1.0.245  libnvblas.so.11.2.0.252
libaccinj64.so.11.0        libcufft_static_nocallback.a  libcusolver_static.a         libnppidei.so             libnppist_static.a       libnvjpeg.so
libaccinj64.so.11.0.221    libcufftw.so                  libcusparse.so               libnppidei.so.11          libnppisu.so             libnvjpeg.so.11
libcublas.so               libcufftw.so.10               libcusparse.so.11            libnppidei.so.11.1.0.245  libnppisu.so.11          libnvjpeg.so.11.1.1.245
libcublas.so.11            libcufftw.so.10.2.1.245       libcusparse.so.11.1.1.245    libnppidei_static.a       libnppisu.so.11.1.0.245  libnvjpeg_static.a
libcublas.so.11.2.0.252    libcufftw_static.a            libcusparse_static.a         libnppif.so               libnppisu_static.a       libnvrtc-builtins.so
libcublasLt.so             libcuinj64.so                 liblapack_static.a           libnppif.so.11            libnppitc.so             libnvrtc-builtins.so.11.0
libcublasLt.so.11          libcuinj64.so.11.0            libmetis_static.a            libnppif.so.11.1.0.245    libnppitc.so.11          libnvrtc-builtins.so.11.0.221
libcublasLt.so.11.2.0.252  libcuinj64.so.11.0.221        libnppc.so                   libnppif_static.a         libnppitc.so.11.1.0.245  libnvrtc.so
libcublasLt_static.a       libculibos.a                  libnppc.so.11                libnppig.so               libnppitc_static.a       libnvrtc.so.11.0
libcublas_static.a         libcurand.so                  libnppc.so.11.1.0.245        libnppig.so.11            libnpps.so               libnvrtc.so.11.0.221
libcudadevrt.a             libcurand.so.10               libnppc_static.a             libnppig.so.11.1.0.245    libnpps.so.11            stubs
libcudart.so               libcurand.so.10.2.1.245       libnppial.so                 libnppig_static.a         libnpps.so.11.1.0.245
libcudart.so.11.0          libcurand_static.a            libnppial.so.11              libnppim.so               libnpps_static.a
libcudart.so.11.0.221      libcusolver.so                libnppial.so.11.1.0.245      libnppim.so.11            libnvToolsExt.so

Running ldd on /home/ubuntu/.local/lib/python3.8/site-packages/nvidia/cublas/lib/libcublas.so.11 shows libcublasLt.so.11 being resolved from /usr/local/cuda/lib64 rather than from the same directory:

 ldd libcublas.so.11 
	linux-vdso.so.1 (0x00007f4ab6e2a000)
	libcublasLt.so.11 => /usr/local/cuda/lib64/libcublasLt.so.11 (0x00007f4aa2a37000)
	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f4aa2a15000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f4aa29f2000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f4aa29ec000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f4aa289d000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f4aa2880000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f4aa268e000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f4ab6e2c000)
	libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f4aa24ac000)

This would explain why we are getting:

ubuntu@ip-172-31-49-163:~$ python3 -c "import torch"
Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/__init__.py", line 172, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/usr/lib/python3.8/ctypes/__init__.py", line 373, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /home/ubuntu/.local/lib/python3.8/site-packages/torch/lib/../../nvidia/cublas/lib/libcublas.so.11: undefined symbol: cublasLtHSHMatmulAlgoInit, version libcublasLt.so.11

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/__init__.py", line 217, in <module>
    _load_global_deps()
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/__init__.py", line 178, in _load_global_deps
    _preload_cuda_deps()
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/__init__.py", line 158, in _preload_cuda_deps
    ctypes.CDLL(cublas_path)
  File "/usr/lib/python3.8/ctypes/__init__.py", line 373, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /home/ubuntu/.local/lib/python3.8/site-packages/nvidia/cublas/lib/libcublas.so.11: undefined symbol: cublasLtHSHMatmulAlgoInit, version libcublasLt.so.11

@atalman
Contributor
atalman commented Dec 19, 2022

Confirmed: this is a conflict with the existing CUDA installation on that image.
Running the following command fixes the issue:

export LD_LIBRARY_PATH=/opt/amazon/efa/lib:/opt/amazon/openmpi/lib:/usr/local/cuda/extras/CUPTI/lib64:/usr/local/lib:/usr/lib:
python3 -c "import torch"
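The same workaround can be scripted by filtering the system CUDA directories out of LD_LIBRARY_PATH (a sketch, not an official fix; note that ld.so reads LD_LIBRARY_PATH at process startup, so the filtered value must be exported before Python is launched):

```python
import os

def strip_system_cuda(ld_library_path):
    """Drop /usr/local/cuda* entries from a colon-separated search path."""
    return ":".join(
        p for p in ld_library_path.split(":")
        if p and not p.startswith("/usr/local/cuda")
    )

# Print the filtered path so a wrapper shell script can export it.
print(strip_system_cuda(os.environ.get("LD_LIBRARY_PATH", "")))
```

Export the printed value as LD_LIBRARY_PATH in the same shell, then run python3 -c "import torch".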

@malfet
Contributor Author
malfet commented Dec 19, 2022

@atalman thank you for your investigation. My point is that the installation experience for pip install torch and pip install --user torch --extra-index-url https://download.pytorch.org/whl/cu117/ should be identical, with the only difference being that one uses CUDA dependencies from another package rather than the same one.

pip install --user torch --extra-index-url https://download.pytorch.org/whl/cu117/ results in an installation that works out of the box, but pip install torch fails, and IMO this is something that needs to be fixed.

@atalman atalman closed this as completed Dec 19, 2022
@atalman atalman reopened this Dec 19, 2022
@mergian
Contributor
mergian commented Dec 20, 2022

This is related to #88882.

@vadimkantorov
Contributor
vadimkantorov commented Jan 16, 2023

It might also be a good idea to have such automated smoke-install testing on several common machines/AMIs on EC2 / other clouds (at least when a new PyTorch release goes out).

This might also suggest that PyTorch should change the suggested installation commands from pip3 to python3 -mpip / python -mpip, which is now the officially recommended way of running pip, if I understand correctly.
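A release-time smoke install along those lines could be as small as installing into a throwaway virtualenv and importing torch (a sketch of the idea only, not the actual pytorch/builder validation job; it downloads several GB, so it belongs on a disposable instance):

```python
import subprocess
import tempfile
import venv

def smoke_install(pip_args):
    """Create a fresh venv, install the given pip arguments into it,
    and return torch.__version__ as reported inside that venv."""
    env_dir = tempfile.mkdtemp(prefix="torch-smoke-")
    venv.create(env_dir, with_pip=True)
    py = f"{env_dir}/bin/python"
    subprocess.run([py, "-mpip", "install", "--quiet", *pip_args], check=True)
    out = subprocess.run(
        [py, "-c", "import torch; print(torch.__version__)"],
        check=True, capture_output=True, text=True,
    )
    return out.stdout.strip()

# Example, run per-AMI in CI (hypothetical invocation):
# print(smoke_install(["torch", "--extra-index-url", "https://download.pytorch.org/whl/cu117/"]))
```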

@malfet malfet added this to the 2.0.0 milestone Feb 17, 2023
@malfet malfet added triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module and removed triage review labels Feb 17, 2023
@malfet
Contributor Author
malfet commented Feb 17, 2023

@atalman, @syed-ahmed: this still happens with RC1:

python3 -mpip install --user torch --extra-index-url https://download.pytorch.org/whl/test/cu117_pypi_cudnn/
$ python3 -c "import torch"
Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/__init__.py", line 168, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/usr/lib/python3.8/ctypes/__init__.py", line 373, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /home/ubuntu/.local/lib/python3.8/site-packages/torch/lib/../../nvidia/cublas/lib/libcublas.so.11: undefined symbol: cublasLtHSHMatmulAlgoInit, version libcublasLt.so.11

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/__init__.py", line 228, in <module>
    _load_global_deps()
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/__init__.py", line 189, in _load_global_deps
    _preload_cuda_deps(lib_folder, lib_name)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torch/__init__.py", line 155, in _preload_cuda_deps
    ctypes.CDLL(lib_path)
  File "/usr/lib/python3.8/ctypes/__init__.py", line 373, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /home/ubuntu/.local/lib/python3.8/site-packages/nvidia/cublas/lib/libcublas.so.11: undefined symbol: cublasLtHSHMatmulAlgoInit, version libcublasLt.so.11

@syed-ahmed
Collaborator

@malfet I wasn't able to reproduce the error from your repro in the comment. Can you check if your LD_LIBRARY_PATH is set and it's not picking up a different libcublas?

The RUNPATH looks good to me, and ldd shows the cublasLt dependency:

root@syed:~/.local/lib/python3.8/site-packages/nvidia/cublas/lib# readelf -d /root/.local/lib/python3.8/site-packages/nvidia/cublas/lib/libcublas.so.11 | grep path
 0x000000000000001d (RUNPATH)            Library runpath: [$ORIGIN]
root@syed:~/.local/lib/python3.8/site-packages/nvidia/cublas/lib# ldd /root/.local/lib/python3.8/site-packages/nvidia/cublas/lib/libcublas.so.11
   linux-vdso.so.1 (0x00007ffed1d75000)
   libcublasLt.so.11 => /root/.local/lib/python3.8/site-packages/nvidia/cublas/lib/libcublasLt.so.11 (0x00007f1afd9d7000)
   librt.so.1 => /usr/lib/x86_64-linux-gnu/librt.so.1 (0x00007f1afd9bf000)
   libpthread.so.0 => /usr/lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f1afd99c000)
   libdl.so.2 => /usr/lib/x86_64-linux-gnu/libdl.so.2 (0x00007f1afd996000)
   libm.so.6 => /usr/lib/x86_64-linux-gnu/libm.so.6 (0x00007f1afd847000)
   libgcc_s.so.1 => /usr/lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f1afd82c000)
   libc.so.6 => /usr/lib/x86_64-linux-gnu/libc.so.6 (0x00007f1afd638000)
   /lib64/ld-linux-x86-64.so.2 (0x00007f1b1abd8000)

I also tried just opening libcublas with ctypes and running it with LD_DEBUG as follows:

LD_DEBUG=libs LD_DEBUG_OUTPUT=out.txt python3 -c 'import ctypes; ctypes.CDLL("/root/.local/lib/python3.8/site-packages/nvidia/cublas/lib/libcublas.so.11", mode=ctypes.RTLD_GLOBAL)'

I do see libcublasLt getting loaded by libcublas:

openat(AT_FDCWD, "/root/.local/lib/python3.8/site-packages/nvidia/cublas/lib/libcublasLt.so.11", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\20 ,\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0644, st_size=332762424, ...}) = 0

So trying to understand what in your environment is different.

@ptrblck
Collaborator
ptrblck commented Feb 17, 2023

Do we have any docker container in CI or public which reproduces it?

@weiwangmeta weiwangmeta changed the title from torch-1.13.1 can not be installed on Ubuntu 20.04 to torch-2.0.0-rc1 and torch-1.13.1 can not be installed on Ubuntu 20.04 Feb 23, 2023
@weiwangmeta
Contributor
weiwangmeta commented Feb 23, 2023

python3 -c "import torch"

I cannot reproduce this either.

(fix_91067) ubuntu@:~$ python3 -mpip install --user torch --extra-index-url https://download.pytorch.org/whl/test/cu117_pypi_cudnn/ 
Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/test/cu117_pypi_cudnn/               
Collecting torch
  Downloading https://download.pytorch.org/whl/test/cu117_pypi_cudnn/torch-2.0.0%2Bcu117.with.pypi.cudnn-cp310-cp310-linux_x86_64.whl (621.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 621.6/621.6 MB 2.9 MB/s eta 0:00:00
Collecting nvidia-cuda-nvrtc-cu11==11.7.99                                                                                                                                                    
  Using cached nvidia_cuda_nvrtc_cu11-11.7.99-2-py3-none-manylinux1_x86_64.whl (21.0 MB)
Collecting nvidia-cuda-runtime-cu11==11.7.99                                                                                                                                                  
  Using cached nvidia_cuda_runtime_cu11-11.7.99-py3-none-manylinux1_x86_64.whl (849 kB)
Collecting networkx
  Using cached networkx-3.0-py3-none-any.whl (2.0 MB)
Collecting nvidia-cudnn-cu11==8.5.0.96
  Using cached nvidia_cudnn_cu11-8.5.0.96-2-py3-none-manylinux1_x86_64.whl (557.1 MB)
Collecting nvidia-curand-cu11==10.2.10.91
  Downloading nvidia_curand_cu11-10.2.10.91-py3-none-manylinux1_x86_64.whl (54.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 54.6/54.6 MB 37.1 MB/s eta 0:00:00
Collecting nvidia-cusolver-cu11==11.4.0.1
  Downloading nvidia_cusolver_cu11-11.4.0.1-2-py3-none-manylinux1_x86_64.whl (102.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 102.6/102.6 MB 20.3 MB/s eta 0:00:00
Collecting typing-extensions
  Downloading typing_extensions-4.5.0-py3-none-any.whl (27 kB)
Collecting nvidia-nvtx-cu11==11.7.91
  Downloading nvidia_nvtx_cu11-11.7.91-py3-none-manylinux1_x86_64.whl (98 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 98.6/98.6 kB 25.7 MB/s eta 0:00:00
Collecting sympy
  Downloading https://download.pytorch.org/whl/test/sympy-1.11.1-py3-none-any.whl (6.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.5/6.5 MB 74.2 MB/s eta 0:00:00
Collecting nvidia-cuda-cupti-cu11==11.7.101
  Downloading nvidia_cuda_cupti_cu11-11.7.101-py3-none-manylinux1_x86_64.whl (11.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.8/11.8 MB 120.8 MB/s eta 0:00:00
Collecting nvidia-cufft-cu11==10.9.0.58
  Downloading nvidia_cufft_cu11-10.9.0.58-py3-none-manylinux1_x86_64.whl (168.4 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 168.4/168.4 MB 13.2 MB/s eta 0:00:00
Collecting nvidia-nccl-cu11==2.14.3
  Downloading nvidia_nccl_cu11-2.14.3-py3-none-manylinux1_x86_64.whl (177.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 177.1/177.1 MB 6.1 MB/s eta 0:00:00
Collecting nvidia-cublas-cu11==11.10.3.66
  Using cached nvidia_cublas_cu11-11.10.3.66-py3-none-manylinux1_x86_64.whl (317.1 MB)
Collecting nvidia-cusparse-cu11==11.7.4.91
  Downloading nvidia_cusparse_cu11-11.7.4.91-py3-none-manylinux1_x86_64.whl (173.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 173.2/173.2 MB 12.5 MB/s eta 0:00:00
Collecting filelock
  Downloading https://download.pytorch.org/whl/test/filelock-3.9.0-py3-none-any.whl (9.7 kB)
Requirement already satisfied: wheel in ./anaconda3/envs/fix_91067/lib/python3.10/site-packages (from nvidia-cublas-cu11==11.10.3.66->torch) (0.38.4)
Requirement already satisfied: setuptools in ./anaconda3/envs/fix_91067/lib/python3.10/site-packages (from nvidia-cublas-cu11==11.10.3.66->torch) (65.6.3)
Collecting mpmath>=0.19
  Downloading https://download.pytorch.org/whl/test/mpmath-1.2.1-py3-none-any.whl (532 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 532.6/532.6 kB 67.4 MB/s eta 0:00:00
Installing collected packages: mpmath, typing-extensions, sympy, nvidia-nvtx-cu11, nvidia-nccl-cu11, nvidia-cusparse-cu11, nvidia-curand-cu11, nvidia-cufft-cu11, nvidia-cuda-runtime-cu11, nvidia-cuda-nvrtc-cu11, nvidia-cuda-cupti-cu11, nvidia-cublas-cu11, networkx, filelock, nvidia-cusolver-cu11, nvidia-cudnn-cu11, torch

(fix_91067) ubuntu@:~$ python3 -c "import torch" 
(fix_91067) ubuntu@:~$ cat /etc/os-release
NAME="Ubuntu"
VERSION="20.04.4 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.4 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal

@atalman
Contributor
atalman commented Feb 27, 2023

Closing this, as it is resolved in nightly by this PR: #95094

Here is ldd for libtorch_cuda, showing libcublasLt being loaded from the same place as libcublas:

ubuntu@ip-172-31-19-42:~/.local/lib/python3.8/site-packages/torch/lib$ ldd libtorch_cuda.so 
	linux-vdso.so.1 (0x00007fff854ee000)
	libc10_cuda.so => /home/ubuntu/.local/lib/python3.8/site-packages/torch/lib/./libc10_cuda.so (0x00007f17f4080000)
	libcudart.so.11.0 => /home/ubuntu/.local/lib/python3.8/site-packages/torch/lib/./../../nvidia/cuda_runtime/lib/libcudart.so.11.0 (0x00007f17f3ddb000)
	libcusparse.so.11 => /home/ubuntu/.local/lib/python3.8/site-packages/torch/lib/./../../nvidia/cusparse/lib/libcusparse.so.11 (0x00007f17e573b000)
	libcurand.so.10 => /home/ubuntu/.local/lib/python3.8/site-packages/torch/lib/./../../nvidia/curand/lib/libcurand.so.10 (0x00007f17dfb41000)
	libcufft.so.10 => /home/ubuntu/.local/lib/python3.8/site-packages/torch/lib/./../../nvidia/cufft/lib/libcufft.so.10 (0x00007f17cec66000)
	libnvToolsExt.so.1 => /home/ubuntu/.local/lib/python3.8/site-packages/torch/lib/./../../nvidia/nvtx/lib/libnvToolsExt.so.1 (0x00007f17cea5c000)
	libcudnn.so.8 => /home/ubuntu/.local/lib/python3.8/site-packages/torch/lib/./../../nvidia/cudnn/lib/libcudnn.so.8 (0x00007f17ce836000)
	libnccl.so.2 => /home/ubuntu/.local/lib/python3.8/site-packages/torch/lib/./../../nvidia/nccl/lib/libnccl.so.2 (0x00007f17c09bb000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f17c0977000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f17c0971000)
	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f17c0967000)
	libc10.so => /home/ubuntu/.local/lib/python3.8/site-packages/torch/lib/./libc10.so (0x00007f17c08a8000)
	libtorch_cpu.so => /home/ubuntu/.local/lib/python3.8/site-packages/torch/lib/./libtorch_cpu.so (0x00007f17a7928000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f17a77d7000)
	libcublas.so.11 => /home/ubuntu/.local/lib/python3.8/site-packages/torch/lib/./../../nvidia/cublas/lib/libcublas.so.11 (0x00007f179e579000)
	libcublasLt.so.11 => /home/ubuntu/.local/lib/python3.8/site-packages/torch/lib/./../../nvidia/cublas/lib/libcublasLt.so.11 (0x00007f178a5d8000)
	libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f178a3f6000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f178a3db000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f178a1e9000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f181a6eb000)
	libgomp-a34b3233.so.1 => /home/ubuntu/.local/lib/python3.8/site-packages/torch/lib/./libgomp-a34b3233.so.1 (0x00007f1789fbd000)
	libcupti.so.11.7 => /home/ubuntu/.local/lib/python3.8/site-packages/torch/lib/./../../nvidia/cuda_cupti/lib/libcupti.so.11.7 (0x00007f17896c2000)
	libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007f17896bd000)

The import torch smoke test also works:

ubuntu@ip-172-31-19-42:~$ python3
Python 3.8.10 (default, Nov 14 2022, 12:59:47) 
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> exit()

@atalman
Contributor
atalman commented Sep 3, 2024

Confirmed working for 2.4.1 on stock Ubuntu 20.04:

python3 -mpip install --user torch --extra-index-url https://download.pytorch.org/whl/test/cu121 --force-reinstall
Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/test/cu121
Collecting torch
  Downloading https://download.pytorch.org/whl/test/cu121/torch-2.4.1%2Bcu121-cp38-cp38-linux_x86_64.whl (798.9 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 798.9/798.9 MB 19.9 MB/s eta 0:00:00
Collecting filelock (from torch)
  Downloading filelock-3.15.4-py3-none-any.whl.metadata (2.9 kB)
Collecting typing-extensions>=4.8.0 (from torch)
  Downloading typing_extensions-4.12.2-py3-none-any.whl.metadata (3.0 kB)
Collecting sympy (from torch)
  Downloading sympy-1.13.2-py3-none-any.whl.metadata (12 kB)
Collecting networkx (from torch)
  Using cached https://download.pytorch.org/whl/test/networkx-3.2.1-py3-none-any.whl (1.6 MB)
Collecting jinja2 (from torch)
  Downloading jinja2-3.1.4-py3-none-any.whl.metadata (2.6 kB)
Collecting fsspec (from torch)
  Downloading fsspec-2024.6.1-py3-none-any.whl.metadata (11 kB)
Collecting nvidia-cuda-nvrtc-cu12==12.1.105 (from torch)
  Downloading https://download.pytorch.org/whl/test/cu121/nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (23.7 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 23.7/23.7 MB 39.8 MB/s eta 0:00:00
Collecting nvidia-cuda-runtime-cu12==12.1.105 (from torch)
  Downloading https://download.pytorch.org/whl/test/cu121/nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (823 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 823.6/823.6 kB 48.5 MB/s eta 0:00:00
Collecting nvidia-cuda-cupti-cu12==12.1.105 (from torch)
  Downloading https://download.pytorch.org/whl/test/cu121/nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (14.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 14.1/14.1 MB 108.4 MB/s eta 0:00:00
Collecting nvidia-cudnn-cu12==9.1.0.70 (from torch)
  Downloading https://download.pytorch.org/whl/test/cu121/nvidia_cudnn_cu12-9.1.0.70-py3-none-manylinux2014_x86_64.whl (664.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 664.8/664.8 MB 34.1 MB/s eta 0:00:00
Collecting nvidia-cublas-cu12==12.1.3.1 (from torch)
  Downloading https://download.pytorch.org/whl/test/cu121/nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl (410.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 410.6/410.6 MB 46.7 MB/s eta 0:00:00
Collecting nvidia-cufft-cu12==11.0.2.54 (from torch)
  Downloading https://download.pytorch.org/whl/test/cu121/nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl (121.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 121.6/121.6 MB 47.6 MB/s eta 0:00:00
Collecting nvidia-curand-cu12==10.3.2.106 (from torch)
  Downloading https://download.pytorch.org/whl/test/cu121/nvidia_curand_cu12-10.3.2.106-py3-none-manylinux1_x86_64.whl (56.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 56.5/56.5 MB 42.7 MB/s eta 0:00:00
Collecting nvidia-cusolver-cu12==11.4.5.107 (from torch)
  Downloading https://download.pytorch.org/whl/test/cu121/nvidia_cusolver_cu12-11.4.5.107-py3-none-manylinux1_x86_64.whl (124.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 124.2/124.2 MB 51.2 MB/s eta 0:00:00
Collecting nvidia-cusparse-cu12==12.1.0.106 (from torch)
  Downloading https://download.pytorch.org/whl/test/cu121/nvidia_cusparse_cu12-12.1.0.106-py3-none-manylinux1_x86_64.whl (196.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 196.0/196.0 MB 59.7 MB/s eta 0:00:00
Collecting nvidia-nccl-cu12==2.20.5 (from torch)
  Downloading https://download.pytorch.org/whl/test/cu121/nvidia_nccl_cu12-2.20.5-py3-none-manylinux2014_x86_64.whl (176.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 176.2/176.2 MB 51.8 MB/s eta 0:00:00
Collecting nvidia-nvtx-cu12==12.1.105 (from torch)
  Downloading https://download.pytorch.org/whl/test/cu121/nvidia_nvtx_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (99 kB)
Collecting triton==3.0.0 (from torch)
  Downloading https://download.pytorch.org/whl/test/triton-3.0.0-1-cp38-cp38-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (209.4 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 209.4/209.4 MB 46.1 MB/s eta 0:00:00
Collecting nvidia-nvjitlink-cu12 (from nvidia-cusolver-cu12==11.4.5.107->torch)
  Downloading nvidia_nvjitlink_cu12-12.6.68-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting MarkupSafe>=2.0 (from jinja2->torch)
  Using cached https://download.pytorch.org/whl/test/MarkupSafe-2.1.5-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (26 kB)
INFO: pip is looking at multiple versions of networkx to determine which version is compatible with other requirements. This could take a while.
Collecting networkx (from torch)
  Downloading networkx-3.1-py3-none-any.whl.metadata (5.3 kB)
Collecting mpmath<1.4,>=1.1.0 (from sympy->torch)
  Using cached https://download.pytorch.org/whl/test/mpmath-1.3.0-py3-none-any.whl (536 kB)
Downloading typing_extensions-4.12.2-py3-none-any.whl (37 kB)
Downloading filelock-3.15.4-py3-none-any.whl (16 kB)
Downloading fsspec-2024.6.1-py3-none-any.whl (177 kB)
Downloading jinja2-3.1.4-py3-none-any.whl (133 kB)
Downloading networkx-3.1-py3-none-any.whl (2.1 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.1/2.1 MB 104.6 MB/s eta 0:00:00
Downloading sympy-1.13.2-py3-none-any.whl (6.2 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.2/6.2 MB 165.3 MB/s eta 0:00:00
Downloading nvidia_nvjitlink_cu12-12.6.68-py3-none-manylinux2014_x86_64.whl (19.7 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 19.7/19.7 MB 151.1 MB/s eta 0:00:00
Installing collected packages: mpmath, typing-extensions, sympy, nvidia-nvtx-cu12, nvidia-nvjitlink-cu12, nvidia-nccl-cu12, nvidia-curand-cu12, nvidia-cufft-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, nvidia-cublas-cu12, networkx, MarkupSafe, fsspec, filelock, triton, nvidia-cusparse-cu12, nvidia-cudnn-cu12, jinja2, nvidia-cusolver-cu12, torch
  WARNING: The script isympy is installed in '/root/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
  WARNING: The scripts proton and proton-viewer are installed in '/root/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
  WARNING: The scripts convert-caffe2-to-onnx, convert-onnx-to-caffe2 and torchrun are installed in '/root/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed MarkupSafe-2.1.5 filelock-3.15.4 fsspec-2024.6.1 jinja2-3.1.4 mpmath-1.3.0 networkx-3.1 nvidia-cublas-cu12-12.1.3.1 nvidia-cuda-cupti-cu12-12.1.105 nvidia-cuda-nvrtc-cu12-12.1.105 nvidia-cuda-runtime-cu12-12.1.105 nvidia-cudnn-cu12-9.1.0.70 nvidia-cufft-cu12-11.0.2.54 nvidia-curand-cu12-10.3.2.106 nvidia-cusolver-cu12-11.4.5.107 nvidia-cusparse-cu12-12.1.0.106 nvidia-nccl-cu12-2.20.5 nvidia-nvjitlink-cu12-12.6.68 nvidia-nvtx-cu12-12.1.105 sympy-1.13.2 torch-2.4.1+cu121 triton-3.0.0 typing-extensions-4.12.2
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager, possibly rendering your system unusable. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv. Use the --root-user-action option if you know what you are doing and want to suppress this warning.
(py38) root@9326773222e3:~/miniconda3/bin# python --version
Python 3.8.19
(py38) root@9326773222e3:~/miniconda3/bin# python3
Python 3.8.19 (default, Mar 20 2024, 19:58:24) 
[GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.__version__
'2.4.1+cu121'
>>> exit()

Projects
None yet
Development

No branches or pull requests

7 participants