Debugging CI Failures without SSH Access

clee2000 edited this page Aug 19, 2024 · 2 revisions

Debugging without SSH Access

Linux CPU job

  1. Install Docker on an x86 machine.
  2. In the CI job, find the step titled “Use following to pull public copy of the image”. It contains a command to pull the docker image. Pull and run the image (for example, docker run --rm -it ghcr.io/pytorch/ci-image:pytorch-linux-focal-cuda12.1-cudnn9-py3-gcc9-93520d5082026249ce8ae0413d61e4891366a9df). The ghcr containers should be public, but firewalls and VPNs might cause permission errors.
  3. Find the wheel for your job: go to the HUD page for your commit, search for “Expand to see all artifacts”, and look for the build that corresponds to the test. The artifacts should also be publicly available.
  4. Inside the docker container, in the jenkins folder (this should be the home folder), download the build artifact from its link and unzip it. Install the wheel from the dist folder using pip.
  5. Clone pytorch and check out the corresponding sha (it can be found at the bottom of the “Checkout PyTorch” step in the CI job).
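
The steps above can be sketched as a few shell functions. This is only an illustration, not an official script: the function names, the ARTIFACT_URL and SHA placeholders, and the artifacts.zip file name are hypothetical, and it assumes curl and unzip are available inside the image. start_container is meant to run on the host; the other two are meant to be pasted into the shell inside the container.

```shell
# Image tag copied from the example above; use the one printed by your CI job.
IMAGE="ghcr.io/pytorch/ci-image:pytorch-linux-focal-cuda12.1-cudnn9-py3-gcc9-93520d5082026249ce8ae0413d61e4891366a9df"
# Hypothetical placeholders: fill these in from the HUD page and the CI log.
ARTIFACT_URL="<build artifact link from the HUD page>"
SHA="<sha from the Checkout PyTorch step>"

# Step 2 (on the host): pull the public image and open an interactive shell in it.
start_container() {
  docker pull "$IMAGE"
  docker run --rm -it "$IMAGE"
}

# Step 4 (inside the container): download the build artifact, unzip it,
# and install the wheel from the dist folder.
# Assumes curl and unzip exist in the image; the zip file name is arbitrary.
install_wheel() {
  curl -L -o artifacts.zip "$ARTIFACT_URL"
  unzip artifacts.zip
  pip install dist/*.whl
}

# Step 5 (inside the container): clone pytorch and check out the commit under test.
checkout_source() {
  git clone https://github.com/pytorch/pytorch.git
  cd pytorch
  git checkout "$SHA"
}
```

Nothing runs until a function is called, so the values can be edited first and each step invoked by name once the previous one has succeeded.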

Note: Tests are usually run with pytest <test file>.py -k <test name> or python <test file>.py -k <test name>.
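
As a small illustration of the two invocations, the wrappers below take a test file and a test name and pass them through unchanged. The wrapper names are hypothetical, and the test name in the usage comment is a made-up placeholder; substitute the file and test that failed in CI.

```shell
# Run a single test through pytest: <test file> -k <test name>
run_with_pytest() {
  pytest "$1" -k "$2"
}

# Run the same test by executing the test file directly with python.
run_with_python() {
  python "$1" -k "$2"
}

# Usage, from the root of the pytorch checkout (test name is a placeholder):
#   run_with_pytest test/test_torch.py test_some_failing_case
#   run_with_python test/test_torch.py test_some_failing_case
```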
