
Request to cherrypick a fix into v1.13.1 (v1.8 has a CVE) #98115


Open
shahsmit1 opened this issue Apr 1, 2023 · 8 comments
Labels
module: binaries — Anything related to official binaries that we release to users
triaged — This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Comments

shahsmit1 commented Apr 1, 2023

🐛 Describe the bug

We ran into a CVE on v1.8; details at https://nvd.nist.gov/vuln/detail/CVE-2022-45907. The CVE is fixed in v1.13.1, but that version has a Bazel bug which blocks us from using it: #92096 (comment).

Since 1.8 -> 2.0.0 is a major version upgrade (which is a risk for us), the request is to cherry-pick the fix for the Bazel issue into 1.13; the fix is #92122.

Versions

N/A

cc @seemethere @malfet

skrajun commented Apr 2, 2023

Unfortunately, it seems there are also issues with torch 2.0 around a circular dependency with triton that was only recently fixed (triton-lang/triton#1374), so we are unable to upgrade to 2.0 either. Right now there is no torch release available to us that contains the CVE fix; we would really appreciate it if #92122 were cherry-picked onto 1.13.1!

shahsmit1 changed the title from "Request to cherrypick a fix into v1.13 (v1.8 has a CVE)" to "Request to cherrypick a fix into v1.13.1 (v1.8 has a CVE)" on Apr 3, 2023
chakpak commented Apr 4, 2023

@malfet this is fast becoming a blocker for our workflow. I would imagine that other users of torch are also facing similar issues due to the CVE and the Bazel bug. An ETA would go a long way toward setting expectations for our teams. Thanks.

malfet added the module: binaries and triaged labels on Apr 4, 2023
malfet commented Apr 4, 2023

@shahsmit1 thank you very much for filing the issue. Please correct me if I'm wrong, but:

- At the moment there are no open CVEs against either v1.13.1 or v2.0.0
- #92096 does not affect any of the binaries shipped in v1.13.1

If the above are correct, then it looks like you are running into #92096 because you are building from source. But in that case, it's extremely easy to apply the fix against the release/1.13 branch and push it to some new branch. Is that the ask?
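
A minimal sketch of the build-from-source path described above, assuming the fix commit is a799ace (the hash is taken from a later comment in this thread); the exact build invocation is an assumption, not a command from this issue:

```bash
# Sketch: apply the #92122 fix on top of the 1.13 release branch.
git clone --branch release/1.13 https://github.com/pytorch/pytorch.git
cd pytorch
git submodule update --init --recursive
git cherry-pick a799ace        # commit from #92122; use the full hash if ambiguous
# Then build a wheel from source as usual, e.g.:
# python setup.py bdist_wheel
```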

chakpak commented Apr 4, 2023

@skrajun can describe the issue with #92096, as he ran into it. We would really like to avoid building from source; it is a rabbit hole, and we struggle to build our own wheel and then ship it to enterprise customers. We would rather use a pre-built binary, which is why we are requesting a v1.13.2 with the cublas/cudnn folder fix.
v2.0.0 is too much of a change for us and a non-starter due to backward compatibility.

skrajun commented Apr 4, 2023

@malfet Thank you for taking a look!

> At the moment there are no open CVEs against either v1.13.1 or v2.0.0

Right - we really want to be on v1.13.1, or try to go directly to v2.0.0 if it's safe to do so, but with 1.13.1 we are affected by #92096 and with v2.0.0 we are affected by triton-lang/triton#1374.

> #92096 does not affect any of the binaries shipped in v1.13.1

I'm not sure this is true - when I build with 1.13.1 today, I run into the issue mentioned in #92096, so I believe #92096 does affect the binaries shipped in v1.13.1. More evidence for this: the 1.13.1 release seems to have been cut on Dec 15, 2022, but #92122 was merged on Jan 23, 2023 (and I can't find any backports). Finally, when building from source, cherry-picking a799ace on top of release/1.13.1 does generate a diff, so I get the feeling this fix isn't on 1.13.1. Thankfully, manually patching the 1.13.1 wheel with the fix seems to have resolved the issues around #92096 for us - but an official release would be really appreciated until we are able to upgrade to 2.0.0 (which is also blocked for the reason mentioned earlier).
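
A rough sketch of that wheel-patching workaround; the wheel filename is taken from a later comment in this thread, while the GitHub .patch URL, the -p1 strip level, and the use of the `wheel` CLI are assumptions, so verify that the patch paths line up with the wheel layout:

```bash
# Sketch: patch the published 1.13.1 wheel with the #92122 fix and repack it.
pip download torch==1.13.1 --no-deps -d .
pip install wheel                                    # provides the `wheel` CLI
wheel unpack torch-1.13.1-cp310-cp310-manylinux1_x86_64.whl -d unpacked
curl -L -o 92122.patch https://github.com/pytorch/pytorch/pull/92122.patch
# The patch is written against the pytorch source tree; it may need a different
# strip level or prefix edits to match the files shipped inside the wheel.
patch -p1 -d unpacked/torch-1.13.1 < 92122.patch
wheel pack unpacked/torch-1.13.1                     # writes a patched .whl
```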

malfet commented Apr 4, 2023

> Right - we really want to be on v1.13.1, or try to go directly to v2.0.0 if it's safe to do so, but with 1.13.1 we are affected by #92096 and with v2.0.0 we are affected by openai/triton#1374.

The Triton dependency fix can easily be cherry-picked into the upcoming release (and it looks like there is demand for it).

> #92096 does not affect any of the binaries shipped in v1.13.1
>
> I'm not sure if this is true - when I build with 1.13.1 today I run into the issue mentioned in #92096.

Can you please share a reproducer, i.e. a pip install or conda install command that references the published torch-1.13.x binaries and results in an unusable package?

emrebayramc commented

I am facing the same problem with bazel + pytorch 1.13.1.
I cannot downgrade to pytorch 1.12.x (which I verified does not have this issue) because it has a security issue that blocks SOC 2 compliance.

The problem seems to be related to the fix in #92122: pytorch + bazel cannot find libcublas.so.11 because it now lives in a different folder.

Any plans to patch 1.13.1 with the fix?

dududko commented May 6, 2025

Faced the same issue with v1.13.1; found a workaround using #92122.

You need to apply this patch the following way:

1. Copy the content of the patch to patches/torch.patch.

2. Add a BUILD file to the patches package:

$ cat patches/BUILD.bazel
exports_files(
    srcs = glob(["*.patch"]),
    visibility = ["//visibility:public"],
)

3. Add a pip.override to MODULE.bazel (a fuller MODULE.bazel sketch follows below):

pip.override(
    file = "torch-1.13.1-cp310-cp310-manylinux1_x86_64.whl",
    patch_strip = 1,
    patches = [
        "@//patches:torch.patch",
    ],
)
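
For context, a minimal sketch of how that pip.override might sit in MODULE.bazel, assuming rules_python's bzlmod pip extension; the rules_python version, hub name, Python version, and requirements file below are placeholders rather than values from this thread:

```starlark
# MODULE.bazel (sketch)
bazel_dep(name = "rules_python", version = "0.31.0")  # placeholder version

pip = use_extension("@rules_python//python/extensions:pip.bzl", "pip")
pip.parse(
    hub_name = "pip",                                 # placeholder hub name
    python_version = "3.10",
    requirements_lock = "//:requirements_lock.txt",   # placeholder lock file
)
pip.override(
    file = "torch-1.13.1-cp310-cp310-manylinux1_x86_64.whl",
    patch_strip = 1,
    patches = ["@//patches:torch.patch"],
)
use_repo(pip, "pip")
```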

UPD1: we will probably have to publish a patched python wheel, since patching takes a significant amount of time (~1 min).
UPD2: another option is to use http_archive with build_file or build_file_content. If you use gazelle, then also add # gazelle:resolve py to tell gazelle to use your target instead of the one from pypi.
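
A rough sketch of the http_archive alternative from UPD2; the repository name, URL, sha256, BUILD content, and gazelle target below are placeholders (http_archive's type, patches, patch_args, and build_file_content attributes are standard Bazel), so adapt them to the actual wheel layout:

```starlark
# WORKSPACE (or via use_repo_rule under bzlmod) - sketch only
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "torch_whl",
    # placeholder URL: point at the real torch 1.13.1 wheel on PyPI
    urls = ["https://files.pythonhosted.org/.../torch-1.13.1-cp310-cp310-manylinux1_x86_64.whl"],
    type = "zip",                           # a wheel is just a zip archive
    sha256 = "",                            # fill in the wheel's checksum
    patches = ["@//patches:torch.patch"],   # same patch as above
    patch_args = ["-p1"],
    build_file_content = """
py_library(
    name = "torch",
    srcs = glob(["torch/**/*.py"]),
    data = glob(["torch/**/*.so*"], allow_empty = True),
    imports = ["."],
    visibility = ["//visibility:public"],
)
""",
)
```

If you use gazelle, a resolve directive along the lines of `# gazelle:resolve py torch @torch_whl//:torch` (placeholder label) tells it to use this target instead of the one from pypi.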
