Running SentenceTransformer encoding step causes Docker containers on Mac (Silicon) to crash with code 139 · Issue #111695 · pytorch/pytorch · GitHub
@sabaimran

Description

🐛 Describe the bug

Hi! Hopefully there isn't a similar issue already open. I couldn't find one after a search through the issues list. Feel free to mark as duplicate/close if it already exists.

I've created this repository with a minimal setup to reproduce the error: https://github.com/sabaimran/repro-torch-bug. You just have to clone it and run docker-compose up to see the error. Basically it runs the script below in a minimal Docker container:

from typing import List
import torch
from langchain.embeddings import HuggingFaceEmbeddings

class EmbeddingsModel:
    def __init__(self):
        self.model_name = "sentence-transformers/multi-qa-MiniLM-L6-cos-v1"
        encode_kwargs = {"normalize_embeddings": True}

        if torch.cuda.is_available():
            # Use CUDA GPU
            device = torch.device("cuda:0")
        elif torch.backends.mps.is_available():
            # Use Apple M1 Metal Acceleration
            device = torch.device("mps")
        else:
            device = torch.device("cpu")

        self.device = device
        model_kwargs = {"device": device}
        self.embeddings_model = HuggingFaceEmbeddings(
            model_name=self.model_name, encode_kwargs=encode_kwargs, model_kwargs=model_kwargs
        )

    def embed_documents(self, docs: List[str]):
        print(f"Using device: {self.device} to embed {len(docs)} documents")
        return self.embeddings_model.embed_documents(docs)

model = EmbeddingsModel()
embeddings = model.embed_documents(["this is a document", "so is this"])
print(f"Created embeddings of length {len(embeddings)}")

If you run this code inside a Docker container (with the appropriate dependencies), the container exits with code 139.
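For readers unfamiliar with the number: under the common shell convention, an exit status above 128 means the process was killed by signal (status - 128), so 139 corresponds to signal 11, a segmentation fault. The helper below is only an illustrative sketch of that arithmetic; `describe_exit_code` is a hypothetical name and is not part of the repro repository.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Translate a container/process exit status into a readable description.

    Assumes the usual "128 + signal number" convention used by shells
    and by Docker when a process dies from a signal.
    """
    if code > 128:
        signum = code - 128
        try:
            # signal.Signals maps the number to its symbolic name.
            return signal.Signals(signum).name
        except ValueError:
            return f"unknown signal {signum}"
    return f"exit status {code}"


# 139 - 128 = 11, which is SIGSEGV on Linux and macOS.
print(describe_exit_code(139))
```

Because a segfault kills the interpreter before any Python exception is raised, no traceback is printed by default. Running the script with `python -X faulthandler` (the standard library's faulthandler module) may help show which Python frame was active when the crash occurred.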

Pinning the torch package to 2.0.1 circumvents the error. See this other relevant issue: docker/for-mac#7016
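As a rough sketch of how the workaround could be enforced at startup, the snippet below checks a version string against the 2.0.1 pin mentioned above. The names `KNOWN_GOOD`, `parse_release`, and `is_pinned_version` are hypothetical and introduced only for illustration; in a real script the string would come from `torch.__version__`.

```python
import re

# The release that, per this report, avoids the crash.
KNOWN_GOOD = (2, 0, 1)


def parse_release(version: str) -> tuple:
    """Extract the leading X.Y.Z part, ignoring suffixes such as '+cpu'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_pinned_version(version: str) -> bool:
    """Return True when the version matches the known-good pin."""
    return parse_release(version) == KNOWN_GOOD


# Example with the two versions mentioned in this issue.
for candidate in ("2.0.1", "2.1.0"):
    print(candidate, is_pinned_version(candidate))
```

The pin itself would normally live in the dependency file used to build the image, for example a `torch==2.0.1` line in `requirements.txt`, assuming the container installs its packages with pip.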

Versions

Collecting environment information...
PyTorch version: 2.1.0
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: macOS 13.2.1 (arm64)
GCC version: Could not collect
Clang version: 14.0.3 (clang-1403.0.22.14.1)
CMake version: version 3.26.4
Libc version: N/A

Python version: 3.11.4 (main, Jul 10 2023, 18:52:37) [Clang 14.0.3 (clang-1403.0.22.14.1)] (64-bit runtime)
Python platform: macOS-13.2.1-arm64-arm-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Apple M2 Pro

Versions of relevant libraries:
[pip3] mypy-extensions==1.0.0
[pip3] numpy==1.26.1
[pip3] torch==2.1.0
[pip3] torchvision==0.16.0
[conda] Could not collect

cc @ezyang @gchanan @zou3519 @kadeng @gujinghui @PenghuiCheng @XiaobingSuper @jianyuh @jgong5 @mingfeima @sanchitintel @ashokei @jingxu10 @min-jean-cho @yanbing-j @Guobing-Chen @Xia-Weiwen @malfet @snadampal @albanD

Metadata

Labels

high priority
module: arm (Related to ARM architectures builds of PyTorch. Includes Apple M1)
module: crash (Problem manifests as a hard crash, as opposed to a RuntimeError)
module: mkldnn (Related to Intel IDEEP or oneDNN (a.k.a. mkldnn) integration)
module: regression (It used to work, and now it doesn't)
triaged (This issue has been looked at a team member, and triaged and prioritized into an appropriate module)
