[ML] Upgrade to Pytorch 2.1.2 and zlib 1.2.13 by edsavage · Pull Request #2588 · elastic/ml-cpp · GitHub

[ML] Upgrade to Pytorch 2.1.2 and zlib 1.2.13 #2588

Merged 17 commits on Jan 10, 2024
Update build instructions for linux and windows
edsavage committed Oct 24, 2023
commit 6f7ca5cc1c5fd796f72aadc430e904361a17f579
18 changes: 12 additions & 6 deletions build-setup/linux.md
@@ -395,26 +395,32 @@ git submodule update --init --recursive
git config --global --add safe.directory `pwd`
```

IPEX expects that PyTorch build directory contains a file `build-version`, which contains the PyTorch version string. This file is not created by the PyTorch build process, so we need to create it manually:
IPEX expects that PyTorch build directory contains a file `build-hash`, which contains the PyTorch git revision. This file is not created by the PyTorch build process, so we need to create it manually:
```bash
echo "2.1.0+cpu\n" > ${PYTORCH_SRC_DIR}/torch/build-version
(cd ${PYTORCH_SRC_DIR}/torch/ && git rev-parse HEAD > build-hash)
```

This assumes that you have cloned PyTorch in the directory `${PYTORCH_SRC_DIR}` in the step above. Make sure that this path is correct.
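
A quick sanity check on the file created above can catch a wrong `${PYTORCH_SRC_DIR}` early. The following is a minimal sketch and not part of the PR: `is_git_sha` is a hypothetical helper that only checks whether a string looks like a full 40-character hexadecimal revision, which is what `git rev-parse HEAD` prints in a default SHA-1 repository.

```bash
# Hypothetical helper (not from the PR): succeeds if $1 looks like a
# full 40-character hexadecimal git revision.
is_git_sha() {
  printf '%s\n' "$1" | grep -Eq '^[0-9a-f]{40}$'
}

# Usage sketch: only runs when the file from the step above exists.
hash_file="${PYTORCH_SRC_DIR:-}/torch/build-hash"
if [ -f "$hash_file" ]; then
  if is_git_sha "$(cat "$hash_file")"; then
    echo "build-hash looks valid"
  else
    echo "build-hash does not contain a git revision" >&2
  fi
fi
```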

Building IPEX requires a lot of memory. To reduce this requirement, we can patch the IPEX build system to lower the number of parallel processes. We use `sed` to replace the call to `multiprocessing.cpu_count()` with a constant `1` in the file `setup.py`:
Building IPEX requires a lot of memory. To reduce this requirement, we can patch the MAX_JOBS environment variable to lower the number of parallel processes:
```bash
sed -i 's/multiprocessing.cpu_count()/1/g' setup.py
export MAX_JOBS=1
```
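
Setting `MAX_JOBS=1` is the most conservative choice. On a machine with more memory a larger value may be workable; the sketch below derives a job count from the CPU count and the available memory. The helper name `pick_max_jobs` and the per-job memory figure are assumptions for illustration only, not values taken from the IPEX or ml-cpp documentation.

```bash
# Hypothetical helper: print min(cpus, mem_gb / gb_per_job), but at least 1.
# Arguments: number of CPUs, total memory in GB, assumed GB needed per job.
pick_max_jobs() {
  cpus=$1
  mem_gb=$2
  gb_per_job=$3
  jobs=$((mem_gb / gb_per_job))
  if [ "$jobs" -gt "$cpus" ]; then
    jobs=$cpus
  fi
  if [ "$jobs" -lt 1 ]; then
    jobs=1
  fi
  echo "$jobs"
}

# Example: 8 CPUs, 16 GB of memory, an assumed 8 GB per compile job.
MAX_JOBS=$(pick_max_jobs 8 16 8)
export MAX_JOBS
echo "MAX_JOBS=$MAX_JOBS"   # prints MAX_JOBS=2
```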

IPEX expects that the `blas-devel` library package be installed:
```bash
yum install blas-devel.x86_64
Review comment (Contributor):

This needs some due diligence. The library that gets installed here is libblas.so.3. Does IPEX end up requiring that at runtime? If so then that's a problem as we cannot guarantee that it will be present on end user systems.

Or if IPEX doesn't need to link libblas at runtime, why is it needed at build time? Are there environment variables that can be set to say not to use this?

```

Finally, we can build IPEX:
```bash
export CC=/usr/local/gcc103/bin/gcc
export TORCH_VERSION="v2.1.0"
export TORCH_IPEX_VERSION="2.1.0+cpu"
export IPEX_VERSION="2.1.0+cpu"
export LIBTORCH_PATH=/usr/src/pytorch/torch
/usr/local/gcc103/bin/python3.10 -m pip install -r requirements.txt
/usr/local/gcc103/bin/python3.10 setup.py clean
/usr/local/gcc103/bin/python3.10 setup.py build_clib ${PYTORCH_SRC_DIR}/torch
/usr/local/gcc103/bin/python3.10 setup.py develop
cp build/Release/packages/intel_extension_for_pytorch/lib/libintel-ext-pt-cpu.so /usr/local/gcc103/lib
```
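
The review question above, whether IPEX ends up requiring `libblas.so.3` at runtime, can be investigated by listing the `NEEDED` entries of the built library. This is a minimal sketch, assuming `readelf` from binutils is available; `needed_libs` is a hypothetical helper that filters `readelf -d` output, and the library path is the copy destination used in the build step.

```bash
# Hypothetical helper: read `readelf -d` output on stdin and print the
# names of the shared libraries listed as NEEDED, one per line.
needed_libs() {
  sed -n 's/.*(NEEDED).*\[\(.*\)\]$/\1/p'
}

# Usage sketch: inspect the library copied in the build step, if present.
lib=/usr/local/gcc103/lib/libintel-ext-pt-cpu.so
if [ -f "$lib" ] && command -v readelf >/dev/null 2>&1; then
  if readelf -d "$lib" | needed_libs | grep -q '^libblas'; then
    echo "libblas is a direct runtime dependency of $lib"
  else
    echo "no direct libblas dependency found in $lib"
  fi
fi
```

A direct `NEEDED` entry is only part of the picture: a dependency can also be pulled in transitively or loaded with `dlopen`, so an empty result is not conclusive on its own.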

10 changes: 5 additions & 5 deletions build-setup/windows.md
@@ -73,17 +73,17 @@ Install it mainly using the default options _except_ on the "Install Options" di

Whilst it is possible to download a pre-built version of `zlib1.dll`, for consistency we want one that links against the Visual Studio 2019 C runtime library. Therefore it is necessary to build zlib from source.

Download the source code from <http://zlib.net/> - the file is called `zlib1212.zip`. Unzip this file under `C:\tools`, so that you end up with a directory called `C:\tools\zlib-1.2.12`.
Download the source code from <http://zlib.net/> - the file is called `zlib1213.zip`. Unzip this file under `C:\tools`, so that you end up with a directory called `C:\tools\zlib-1.2.13`.
Review comment (Contributor):

This means 3rd_party/licenses/zlib-INFO.csv needs updating too.
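
One way to catch leftover references like the one flagged above is to search the tree for the old version string. This is a minimal sketch; `find_stale_version` is a hypothetical helper, and the directories in the usage comment are examples based on paths mentioned in this PR.

```bash
# Hypothetical helper: list files under the given paths that still mention
# an old version string (matched literally, not as a regular expression).
find_stale_version() {
  old=$1
  shift
  grep -rlF -- "$old" "$@" 2>/dev/null
}

# Usage sketch, run from the repository root:
#   find_stale_version 1.2.12 3rd_party/licenses build-setup
```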


To build, start a command prompt using Start Menu -&gt; Apps -&gt; Visual Studio 2019 -&gt; x64 Native Tools Command Prompt for VS 2019, then in it type:

```
cd \tools\zlib-1.2.12
cd \tools\zlib-1.2.13
nmake -f win32/Makefile.msc LOC="-D_WIN32_WINNT=0x0601"
nmake -f win32/Makefile.msc test
```

All the build output will end up in the top level `C:\tools\zlib-1.2.12` directory. Once the build is complete, copy `zlib1.dll` and `minigzip.exe` to `C:\usr\local\bin`. Copy `zlib.lib` and `zdll.lib` to `C:\usr\local\lib`. And copy `zlib.h` and `zconf.h` to `C:\usr\local\include`.
All the build output will end up in the top level `C:\tools\zlib-1.2.13` directory. Once the build is complete, copy `zlib1.dll` and `minigzip.exe` to `C:\usr\local\bin`. Copy `zlib.lib` and `zdll.lib` to `C:\usr\local\lib`. And copy `zlib.h` and `zconf.h` to `C:\usr\local\include`.

### libxml2

@@ -147,8 +147,8 @@ Start a command prompt using Start Menu -&gt; Apps -&gt; Visual Studio 2019 -&gt
```
cd \tools\boost_1_83_0
bootstrap.bat
b2 -j6 --layout=versioned --disable-icu --toolset=msvc-14.2 cxxflags="-std:c++17" linkflags="-std:c++17" --build-type=complete -sZLIB_INCLUDE="C:\tools\zlib-1.2.12" -sZLIB_LIBPATH="C:\tools\zlib-1.2.12" -sZLIB_NAME=zdll --without-context --without-coroutine --without-graph_parallel --without-mpi --without-python architecture=x86 address-model=64 optimization=speed inlining=full define=BOOST_MATH_NO_LONG_DOUBLE_MATH_FUNCTIONS define=BOOST_LOG_WITHOUT_DEBUG_OUTPUT define=BOOST_LOG_WITHOUT_EVENT_LOG define=BOOST_LOG_WITHOUT_SYSLOG define=BOOST_LOG_WITHOUT_IPC define=_WIN32_WINNT=0x0601
b2 install --prefix=C:\usr\local --layout=versioned --disable-icu --toolset=msvc-14.2 cxxflags="-std:c++17" linkflags="-std:c++17" --build-type=complete -sZLIB_INCLUDE="C:\tools\zlib-1.2.12" -sZLIB_LIBPATH="C:\tools\zlib-1.2.12" -sZLIB_NAME=zdll --without-context --without-coroutine --without-graph_parallel --without-mpi --without-python architecture=x86 address-model=64 optimization=speed inlining=full define=BOOST_MATH_NO_LONG_DOUBLE_MATH_FUNCTIONS define=BOOST_LOG_WITHOUT_DEBUG_OUTPUT define=BOOST_LOG_WITHOUT_EVENT_LOG define=BOOST_LOG_WITHOUT_SYSLOG define=BOOST_LOG_WITHOUT_IPC define=_WIN32_WINNT=0x0601
b2 -j6 --layout=versioned --disable-icu --toolset=msvc-14.2 cxxflags="-std:c++17" linkflags="-std:c++17" --build-type=complete -sZLIB_INCLUDE="C:\tools\zlib-1.2.13" -sZLIB_LIBPATH="C:\tools\zlib-1.2.13" -sZLIB_NAME=zdll --without-context --without-coroutine --without-graph_parallel --without-mpi --without-python architecture=x86 address-model=64 optimization=speed inlining=full define=BOOST_MATH_NO_LONG_DOUBLE_MATH_FUNCTIONS define=BOOST_LOG_WITHOUT_DEBUG_OUTPUT define=BOOST_LOG_WITHOUT_EVENT_LOG define=BOOST_LOG_WITHOUT_SYSLOG define=BOOST_LOG_WITHOUT_IPC define=_WIN32_WINNT=0x0601
b2 install --prefix=C:\usr\local --layout=versioned --disable-icu --toolset=msvc-14.2 cxxflags="-std:c++17" linkflags="-std:c++17" --build-type=complete -sZLIB_INCLUDE="C:\tools\zlib-1.2.13" -sZLIB_LIBPATH="C:\tools\zlib-1.2.13" -sZLIB_NAME=zdll --without-context --without-coroutine --without-graph_parallel --without-mpi --without-python architecture=x86 address-model=64 optimization=speed inlining=full define=BOOST_MATH_NO_LONG_DOUBLE_MATH_FUNCTIONS define=BOOST_LOG_WITHOUT_DEBUG_OUTPUT define=BOOST_LOG_WITHOUT_EVENT_LOG define=BOOST_LOG_WITHOUT_SYSLOG define=BOOST_LOG_WITHOUT_IPC define=_WIN32_WINNT=0x0601
```

The Boost headers and appropriate libraries should end up in `C:\usr\local\include` and `C:\usr\local\lib` respectively.
11 changes: 5 additions & 6 deletions dev-tools/docker/linux_image/Dockerfile
@@ -19,7 +19,7 @@ MAINTAINER David Roberts <dave.roberts@elastic.co>
# libffi is required for building Python
RUN \
rm /var/lib/rpm/__db.* && \
yum install -y bzip2 gcc gcc-c++ git libffi-devel make texinfo unzip wget which xz zip zlib-devel
yum install -y bzip2 gcc gcc-c++ git libffi-devel make texinfo unzip wget which xz zip zlib-devel blas-devel.x86_64
Review comment (Contributor):

This list is no longer in alphabetical order.

But also, please investigate my other comment about whether this is really needed, if so why, and is it causing a runtime complication.
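
The ordering point can be checked mechanically. This is a minimal sketch, assuming a standard `sort`; `is_sorted_list` is a hypothetical helper that takes the package names as arguments and compares them byte-wise under the C locale.

```bash
# Hypothetical helper: succeed if the arguments are in ascending order
# under the C locale (byte-wise comparison).
is_sorted_list() {
  printf '%s\n' "$@" | LC_ALL=C sort -c 2>/dev/null
}

# The package list as changed in this PR, with the new package appended.
if is_sorted_list bzip2 gcc gcc-c++ git libffi-devel make texinfo unzip \
                  wget which xz zip zlib-devel blas-devel.x86_64; then
  echo "sorted"
else
  echo "not sorted"   # this branch is taken: blas-devel sorts before bzip2
fi
```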


# For compiling with hardening and optimisation
ENV CFLAGS "-g -O3 -fstack-protector -D_FORTIFY_SOURCE=2 -msse4.2 -mfpmath=sse"
@@ -174,13 +174,12 @@ RUN \
git config --global --add safe.directory `pwd` && \
/usr/local/bin/python3.10 -m pip install -r requirements.txt && \
/usr/local/bin/python3.10 setup.py clean && \
echo "2.1.0+cpu\n" > ${build_dir}/pytorch/torch/build-version && \
(cd ${build_dir}/pytorch/torch/ && git rev-parse HEAD > build-hash) && \
export CC=/usr/local/gcc103/bin/gcc && \
export TORCH_VERSION="v2.1.0" && \
export TORCH_IPEX_VERSION="2.1.0+cpu" && \
# building ipex is memory hungry, so we need to limit the number of jobs
sed -i 's/multiprocessing.cpu_count()/1/g' setup.py && \
/usr/local/bin/python3.10 setup.py build_clib ${build_dir}/pytorch/torch && \
export IPEX_VERSION="2.1.0+cpu" && \
export LIBTORCH_PATH=${build_dir}/pytorch/torch && \
/usr/local/bin/python3.10 setup.py develop && \
cd ${build_dir}/pytorch && \
mkdir /usr/local/gcc103/include/pytorch && \
cp -r torch/include/* /usr/local/gcc103/include/pytorch/ && \