Arm64 CI setup with TravisCI by rth · Pull Request #17996 · scikit-learn/scikit-learn · GitHub

Arm64 CI setup with TravisCI #17996

Merged
merged 60 commits into from
Jul 31, 2020
Commits
11193af
Arch64 CI setup with TravisCI
rth Jul 26, 2020
6a9a875
Fix some tests
rth Jul 26, 2020
83d1d15
Debug CPU arch
rth Jul 26, 2020
1c5068a
Try a different aarch key
rth Jul 26, 2020
85648f9
Iter
rth Jul 26, 2020
a319f78
Iter
rth Jul 26, 2020
962e7df
Install cython
rth Jul 26, 2020
ede0d17
Determine the CPU count
rth Jul 26, 2020
862a94b
Use pytest-xdist
rth Jul 26, 2020
9100c73
Improve travis bash scripts
rth Jul 26, 2020
35feb00
Pre-fetch dataset and skip failing GradientBoostingClassifier docstring
rth Jul 26, 2020
fd9a2ae
Install jq
rth Jul 26, 2020
0d1c629
Fix typo
rth Jul 26, 2020
048d4ae
More fixes for CI
rth Jul 26, 2020
323c28a
Better workaround for cache corruption issues
rth Jul 26, 2020
956824e
Fix tab/spaces
rth Jul 26, 2020
6d5b31f
Fix naming of platform.machine() on arm64
rth Jul 26, 2020
702ce16
Don't copy conftest
rth Jul 26, 2020
f9d55c1
Merge branch 'master' of github.com:scikit-learn/scikit-learn into aa…
ogrisel Jul 27, 2020
83a6e7a
Check whether using 16 CPUs would be fast enough
ogrisel Jul 27, 2020
62d456b
Check whether using 8 CPUs would be faster
ogrisel Jul 27, 2020
8e2086d
DEBUG disable pytest-xdist to time tests and see if they still freeze…
ogrisel Jul 27, 2020
21b3267
Fix syntax error
ogrisel Jul 27, 2020
23c8bd8
Enable pytest-xdist again
ogrisel Jul 27, 2020
830686b
DEBUG test only on a submodule
ogrisel Jul 28, 2020
d090f03
Do no run pytest at all.
ogrisel Jul 28, 2020
ce7d7b9
DEBUG try to trigger the freeze as fast as possible
ogrisel Jul 28, 2020
9c60c1e
DEBUG Minimal CI reproducer
ogrisel Jul 28, 2020
0a252a9
DEBUG stop loading other CI for nothing
ogrisel Jul 28, 2020
c5785dc
DEBUG more minimal setting
ogrisel Jul 29, 2020
29b454e
Merge branch 'master' into aarch-ci
ogrisel Jul 29, 2020
8c1680d
DEBUG no test collection at all
ogrisel Jul 29, 2020
53234b5
DEBUG remove pytest plugins
ogrisel Jul 29, 2020
55c2387
DEBUG
ogrisel Jul 29, 2020
df68fc0
DEBUG try to use faulthandler
ogrisel Jul 29, 2020
c9acb34
Make the test fail
ogrisel Jul 29, 2020
8ec694c
DEBUG
ogrisel Jul 30, 2020
21ab276
DEBUG
ogrisel Jul 30, 2020
2dd3b4a
DEBUG typo
ogrisel Jul 30, 2020
b2f0063
DEBUG trying to play with exit status
ogrisel Jul 30, 2020
3c15306
DEBUG remove final exit?
ogrisel Jul 30, 2020
56763f9
DEBUG trying to play with exit status
ogrisel Jul 30, 2020
fc12804
Try travis_terminate
ogrisel Jul 30, 2020
1c85f97
DEBUG this time it will work
ogrisel Jul 30, 2020
4aa878c
DEBUG try to remove the function
ogrisel Jul 30, 2020
507ad2c
DEBUG
ogrisel Jul 30, 2020
7622fec
DEBUG restore build_tools
ogrisel Jul 30, 2020
ab6a120
Try to re-enable conftest.py
ogrisel Jul 30, 2020
a1e855b
parent conftest.py is already use for some reason
ogrisel Jul 30, 2020
19311cf
Fix XFAIL marker for GradientBoostingClassifier docstring
ogrisel Jul 30, 2020
182d1a1
debugging conftest.py XFAIL
ogrisel Jul 30, 2020
27e7124
Fix platform name in conftest.py
ogrisel Jul 30, 2020
3777bb6
Restore azure pipelines and circle CI config
ogrisel Jul 30, 2020
caab75d
Add tag to trigger arm64 build
ogrisel Jul 30, 2020
8efdd4e
More cleanup of debug stuff
ogrisel Jul 30, 2020
918db5c
Fix circle ci
ogrisel Jul 30, 2020
cbee178
Trigger [arm64]
ogrisel Jul 30, 2020
3df4c4a
Test doc in parallel on travis [scipy-dev] [arm64] [icc-build]
ogrisel Jul 30, 2020
846fa3a
Trigger [arm64] CI
ogrisel Jul 30, 2020
242527f
Update CI commit message markers
ogrisel Jul 31, 2020
20 changes: 15 additions & 5 deletions .travis.yml
@@ -23,21 +23,31 @@ matrix:
# installed from their CI wheels in a virtualenv with the Python
# interpreter provided by travis.
- python: 3.7
env: CHECK_WARNINGS="true"
env:
- CHECK_WARNINGS="true"
- CI_CPU_COUNT="3"
if: type = cron OR commit_message =~ /\[scipy-dev\]/

# As above but build scikit-learn with Intel C compiler (ICC).
- python: 3.7
env:
- CHECK_WARNING="true"
- BUILD_WITH_ICC="true"
- CI_CPU_COUNT="3"
if: type = cron OR commit_message =~ /\[icc-build\]/

- python: 3.7
env:
- CI_CPU_COUNT="8"
os: linux
arch: arm64
if: type = cron OR commit_message =~ /\[arm64\]/

install: source build_tools/travis/install.sh
script:
- bash build_tools/travis/test_script.sh
- bash build_tools/travis/test_docs.sh
- bash build_tools/travis/test_pytest_soft_dependency.sh
- bash build_tools/travis/test_script.sh || travis_terminate 1
- bash build_tools/travis/test_docs.sh || travis_terminate 1
- bash build_tools/travis/test_pytest_soft_dependency.sh || travis_terminate 1
after_success: source build_tools/travis/after_success.sh
notifications:
webhooks:
46 changes: 32 additions & 14 deletions build_tools/travis/install.sh
@@ -14,6 +14,12 @@
set -e

# Fail fast
echo "CPU Arch: ${TRAVIS_CPU_ARCH}"

# jq is used in travis_fastfail.sh, it's already pre-installed in non arm64
# environments
sudo apt-get install jq

build_tools/travis/travis_fastfail.sh

# Imports get_dep
@@ -35,28 +41,40 @@ ccache --max-size 100M --show-stats
# If Travvis has language=generic, deactivate does not exist. `|| :` will pass.
deactivate || :


# Install miniconda
fname=Miniconda3-latest-Linux-x86_64.sh
wget https://repo.continuum.io/miniconda/$fname -O miniconda.sh
if [[ "$TRAVIS_CPU_ARCH" == "arm64" ]]; then
wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-aarch64.sh -O miniconda.sh
else
fname=Miniconda3-latest-Linux-x86_64.sh
wget https://repo.continuum.io/miniconda/$fname -O miniconda.sh
fi
MINICONDA_PATH=$HOME/miniconda
chmod +x miniconda.sh && ./miniconda.sh -b -p $MINICONDA_PATH
export PATH=$MINICONDA_PATH/bin:$PATH
conda update --yes conda

# Create environment and install dependencies
conda create -n testenv --yes python=3.7

source activate testenv

pip install --upgrade pip setuptools
echo "Installing numpy and scipy master wheels"
dev_anaconda_url=https://pypi.anaconda.org/scipy-wheels-nightly/simple
pip install --pre --upgrade --timeout=60 --extra-index $dev_anaconda_url numpy scipy pandas
pip install --pre cython
echo "Installing joblib master"
pip install https://github.com/joblib/joblib/archive/master.zip
echo "Installing pillow master"
pip install https://github.com/python-pillow/Pillow/archive/master.zip
pip install $(get_dep pytest $PYTEST_VERSION) pytest-cov
if [[ "$TRAVIS_CPU_ARCH" == "amd64" ]]; then
pip install --upgrade pip setuptools
echo "Installing numpy and scipy master wheels"
dev_anaconda_url=https://pypi.anaconda.org/scipy-wheels-nightly/simple
pip install --pre --upgrade --timeout=60 --extra-index $dev_anaconda_url numpy scipy pandas
pip install --pre cython
echo "Installing joblib master"
pip install https://github.com/joblib/joblib/archive/master.zip
echo "Installing pillow master"
pip install https://github.com/python-pillow/Pillow/archive/master.zip
else
conda install -y scipy numpy pandas cython
pip install joblib threadpoolctl
fi

pip install $(get_dep pytest $PYTEST_VERSION) pytest-cov pytest-xdist

# Build scikit-learn in the install.sh script to collapse the verbose
# build output in the travis output when it succeeds.
@@ -76,11 +94,11 @@ if [[ "$BUILD_WITH_ICC" == "true" ]]; then
# The build_clib command is implicitly used to build libsvm-skl. To compile
# with a different compiler we also need to specify the compiler for this
# command.
python setup.py build_ext --compiler=intelem -i -j 3 build_clib --compiler=intelem
python setup.py build_ext --compiler=intelem -i -j "${CI_CPU_COUNT}" build_clib --compiler=intelem
else
# Use setup.py instead of `pip install -e .` to be able to pass the -j flag
# to speed-up the building multicore CI machines.
python setup.py build_ext --inplace -j 3
python setup.py build_ext --inplace -j "${CI_CPU_COUNT}"
fi

python setup.py develop
2 changes: 1 addition & 1 deletion build_tools/travis/test_docs.sh
@@ -9,4 +9,4 @@ if [[ "$BUILD_WITH_ICC" == "true" ]]; then
source /opt/intel/inteloneapi/setvars.sh
fi

make test-doc
PYTEST="pytest -n $CI_CPU_COUNT" make test-doc
17 changes: 13 additions & 4 deletions build_tools/travis/test_script.sh
@@ -18,7 +18,8 @@ try:
except ImportError:
pass
"
python -c "import multiprocessing as mp; print('%d CPUs' % mp.cpu_count())"
python -c "import joblib; print(joblib.cpu_count(), 'CPUs')"
python -c "import platform; print(platform.machine())"

if [[ "$BUILD_WITH_ICC" == "true" ]]; then
# the tools in the oneAPI toolkits are configured via environment variables
@@ -36,9 +37,17 @@ run_tests() {
cp setup.cfg $TEST_DIR
cd $TEST_DIR

# Tests that require large downloads over the networks are skipped in CI.
# Here we make sure, that they are still run on a regular basis.
export SKLEARN_SKIP_NETWORK_TESTS=0
if [[ "$TRAVIS_CPU_ARCH" == "arm64" ]]; then
# use pytest-xdist for faster tests
TEST_CMD="$TEST_CMD -n $CI_CPU_COUNT"
else
# Tests that require large downloads over the networks are skipped in CI.
# Here we make sure, that they are still run on a regular basis.
#
# Note that using pytest-xdist is currently not compatible
# with fetching datasets in tests due to datasets cache corruptions issues.
export SKLEARN_SKIP_NETWORK_TESTS=0
fi

if [[ "$COVERAGE" == "true" ]]; then
TEST_CMD="$TEST_CMD --cov sklearn"
28 changes: 19 additions & 9 deletions conftest.py
@@ -29,15 +29,25 @@ def pytest_addoption(parser):


def pytest_collection_modifyitems(config, items):

# FeatureHasher is not compatible with PyPy
if platform.python_implementation() == 'PyPy':
skip_marker = pytest.mark.skip(
reason='FeatureHasher is not compatible with PyPy')
for item in items:
if item.name.endswith(('_hash.FeatureHasher',
'text.HashingVectorizer')):
item.add_marker(skip_marker)
for item in items:
# FeatureHasher is not compatible with PyPy
if (item.name.endswith(('_hash.FeatureHasher',
'text.HashingVectorizer'))
and platform.python_implementation() == 'PyPy'):
marker = pytest.mark.skip(
reason='FeatureHasher is not compatible with PyPy')
item.add_marker(marker)
# Known failure on with GradientBoostingClassifier on ARM64
elif (item.name.endswith('GradientBoostingClassifier')
and platform.machine() == 'aarch64'):
Member Author
I'm still confused about "aarch64" vs "ARM64", since the latter can apparently also occur: https://github.com/python/cpython/blob/0c4f0f3b29d84063700217dcf90ad6860ed71c70/Lib/test/test_regrtest.py#L662

Anyway, the issue where this doctest failure was originally reported (#17797) was also on aarch64, so it's probably fine to merge as is.
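
For reference, one way to sidestep the "aarch64" vs "ARM64" question would be to compare the machine name case-insensitively against both spellings. This is only a sketch and not part of this PR: `is_arm64` and `_ARM64_NAMES` are hypothetical names, and the set of spellings is an assumption (Linux typically reports `aarch64`, while some other platforms report `arm64` or `ARM64`).

```python
import platform

# Hypothetical helper, not part of this PR. The set of spellings is an
# assumption about what platform.machine() may report for 64-bit ARM.
_ARM64_NAMES = {"aarch64", "arm64"}


def is_arm64(machine=None):
    """Return True if the machine name denotes a 64-bit ARM CPU."""
    if machine is None:
        # Query the running interpreter when no name is given.
        machine = platform.machine()
    # Lower-case first so that "ARM64" and "arm64" are treated alike.
    return machine.lower() in _ARM64_NAMES
```

With such a helper, the condition in `conftest.py` could read `is_arm64()` instead of `platform.machine() == 'aarch64'`; the merged code keeps the exact string comparison.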


marker = pytest.mark.xfail(
reason=(
'know failure. See '
'https://github.com/scikit-learn/scikit-learn/issues/17797' # noqa
)
)
item.add_marker(marker)

# Skip tests which require internet if the flag is provided
if config.getoption("--skip-network"):
3 changes: 2 additions & 1 deletion doc/developers/contributing.rst
@@ -478,9 +478,10 @@ message, the following actions are taken.
====================== ===================
Commit Message Marker Action Taken by CI
---------------------- -------------------
[scipy-dev] Add a Travis build with our dependencies (numpy, scipy, etc ...) development builds
[ci skip] CI is skipped completely
[lint skip] Azure pipeline skips linting
[scipy-dev] Add a Travis build with our dependencies (numpy, scipy, etc ...) development builds
[arm64] Add a Travis build for the ARM64 / aarch64 little endian architecture
[doc skip] Docs are not built
[doc quick] Docs built, but excludes example gallery plots
[doc build] Docs built including example gallery plots
2 changes: 1 addition & 1 deletion sklearn/cluster/tests/test_k_means.py
@@ -333,7 +333,7 @@ def test_k_means_fit_predict(algo, dtype, constructor, seed, max_iter, tol):
# using more than one thread, the absolute values of the labels can be
# different between the 2 strategies but they should correspond to the same
# clustering.
assert v_measure_score(labels_1, labels_2) == 1
assert v_measure_score(labels_1, labels_2) == pytest.approx(1, abs=1e-15)


def test_minibatch_kmeans_verbose():
6 changes: 3 additions & 3 deletions sklearn/neural_network/tests/test_mlp.py
@@ -488,7 +488,7 @@ def test_predict_proba_binary():

assert y_proba.shape == (n_samples, n_classes)
assert_array_equal(proba_max, proba_log_max)
assert_array_equal(y_log_proba, np.log(y_proba))
assert_allclose(y_log_proba, np.log(y_proba))

assert roc_auc_score(y, y_proba[:, 1]) == 1.0

@@ -511,7 +511,7 @@ def test_predict_proba_multiclass():

assert y_proba.shape == (n_samples, n_classes)
assert_array_equal(proba_max, proba_log_max)
assert_array_equal(y_log_proba, np.log(y_proba))
assert_allclose(y_log_proba, np.log(y_proba))


def test_predict_proba_multilabel():
@@ -535,7 +535,7 @@ def test_predict_proba_multilabel():

assert (y_proba.sum(1) - 1).dot(y_proba.sum(1) - 1) > 1e-10
assert_array_equal(proba_max, proba_log_max)
assert_array_equal(y_log_proba, np.log(y_proba))
assert_allclose(y_log_proba, np.log(y_proba))


def test_shuffle():
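
The test changes above replace exact comparisons (`==`, `assert_array_equal`) with tolerance-based ones (`pytest.approx`, `assert_allclose`), presumably because floating-point results can differ in the last bits from one architecture to another. The sketch below illustrates the difference with made-up values that are not taken from the scikit-learn tests; it relies only on NumPy's default tolerance for `assert_allclose` (`rtol=1e-7`).

```python
import numpy as np
from numpy.testing import assert_allclose

# Illustrative values only: 0.1 + 0.2 is not exactly 0.3 in binary
# floating point, so an exact comparison of the two arrays fails.
a = np.array([0.1 + 0.2, 1.0])
b = np.array([0.3, 1.0])

exact_match = bool(np.array_equal(a, b))

# assert_allclose tolerates the rounding difference
# (default relative tolerance rtol=1e-7) and does not raise here.
assert_allclose(a, b)

print("exact match:", exact_match)
```

The exact comparison reports a mismatch while `assert_allclose` passes, which is the behaviour the updated assertions depend on.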