Now that the `distutils` deprecation in Python 3.10 is around the corner, I've been thinking about moving build systems. This feels like the right time. The thought of doing a lot of work migrating `numpy.distutils` features to `setuptools`, basically becoming responsible for Fortran support there, and then still being stuck with such a poor build system isn't giving me warm and fuzzy feelings. So here's an alternative.
tl;dr there are only two candidates for use as a build system, Meson and CMake. Meson + `mesonpep517` has more gaps in the short term than CMake + `scikit-build`, however it's much cleaner (small code base of modern pure Python) than CMake (a ton of C++ code + a weird DSL + scikit-build seems to use legacy CMake constructs which are awful) and has much better documentation. So I'd prefer Meson.
What we need from a build system
Let's first outline everything that we need in terms of build, packaging, dev workflows, etc. And then figure out the projects that implement that.
At the highest level we need the following:
- A development build (can be in-place or out-of-place, as long as the workflow is good)
- create an `sdist`
- create packages from an `sdist` (creating packages from the git repo is optional):
  - wheels
  - conda packages
  - `.deb`, `.rpm`, Homebrew bottles, etc.
- Other tools and jobs to invoke (standalone and/or via a `runtests.py`-like interface):
  - A documentation build
  - Run tests
  - Run benchmarks
  - Measure code coverage
  - Run linters and checkers (pyflakes, mypy, autopep8, etc.)
- Interfacing with Python packaging/install tools (e.g., `pip install .` should work as expected)
The build system itself should handle (note, some of these we don't have today but can have):
- languages: C, C++, Fortran, Cython
- compiler support including more niche compilers (e.g., clang-cl, ifort, mingw-w64, xlc)
- platform support: Windows, Linux, macOS, aarch32/64, AIX, ppc64le, niche Debian architectures
- support for multiple Python implementations (at least CPython, PyPy)
- handle code generation, templating and ahead-of-time compilation with Pythran
- parallel builds
- fast builds with caching (e.g., `ccache`), incremental builds including for Cython
- cross-compilation support
- good diagnostic output in build log and afterwards (e.g., build settings ending up in `__config__.py`)
- debug builds
- coverage-enabled builds
- BLAS/LAPACK detection
- NumPy detection (for include dir)
- easy control of build flags, ideally not only via `CFLAGS` et al. but configurable per compiler
- a way of handling optional dependencies, e.g. PyFFTW, OpenMP
- a way of special-casing certain situations (e.g., MSVC + gfortran on Windows)
- CPU feature detection (e.g., SIMD flags) - note: don't need it (yet) for SciPy, but need it for NumPy
Python-specific build features that are necessary but may live outside the main build system:
- Python extension naming support (e.g., `submodulename.cpython-39-x86_64-linux-gnu.so`)
- byte-compiling
- handle vendoring of dependencies in wheels and name mangling
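To make the extension naming requirement concrete, here is a minimal sketch of how the expected file name can be queried from the running interpreter. It assumes only the standard library `sysconfig` module; the helper name `extension_filename` is made up for illustration and is not part of any build tool.

```python
import sysconfig


def extension_filename(module_name: str) -> str:
    """Return the file name this interpreter expects for an extension module.

    `extension_filename` is a hypothetical helper, not an API of any build system.
    """
    # EXT_SUFFIX carries the implementation/ABI tag and platform-specific extension,
    # e.g. ".cpython-39-x86_64-linux-gnu.so" for CPython 3.9 on 64-bit Linux.
    suffix = sysconfig.get_config_var("EXT_SUFFIX")
    if suffix is None:
        raise RuntimeError("this interpreter does not report EXT_SUFFIX")
    return module_name + suffix


if __name__ == "__main__":
    print(extension_filename("submodulename"))
```

Whatever build system we end up with has to produce this name (or obtain it from the target interpreter when cross-compiling, where asking the running interpreter gives the wrong answer).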
Moving to Meson
Advantages:
- Much faster builds. Right now we don't have parallel builds, setuptools doesn't do incremental builds, and setuptools is slow while Meson is as fast as it gets (it uses Ninja as backend). On a decent development machine we should be able to get to full rebuilds of ~1 min, and incremental rebuilds much faster than that.
- Reliability: any system will have some bugs, but extensive monkeypatching and almost zero tests make the `distutils`, `numpy.distutils` and `setuptools` combination particularly fragile.
- Support for cross-compiling. Right now we basically say "we don't know, setuptools doesn't support that - let us know if you get anywhere". It's not our own need, but there clearly is demand for it.
- Better build logs - clearer configuration and compiler/library detection info, as well as color-coded output which is easier to interpret.
- Less to maintain in the long term. If we can move SciPy - as by far the most Fortran-heavy Python library - we may just not add Fortran support to `setuptools`, which means that that headache just goes away.
- Easier to debug build issues. Meson is much better code, both architecturally and code quality-wise, than `setuptools`.
Challenges:
- It's a lot of work to move, and we may introduce new bugs in the process.
- There are missing pieces of the puzzle:
  - BLAS/LAPACK detection needs implementing in Meson.
  - Meson has Cython support and test cases, but it's fairly minimal. Better support including the caching in SciPy's `tools/cythonize.py` should be useful.
  - `mesonpep517` builds sdists and wheels, but development builds are missing. Given that PEP 517 itself does not have support for development builds, it's unclear if this should be added to `mesonpep517` or done as a separate package.
- We'll be early adopters in the scientific Python space, so we may run into unforeseen issues.
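For the Cython caching point, this is roughly the kind of behaviour I have in mind: skip regenerating C code for sources whose content hasn't changed. The sketch below is not the code from `tools/cythonize.py`; the function names and the JSON cache file are assumptions made for illustration.

```python
import hashlib
import json
from pathlib import Path


def load_cache(cache_file) -> dict:
    """Load the {source path: sha256} mapping, or start empty if there is none."""
    path = Path(cache_file)
    return json.loads(path.read_text()) if path.exists() else {}


def save_cache(cache: dict, cache_file) -> None:
    """Persist the hash mapping so the next build can consult it."""
    Path(cache_file).write_text(json.dumps(cache, indent=2, sort_keys=True))


def content_hash(src) -> str:
    """Hash the file content, so timestamps alone never trigger a rebuild."""
    return hashlib.sha256(Path(src).read_bytes()).hexdigest()


def needs_rebuild(src, cache: dict) -> bool:
    """True if the source is new or its content differs from the cached hash."""
    return cache.get(str(src)) != content_hash(src)


def mark_built(src, cache: dict) -> None:
    """Record the current hash; call this only after the source compiled successfully."""
    cache[str(src)] = content_hash(src)
```

A driver would loop over the `.pyx` files, run Cython only where `needs_rebuild` returns true, then call `mark_built` and `save_cache`. A real implementation would also need to account for dependencies such as `.pxd` and `.pxi` files, which this sketch ignores.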
A potential plan:
1. Work in a fork, and start with sdist to wheel on one platform for a single SciPy submodule (delete other submodules one commit per submodule, so they're easy to add back).
2. After 3-4 submodules we should have covered all important cases (Fortran, Cython, templating, codegen), so do only 3-4.
3. Meson improvements for BLAS and Cython are probably needed for step (2) above, so make those when needed.
4. Implement a development build - improve `runtests.py` as needed, and/or add to `mesonpep517` or a new package.
5. Next add other platforms, first Windows, Linux and macOS. Then ping some packagers for, e.g., Debian and AIX to help check.
6. If that all works well, add back all other submodules.
7. Merge into SciPy
8. Switch over a few CI jobs
9. Update the `scipy-wheels` repo and test wheels for all platforms
10. Switch over all other CI jobs
11. Delete the main `setup.py`, leave other `setup.py` files in place for a while (unused, just in case)
12. Do a release
13. Delete all remnants of `distutils`-using code and declare victory
This will be a significant amount of work, which is why I added a GSoC project idea for it: https://github.com/scipy/scipy/wiki/GSoC-2021-project-ideas. A good student should get quite far in ~5 weeks of work.
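As an illustration of what the development build step in the plan could look like from a `runtests.py`-like script, here is a rough sketch that only assembles the Meson command lines for an out-of-place build. The `meson setup`, `meson compile` and `meson install` subcommands exist in Meson; the function name, the directory names and the overall workflow are assumptions, not a settled design.

```python
import shlex
from pathlib import Path


def dev_build_commands(build_dir="build", install_prefix="installdir"):
    """Return the command lines for a configure/compile/install cycle.

    Hypothetical helper: the directory names are placeholders.
    """
    # Meson requires --prefix to be an absolute path.
    prefix = str(Path(install_prefix).resolve())
    return [
        ["meson", "setup", build_dir, "--prefix", prefix],  # configure once
        ["meson", "compile", "-C", build_dir],              # parallel, incremental build via Ninja
        ["meson", "install", "-C", build_dir],              # copy built artifacts under the prefix
    ]


if __name__ == "__main__":
    # Print the commands instead of running them; a real script would
    # pass each list to subprocess.run(cmd, check=True).
    for cmd in dev_build_commands():
        print(shlex.join(cmd))
```

Tests would then run against the installed tree (e.g. by putting the install prefix's site-packages directory on `PYTHONPATH`), which is one way to get an out-of-place development workflow without relying on PEP 517.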