For anyone managing existing Python projects—individually
or within a company. Get to the forefront of Python’s
evolving ecosystem by using modern solutions and tools
that turbocharge productivity and ensure code quality.
Claudio Jolowicz
Hypermodern Python Tooling
by Claudio Jolowicz
See https://oreilly.com/catalog/errata.csp?isbn=9781098139582
for release details.
The views expressed in this work are those of the author, and
do not represent the publisher’s views. While the publisher and
the author have used good faith efforts to ensure that the
information and instructions contained in this work are
accurate, the publisher and the author disclaim all
responsibility for errors or omissions, including without
limitation responsibility for damages resulting from the use of
or reliance on this work. Use of the information and
instructions contained in this work is at your own risk. If any
code samples or other technology this work contains or
describes is subject to open source licenses or the intellectual
property rights of others, it is your responsibility to ensure that
your use thereof complies with such licenses and/or rights.
978-1-098-13958-2
[LSI]
Dedication
To Marianna
Preface
You don’t strictly need these tools to write Python software. Fire
up your system’s Python interpreter and get an interactive
prompt. Save your Python code as a script for later. Why use
anything beyond an editor and a shell?
This book will show you how developer tooling can help with
such challenges. The tools described here greatly benefit the
code quality, security, and maintainability of Python projects.
Laziness has been called “a programmer’s greatest virtue,”¹ and
this saying applies to development tooling, too: keep your
workflow simple, and don’t adopt tools for their own sake. At
the same time, good programmers are also curious. Give the
tools in this book a try to see what value they may bring to your
projects.
Conventions Used in This Book
Italic
Indicates new terms, URLs, email addresses, filenames, and file extensions.
Constant width
Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, data types, and environment variables.
TIP
This element signifies a tip or suggestion.
NOTE
This element signifies a general note.
WARNING
This element indicates a warning or caution.
Using Code Examples
This book is here to help you get your job done. In general, if
example code is offered with this book, you may use it in your
programs and documentation. You do not need to contact us for
permission unless you’re reproducing a significant portion of
the code. For example, writing a program that uses several
chunks of code from this book does not require permission.
Selling or distributing examples from O’Reilly books does
require permission. Answering a question by citing this book
and quoting example code does not require permission.
Incorporating a significant amount of example code from this
book into your product’s documentation does require
permission.
If you feel your use of code examples falls outside fair use or
the permission given above, feel free to contact us at
permissions@oreilly.com.
O’Reilly Online Learning
NOTE
For more than 40 years, O’Reilly Media has provided technology and business
training, knowledge, and insight to help companies succeed.
How to Contact Us
Please address comments and questions concerning this book to
the publisher:
O’Reilly Media, Inc.
1005 Gravenstein Highway North
Sebastopol, CA 95472
800-889-8969 (in the United States or Canada)
707-829-0104 (fax)
support@oreilly.com
https://oreilly.com/about/contact.html
For news and information about our books and courses, visit
https://oreilly.com.
Acknowledgments
This book covers many open source Python projects. I am very
grateful to their authors and maintainers, most of whom work
on them in their free time, often over many years. In particular,
I would like to acknowledge the unsung heroes of the PyPA,
whose work on packaging standards lets the ecosystem evolve
toward better tooling. Special thanks to Thea Flowers for
writing Nox and building a welcoming community.
1
Larry Wall, Programming Perl (Sebastopol: O’Reilly, 1991).
2
The title of this book is inspired by Die hypermoderne Schachpartie (The
hypermodern chess game), written by Savielly Tartakower in 1924. It surveys the
revolution that was taking place in chess theory during his time.
Part I. Working with Python
Chapter 1. Installing Python
In this first chapter, I’ll show you how to install multiple Python
versions on some of the major operating systems in a
sustainable way—and how to keep your little snake farm in
good shape.
TIP
Even if you develop for only a single platform, I’d encourage you to learn about
working with Python on other operating systems. It’s fun—and familiarity with other
platforms enables you to provide a better experience to your software’s contributors
and users.
For these reasons, it’s common to support both current and past
versions of Python until their official end-of-life date and to set
up installations for them side by side on a developer machine.
With new feature versions coming out every year and support
extending over five years, this gives you a testing matrix of five
actively supported versions (see Figure 1-1). If that sounds like a
lot of work, don’t worry: the Python ecosystem comes with
tooling that makes this a breeze.
THE PYTHON RELEASE CYCLE
Feature versions are maintained for five years, after which they
reach end-of-life. Bugfix releases for a feature version occur
roughly every other month during the first 18 months after its
initial release.³ This is followed by security updates whenever
necessary during the remainder of the five-year support period.
Each maintenance release bumps the micro version.
export PATH="/usr/local/opt/python/bin:$PATH"
The previous line also works with Zsh, which is the default shell
on macOS. That said, there’s a more idiomatic way to
manipulate the search path on Zsh:
typeset -U path
path=(/usr/local/opt/python/bin $path)
fish_add_path /usr/local/opt/python/bin
Zsh: .zshrc
fish: .config/fish/config.fish
TIP
Unless your system already comes with a well-curated and up-to-date selection of
interpreters, prepend Python installations to the PATH environment variable, with
the latest stable version at the very front.
A SHORT HISTORY OF PATH
Depending on your domain and target environment, you may prefer to use the
Windows Subsystem for Linux (WSL) for Python development. In this case, please
refer to the section “Installing Python on Linux” instead.
> py
Python 3.12.2 (tags/v3.12.2:6abddd9, Feb 6 2024, 21:
Type "help", "copyright", "credits" or "license" for
>>>
> py -3.11
Python 3.11.8 (tags/v3.11.8:db85d51, Feb 6 2024, 22:
Type "help", "copyright", "credits" or "license" for
>>>
> py -V
Python 3.12.2
> py -3.11 -V
Python 3.11.8
NOTE
For historical reasons, py also inspects the first line of the script to see if a version is
specified there. The canonical form is #!/usr/bin/env python3 , which corresponds
to py -3 and works across all major platforms.
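For example, given a script with that shebang line (a minimal sketch, saved as hello.py, say):

#!/usr/bin/env python3
import sys
print("Hello from", sys.version)

Running py hello.py dispatches to the latest Python 3 on Windows, just as ./hello.py does on Unix-like systems.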
Restart the console for the setting to take effect. Don’t forget to
remove these variables once you upgrade from the prerelease
to the final release.
> py --list
-V:3.13 Python 3.13 (64-bit)
-V:3.12 * Python 3.12 (64-bit)
-V:3.11 Python 3.11 (64-bit)
-V:3.10 Python 3.10 (64-bit)
-V:3.9 Python 3.9 (64-bit)
-V:3.8 Python 3.8 (64-bit)
TIP
Even if you always use the Python Launcher yourself, you should still keep your
PATH up-to-date. Some third-party tools run the python.exe command directly—
you don’t want them to use an outdated Python version or fall back to the Microsoft
Store shim.
Homebrew Python
NOTE
Whenever you see names like python3.x or python@3.x in this section, replace
3.x with the actual feature version. For example, use python3.12 and
python@3.12 for Python 3.12.
You may find that you already have some Python versions
installed for other Homebrew packages that depend on them.
Nonetheless, it’s important that you install every version
explicitly. Automatically installed packages may get deleted
when you run brew autoremove to clean up resources.
Homebrew places a python3.x command for each version on
your PATH , as well as a python3 command for its main Python
package—which may be either the current or the previous
stable release. You should override this to ensure python3
points to the latest version. First, query the package manager
for the installation root (which is platform-dependent):
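On Apple silicon, that query and its result look like this (the version is illustrative):

$ brew --prefix python@3.12
/opt/homebrew/opt/python@3.12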
export PATH="/opt/homebrew/opt/python@3.12/bin:$PATH"
After installing a Python version, run the Install Certificates command located in the
/Applications/Python 3.x/ folder. This command installs Mozilla’s curated collection of
root certificates, which are required to establish secure internet connections from
Python.
/Library/Frameworks/Python.framework/Versions/3.x/
/Applications/Python 3.x/
FRAMEWORK BUILDS ON MACOS
Fedora Linux
Ubuntu Linux
You can now install Python versions using the APT package
manager:
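For example, to install Python 3.12 (assuming your Ubuntu release or the deadsnakes PPA provides it):

$ sudo apt install python3.12-full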
TIP
Always remember to include the -full suffix when installing Python on Debian and
Ubuntu. The python3.x-full packages pull in the entire standard library and up-to-
date root certificates. In particular, they ensure that you can create virtual
environments.
Other Linux Distributions
You can install the Nix package manager using its official
installer. If you’re not ready to install Nix permanently, you can
get a taste of what’s possible using the Docker image for NixOS,
a Linux distribution built entirely using Nix:
$ docker run --rm -it nixos/nix
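Inside the container, you can bring a Python version into scope in an ephemeral shell, along these lines (the attribute name follows current nixpkgs conventions):

$ nix-shell -p python312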
$ py -V
3.12.1
$ py -3.11 -V
3.11.7
$ py --list
3.13 │ /Library/Frameworks/Python.framework/Versions
3.12 │ /opt/homebrew/bin/python3.12
3.11 │ /opt/homebrew/bin/python3.11
3.10 │ /opt/homebrew/bin/python3.10
WARNING
For compatibility with the Windows version, the Python Launcher uses only the
Python version from shebangs, not the full interpreter path. As a result, you can end
up with a different interpreter than if you were to invoke the script directly without
py .
NOTE
In this section, you’ll use pyenv as a build tool. If you’re interested in using pyenv as
a version manager, please refer to the official documentation for additional setup
steps. I’ll discuss some of the trade-offs in “Managing Python Versions with pyenv”.
You can build and install any of these versions by passing them
to pyenv install :
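For example (the version number is illustrative):

$ pyenv install 3.12.2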
export PATH="$HOME/.pyenv/versions/3.x.y/bin:$PATH"
$ export PYTHON_CONFIGURE_OPTS='--enable-optimizations'
Before you can use this Python installation, you need to activate
the environment:
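For example, assuming an environment named myenv:

$ conda activate myenv

When you’re done working in the environment, deactivate it: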
$ conda deactivate
An Overview of Installers
Figure 1-3 provides an overview of the main Python installation
methods for Windows, macOS, and Linux.
Summary
In this chapter, you’ve learned how to manage Python
installations on Windows, macOS, and Linux. Use the Python
Launcher to select interpreters installed on your system.
Additionally, audit your search path to ensure you have well-
defined python and python3 commands.
The next chapter zooms into a Python installation: its contents
and structure, and how your code interacts with it. You’ll also
learn about its lightweight cousins, virtual environments, and
the tooling that has evolved around those.
1
While CPython is the reference implementation of Python, there are quite a few
more to choose from: performance-oriented forks such as PyPy and Cinder,
reimplementations such as RustPython and MicroPython, and ports to other
platforms like WebAssembly, Java, and .NET.
2
At the time of writing in early 2024, the long-term support release of Debian Linux
ships patched versions of Python 2.7.16 and 3.7.3—both released half a decade ago.
(Debian’s “testing” distribution, which is widely used for development, comes with a
current version of Python.)
3
Starting with Python 3.13, bugfix releases are provided for two years after the initial
release.
4
Stack Overflow has a good step-by-step guide to building Windows installers.
5
“Virtual Environments” covers virtual environments in detail. For now, you can
think of a virtual environment as a shallow copy of a full Python installation that lets
you install a separate set of third-party packages.
6
Justin Mayer, “Homebrew Python Is Not For You”, February 3, 2021.
7
Do you have a Mac with Apple silicon, but programs that must run on Intel
processors? You’ll be pleased to know that the python.org installers also provide a
python3-intel64 binary using the x86_64 instruction set. You can run it on Apple
silicon thanks to Apple’s Rosetta translation environment.
8
The Unix command-line tools option places symbolic links in the /usr/local/bin
directory, which can conflict with Homebrew packages and other versions from
python.org. A symbolic link (symlink) is a special kind of file that points to another
file, much like a shortcut in Windows.
9
For historical reasons, framework builds use a different path for the per-user site
directory, the location where packages are installed if you invoke pip outside of a
virtual environment and without administrative privileges. This different installation
layout can prevent you from importing a previously installed package.
10
In a future release, Hatch will add interpreters to the Windows registry as well,
letting you use them with the Python Launcher.
Chapter 2. Python
Environments
NOTE
This book uses Python environment as an umbrella term that includes both system-
wide installations and virtual environments. Beware that some people only use the
term for project-specific environments, like virtual environments or Conda
environments.
Figure 2-1. Python environments consist of an interpreter and modules. Virtual
environments share the interpreter and the standard library with their parent
environment.
NOTE
This chapter uses the Python Launcher to invoke the interpreter (see “The Python
Launcher for Windows” and “The Python Launcher for Unix”). If you don’t have it
installed, replace py with python3 when running the examples.
$ py hello.py
$ py -m hello
$ hello
This method is convenient, but there’s also a drawback: if
you’ve installed the program in multiple environments, the first
environment on PATH “wins.” In such a scenario, the form py
-m hello offers you more control.
For this reason, the canonical way to install a package with pip
uses the -m form:
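For example:

$ py -m pip install httpx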
Windows (single-user): %LocalAppData%\Programs\Python\Python3x
Windows (multi-user): %ProgramFiles%\Python3x
macOS (Homebrew): /opt/homebrew/Frameworks/Python.framework/Versions/3.xᵃ
macOS (python.org): /Library/Frameworks/Python.framework/Versions/3.x
Linux (generic): /usr/local
Linux (package manager): /usr
ᵃ Homebrew on macOS Intel uses /usr/local instead of /opt/homebrew.
The interpreter
sys.version_info
The version of the interpreter, as a tuple of its components
sys.implementation.name
sys.implementation.version
The name and version of the Python implementation, such as CPython or PyPy
sys.executable
The path of the interpreter executable
sys.prefix
The directory of the Python installation or environment
sys.path
The list of directories searched when importing Python
modules
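A quick way to inspect these values in any interpreter session:

import sys

print(sys.version_info)
print(sys.implementation.name, sys.implementation.version)
print(sys.executable)
print(sys.prefix)
print(sys.path)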
Python modules
Modules are containers of Python objects that you load via the
import statement. They’re organized below Lib (Windows) or
lib/python3.x (Linux and macOS) with some platform-dependent
variations. Third-party packages go into a subdirectory named
site-packages.
Simple modules
Single files with Python source code, such as string.py
Packages
Directories with an __init__.py file, containing other modules
Namespace packages
Directories without an __init__.py file; they can span multiple locations
Extension modules
Dynamic libraries with native code, such as the _socket module
Built-in modules
Modules compiled into the interpreter, such as sys
Frozen modules
Modules whose bytecode is embedded in the interpreter, such as parts of importlib
NOTE
The term package carries some ambiguity in the Python world. It refers both to
modules and to the artifacts used for distributing modules, also known as
distributions. Unless stated otherwise, this book uses package as a synonym for
distribution.
Bytecode is an intermediate representation of Python code that
is platform-independent and optimized for fast execution. The
interpreter compiles pure Python modules to bytecode when it
loads them for the first time. Bytecode modules are cached in
the environment in .pyc files under __pycache__ directories.
INSPECTING MODULES AND PACKAGES WITH IMPORTLIB
You can find out where a module comes from using importlib
from the standard library. Every module has an associated
ModuleSpec object whose origin attribute contains the
location of the source file or dynamic library for the module, or
a fixed string like "built-in" or "frozen" . The cached
attribute stores the location of the bytecode for a pure Python
module. Example 2-1 shows the origin of each module in the
standard library.
import importlib.util
import sys
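# A sketch of the example's body: print the origin of every module in the
# standard library (sys.stdlib_module_names requires Python 3.10+).
for name in sorted(sys.stdlib_module_names):
    if spec := importlib.util.find_spec(name):
        print(f"{name:30} {spec.origin}")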
# List the installed distributions and their versions:
import importlib.metadata

distributions = importlib.metadata.distributions()
for distribution in sorted(distributions, key=lambda d: d.name):
    print(f"{distribution.name:30} {distribution.version}")
Entry-point scripts
This mechanism has two key benefits. First, you can launch the
application in a shell by running a simple command—say,
pydoc3 for Python’s built-in documentation browser.³ Second,
entry-point scripts use the interpreter and modules from their
environment, sparing you surprises about wrong Python
versions or missing third-party packages.
#!/usr/local/bin/python3.12
import pydoc

if __name__ == "__main__":
    pydoc.cli()
NOTE
The #! line is known as a shebang on Unix-like operating systems. When you run the
script, the program loader uses the line to locate and launch the interpreter. The
program loader is the part of the operating system that loads a program into main
memory.
Other components
Shared libraries
Header files
Python installations contain header files for the Python/C
API, an application programming interface for writing
extension modules or embedding Python as a component
in a larger application. They’re located under Include
(Windows) or include/python3.x (Linux and macOS).
Static data
Tcl/Tk
Linux: ~/.local/lib/python3.x/site-packages (site packages) and ~/.local/bin (scripts)ᵃ
ᵃ Fedora places extension modules under lib64.
TIP
The per-user script directory may not be on PATH by default. If you install
applications into the per-user environment, remember to edit your shell profile to
update the search path. Pip issues a friendly reminder when it detects this situation.
Virtual Environments
ᵃ Fedora places third-party extension modules under lib64 instead of lib.
Installing packages
$ py -m venv .venv
$ py -m pip install httpx
$ py
You can also create a virtual environment without pip using the
option --without-pip and install packages with an external
installer. If you have pip installed globally, you can pass the
target environment using its --python option, like this:
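$ pip --python=.venv install httpx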
Activation scripts
You can provide a custom prompt using the option --prompt when creating the
environment. The special value . designates the current directory; it’s particularly
useful when you’re inside a project repository.
$ source .venv/bin/activate
> .venv\Scripts\activate
$ deactivate
NOTE
The name pyvenv.cfg is a remnant of the pyvenv script that used to ship with Python.
The py -m venv form makes it clearer which interpreter you use to create the
virtual environment—and thus which interpreter the environment itself will use.
pipx in a Nutshell
$ mkdir -p ~/.local/bin
$ export PATH="$HOME/.local/bin:$PATH"
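Next, you create a virtual environment for the application and install it there; with Black as the example, that might look like this:

$ py -m venv black
$ black/bin/python -m pip install black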
Finally, you copy the entry-point script into the directory you
created in the first step—that would be a script named black
in the bin directory of the environment:
$ cp black/bin/black ~/.local/bin
$ black --version
black, 24.2.0 (compiled: yes)
Python (CPython) 3.12.2
On top of this simple idea, the pipx project has built a cross-
platform package manager for Python applications with a great
developer experience.
TIP
$ pipx ensurepath
$ pipx completions
With pipx installed on your system, you can use it to install and
manage applications from the Python Package Index (PyPI). For
example, here’s how you would install Black with pipx:
$ pipx upgrade-all
$ pipx reinstall-all
$ pipx uninstall-all
$ pipx list
TIP
Use pipx run <app> as the default method to install and run developer tools from
PyPI. Use pipx install <app> if you need more control over application
environments; for example, if you need to install plugins. (Replace <app> with the
name of the app.)
Configuring pipx
By default, pipx installs applications on the same Python
version that it runs on itself. This may not be the latest stable
version, particularly if you installed pipx using a system
package manager like APT. I recommend setting the
environment variable PIPX_DEFAULT_PYTHON to the latest
stable Python if that’s the case. Many developer tools you run
with pipx create their own virtual environments; for example,
virtualenv, Nox, tox, Poetry, and Hatch all do. It’s worthwhile to
ensure that all downstream environments use a recent Python
version by default:
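$ export PIPX_DEFAULT_PYTHON=python3.12

(Replace python3.12 with the latest stable version on your system.)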
You can use pip config to set the URL of your preferred
package index persistently:
$ pip config set global.index-url https://example.com
Alternatively, you can set the package index for the current
shell session only. Most pip options are also available as
environment variables:
$ export PIP_INDEX_URL=https://example.com
$ uv venv
NOTE
Despite its name, uv venv emulates the Python tool virtualenv, not the built-in venv
module. Virtualenv creates environments with any Python interpreter on your
system. It combines interpreter discovery with aggressive caching to make this fast
and flawless.
This section takes a deep dive into the other mechanism that
links programs to an environment: module import, which is the
process of locating and loading Python modules for a program.
TIP
In a nutshell, just like the shell searches PATH for executables, Python searches
sys.path for modules. This variable holds a list of locations from where Python can
load modules—most commonly, directories on the local filesystem.
Module Objects
When you import a module, the import system returns a
module object, an object of type types.ModuleType . Any global
variable defined by the imported module becomes an attribute
of the module object. This allows you to access the module
variable in dotted notation ( module.var ) from the importing
code.
When the import system executes a module, it uses the module object’s __dict__ as the namespace for the module’s code, roughly like this:

exec(code, module.__dict__)
Most commonly, the __path__ attribute of a package contains a single entry: the
directory holding its __init__.py file. Namespace packages, on the other hand, can be
distributed across multiple directories.
When you first import a module, the import system stores the
module object in the sys.modules dictionary, using its fully
qualified name as a key. Subsequent imports return the module
object directly from sys.modules . This mechanism brings a
number of benefits:
Performance
Modules are loaded only once; later imports are fast dictionary lookups.
Idempotency
Importing a module repeatedly returns the same object, so its initialization code runs only once.
Recursion
Circular imports don’t recurse endlessly, because the partially initialized module is already registered in sys.modules.
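You can observe the caching directly, for example:

import sys
import json

assert sys.modules["json"] is json

import json as json2  # served from the sys.modules cache
assert json2 is json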
Module Specs
The module spec is the link between those two steps. A module
spec contains metadata about a module such as its name and
location, as well as an appropriate loader for the module
(Table 2-5). You can also access most of the metadata from the
module spec using special attributes directly on the module
object.
The import system finds and loads modules using two kinds of
objects. Finders ( importlib.abc.MetaPathFinder ) are
responsible for locating modules given their fully qualified
names. When successful, their find_spec method returns a
module spec with a loader; otherwise, it returns None . Loaders
( importlib.abc.Loader ) are objects with an exec_module
function that load and execute the module’s code. The function
takes a module object and uses it as a namespace when
executing the module. The finder and loader can be the same
object, which is then known as an importer.
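For example, you can retrieve the spec for a module without importing it:

import importlib.util

spec = importlib.util.find_spec("json")
print(spec.name, spec.origin, spec.loader)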
NOTE
You can find the built-in logic for constructing sys.path in Modules/getpath.py in the
CPython source code. Despite appearances, this is not an ordinary module. When you
build Python, its code is frozen to bytecode and embedded in the executable.
Site Packages
The site module adds the following path entries if they exist
on the filesystem:
Site packages
The directories for third-party packages, both system-wide and per user
.pth files
Any additional directories listed in .pth files inside the site packages directories
$ py -m site
sys.path = [
    '/home/user',
    '/usr/local/lib/python312.zip',
    '/usr/local/lib/python3.12',
    '/usr/local/lib/python3.12/lib-dynload',
    '/home/user/.local/lib/python3.12/site-packages',
    '/usr/local/lib/python3.12/site-packages',
]
USER_BASE: '/home/user/.local' (exists)
USER_SITE: '/home/user/.local/lib/python3.12/site-packages' (exists)
ENABLE_USER_SITE: True
If you’ve read this far, the module path may seem a little—
byzantine?
$ py -IPm site
sys.path = [
    '/usr/local/lib/python312.zip',
    '/usr/local/lib/python3.12',
    '/usr/local/lib/python3.12/lib-dynload',
    '/usr/local/lib/python3.12/site-packages',
]
USER_BASE: '/home/user/.local' (exists)
USER_SITE: '/home/user/.local/lib/python3.12/site-packages' (exists)
ENABLE_USER_SITE: False
The current directory no longer appears on the module path,
and the per-user site packages are gone, too—even though the
directory exists on this system.
Summary
In this chapter, you’ve learned what Python environments are,
where to find them, and how they look on the inside. At the
core, a Python environment consists of the Python interpreter
and Python modules, as well as entry-point scripts to run
Python applications. Environments are tied to a specific version
of the Python language.
1
There’s also a pythonw.exe executable that runs programs without a console
window, like GUI applications.
2
A shared library is a file with executable code that multiple programs can use at
runtime. The operating system keeps only a single copy of the code in memory.
3
Windows installations don’t include an entry-point script for pydoc —launch it
using py -m pydoc instead.
4
Historically, macOS framework builds pioneered per-user installation before it
became a standard in 2008.
5
You could force the use of symbolic links on Windows via the --symlinks option—
but don’t. There are subtle differences in the way these work on Windows. For
example, the File Explorer resolves the symbolic link before it launches Python,
which prevents the interpreter from detecting the virtual environment.
6
Before Python 3.12, the venv module also pre-installed setuptools for the benefit
of legacy packages that don’t declare it as a build dependency.
7
Internally, pip queries the sysconfig module for an appropriate installation
scheme—a Python environment layout. This module constructs the installation
scheme using the build configuration of Python and the location of the interpreter in
the filesystem.
8
At the time of writing in 2024, pipx caches temporary environments for 14 days.
9
For modules located within a package, the __path__ attribute of the package takes
the place of sys.path .
Part II. Python Projects
Chapter 3. Python Packages
NOTE
Python folks use the word package for two distinct concepts. Import packages are
modules that contain other modules. Distribution packages are archive files for
distributing Python software—they’re the subject of this chapter.
In this chapter, I’ll explain how you can package your Python
projects and introduce you to tools that help with packaging
tasks. The chapter has three parts:
In the first part, I’ll talk about the life of a Python package. I’ll
also introduce an example application that you’ll use
throughout this book. And I’ll ask: why would you want to
package your code at all?
In the second part, I’ll introduce Python’s package
configuration file, pyproject.toml, and tools for working with
packages: build , hatchling , and Twine. The tools pip, uv,
and pipx also make a reappearance. Finally, I’ll introduce
Rye, a project manager that ties these packaging tools
together into a unified workflow. Along the way, you’ll learn
about build frontends and backends, wheels and sdists,
editable installs, and the src layout.
In the third part, I’ll look at project metadata in detail—the
various fields you can specify in pyproject.toml to define and
describe your package, and how to make efficient use of
them.
Figure 3-1. The package lifecycle: an author builds a project into a package and
uploads it to a package index, and then a user downloads and installs the package
into an environment.
Everything starts with a project: the source code of an
application, library, or other piece of software.
import json
import textwrap
import urllib.request

API_URL = "https://en.wikipedia.org/api/rest_v1/page/random/summary"


def main():
    with urllib.request.urlopen(API_URL) as response:
        data = json.load(response)

    print(data["title"], end="\n\n")
    print(textwrap.fill(data["extract"]))


if __name__ == "__main__":
    main()
The API_URL constant points to the REST API of the
English Wikipedia—or more specifically, its
/page/random/summary endpoint.
> py -m random_wikipedia_article
Jägersbleeker Teich
Why Packaging?
Sharing a script like Example 3-1 doesn’t require packaging. You
can publish it on a blog or a hosted repository, or send it to
friends by email or chat. Python’s ubiquity, the “batteries
included” approach of its standard library, and its nature as an
interpreted language make this possible.
Binary extensions
Modules written in languages like C must be compiled for each platform. Packages let you distribute prebuilt binaries, so users don’t need a build toolchain.
Metadata
Packages record their name, version, and dependencies, allowing installers to locate and resolve them reliably.
[project]
name = "random-wikipedia-article"
version = "0.1"

[project.scripts]
random-wikipedia-article = "random_wikipedia_article:main"

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
TIP
PyPI projects share a single namespace—their names aren’t scoped by the users or
organizations owning the projects. Choose a unique name such as random-
wikipedia-article-{your-name} , and rename the Python module accordingly.
[project]
The project metadata, such as the name and version
[build-system]
The build backend for the project and its requirements
[tool]
Configuration for individual tools, in subtables like [tool.hatch.version]
Lists are termed arrays in TOML and use the same notation as
Python, as in the keywords field here:

[project]
name = "foo"
version = "0.1"
keywords = ["foo", "bar"]
You can load a TOML file using the standard tomllib module:
import tomllib
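# A sketch of loading the file; tomllib.load expects a binary file object.
with open("pyproject.toml", mode="rb") as io:
    data = tomllib.load(io)

# The result is a plain dictionary like the following: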
{
"project": {
"name": "random-wikipedia-article",
"version": "0.1",
"scripts": {
"random-wikipedia-article": "random_wikipedia
}
},
"build-system": {
"requires": ["hatchling"],
"build-backend": "hatchling.build"
}
}
NOTE
A build frontend is an application that orchestrates the build process for a Python
package. Build frontends don’t know how to assemble packaging artifacts from
source trees. The tool that does the actual building is known as the build backend.
Open a terminal, change to the project directory, and invoke
build with pipx:
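$ pipx run build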
Figure 3-2 shows how the build frontend and the build backend
collaborate to build a package.
Figure 3-2. Build frontend and build backend
$ py -m venv buildenv
$ buildenv/bin/python -m pip install hatchling
$ buildenv/bin/python
>>> import hatchling.build as backend
>>> backend.get_requires_for_build_wheel()
[] # no additional build dependencies requested
>>> backend.build_wheel("dist")
'random_wikipedia_article-0.1-py2.py3-none-any.whl'
NOTE
Some build frontends let you build in your current environment. If you disable build
isolation, the frontend only checks for build dependencies. If it installed them, the
build and runtime dependencies of different packages might conflict.
Why separate the build frontend from the build backend? It
means that tools can trigger package builds without knowing
the intricacies of the build process. For example, package
installers like pip and uv build packages on the fly when you
install from a source directory (see “Installing Projects from
Source”).
See the official documentation of each tool for any recommended version bounds.
View at:
https://test.pypi.org/project/random-wikipedia-article/
$ random-wikipedia-article
You could build a wheel with build and install it into a virtual
environment:
$ uv venv
$ uv pip install .
$ pipx install .
installed package random-wikipedia-article 0.1, ins
These apps are now globally available
- random-wikipedia-article
Once you’ve installed your package in this way, you won’t need
to reinstall it to see changes to the source code—only when you
edit pyproject.toml to change the project metadata or add a
third-party dependency.
Project Layout
Dropping a pyproject.toml next to a single-file module is an
appealingly simple approach. Unfortunately, this project layout
comes with a serious footgun, as you’ll see in this section. Let’s
start by breaking something in the project:
def main():
    raise Exception("Boom!")
$ py -m random_wikipedia_article
Cystiscus viaderi
main()
The answer has to do with the nature and history of the Python
project: Python is a decentralized open source project driven by
a community of thousands of volunteers, with a history
spanning more than three decades of organic growth. This
makes it hard for a single packaging tool to cater to all demands
and become firmly established.³
Your first step with Rye is initializing a new project with rye
init . If you don’t pass the project name, Rye uses the name of
the current directory. Use the --script option to include an
entry-point script:
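$ rye init --script random-wikipedia-article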
random-wikipedia-article
├── .git
├── .gitignore
├── .python-version
├── .venv
├── README.md
├── pyproject.toml
└── src
└── random_wikipedia_article
├── __init__.py
└── __main__.py
$ rye build
$ rye publish -r testpypi --repository-url https://test.pypi.org/legacy/
$ rye sync
random_wikipedia_article-0.1.tar.gz
random_wikipedia_article-0.1-py2.py3-none-any.whl
These artifacts are known as wheels and sdists. Wheels are ZIP
archives with a .whl extension, and they’re built distributions—
for the most part, installers extract them into the environment
as-is. Sdists, by contrast, are source distributions: they’re
compressed archives of the source code with packaging
metadata. Sdists require an additional build step to produce an
installable wheel.
TIP
The name “wheel” for a Python package is a reference to wheels of cheese. PyPI was
originally known as the Cheese Shop, after the Monty Python sketch about a cheese
shop with no cheese whatsoever. (These days, PyPI serves over a petabyte of
packages per day.)
The distinction between source distributions and built
distributions may seem strange for an interpreted language like
Python. But you can also write Python modules in a compiled
language. In this situation, source distributions provide a useful
fallback when no prebuilt wheels are available for a platform.
Python tag
The Python implementation and version the wheel targets, such as cp36 for CPython 3.6
ABI tag
The binary interface the wheel requires, such as abi3 for the stable ABI
Platform tag
The operating system and processor architecture, such as manylinux_2_28_x86_64
cryptography-38.0.4-cp36-abi3-manylinux_2_28_x86_64.whl
NOTE
The core metadata standards predate pyproject.toml by many years. Most project
metadata fields correspond to a core metadata field, but their names and syntax
differ slightly. As a package author, you can safely ignore this translation and focus
on the project metadata.
Project Metadata
Build backends write out core metadata fields based on what
you specify in the project table of pyproject.toml. Table 3-3
provides an overview of all the fields you can use in the
project table.
Table 3-3. The project table
[project]
name = "random-wikipedia-article"
version = "0.1"
description = "Display extracts from random Wikipedia articles"
keywords = ["wikipedia"]
readme = "README.md"  # only if your project has a README.md file
license = { text = "MIT" }
authors = [{ name = "Your Name", email = "you@example.com" }]
classifiers = ["Topic :: Games/Entertainment :: Fortune Cookies"]
urls = { Homepage = "https://yourname.dev/projects/random-wikipedia-article" }
requires-python = ">=3.8"
dependencies = ["httpx>=0.27.0", "rich>=13.7.1"]
[project]
name = "random-wikipedia-article"
Your users specify this name to install the project with pip. This
field also determines your project’s URL on PyPI. You can use
any ASCII letter or digit to name your project, interspersed with
periods, underscores, and hyphens. Packaging tools normalize
project names for comparison: all letters are converted to
lowercase, and punctuation runs are replaced by a single
hyphen (or underscore, in the case of package filenames). For
example, Awesome.Package , awesome_package , and awesome-
package all refer to the same project.
Project names are distinct from import names, the names users
specify to import your code. Import names must be valid
Python identifiers, so they can’t have hyphens or periods and
can’t start with a digit. They’re case-sensitive and can contain
any Unicode letter or digit. As a rule of thumb, you should have
a single import package per distribution package and use the
same name for both (or a straightforward translation, like
random-wikipedia-article and random_wikipedia_article ).
Versioning Projects
[project]
version = "0.1"
But sometimes it’s useful to let the build backend fill in a field
dynamically. For example, “Single-Sourcing the Project Version”
shows how you can derive the package version from a Python
module or Git tag instead of duplicating it in pyproject.toml.
[project]
dynamic = ["version", "readme"]
SINGLE-SOURCING THE PROJECT VERSION
__version__ = "0.1"
[project]
name = "random-wikipedia-article"
dynamic = ["version"]
[tool.hatch.version]
path = "src/random_wikipedia_article/__init__.py"
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
Mark the version field as dynamic.
from importlib.metadata import version
__version__ = version("random-wikipedia-article")
But don’t go and add this boilerplate to all your projects yet.
Reading the metadata from the environment isn’t something
you want to do during program startup. Third-party libraries
like click perform the metadata lookup on demand—for
example, when the user specifies a command-line option like
--version . You can read the version on demand by providing a
__getattr__ function for your module (Example 3-7).⁵
Example 3-7. Reading the version from the installed metadata on demand
from importlib.metadata import version


def __getattr__(name):
    if name != "__version__":
        msg = f"module {__name__} has no attribute {name}"
        raise AttributeError(msg)
    return version("random-wikipedia-article")
Example 3-8. Deriving the project version from the version control system
[project]
name = "random-wikipedia-article"
dynamic = ["version"]
[tool.hatch.version]
source = "vcs"
[build-system]
requires = ["hatchling", "hatch-vcs"]
build-backend = "hatchling.build"
Entry-Point Scripts
[project.scripts]
random-wikipedia-article = "random_wikipedia_article:main"
$ random-wikipedia-article
[project.gui-scripts]
random-wikipedia-article-gui = "random_wikipedia_article:main"
Entry Points
[project.entry-points.some_application]
my-plugin = "my_plugin"
[project.entry-points.some_application]
my-plugin = "my_plugin.submodule:plugin"
[project]
dependencies = ["pytest"]
[project.entry-points.pytest11]
random-wikipedia-article = "random_wikipedia_article"
import json
import urllib.request

import pytest

API_URL = "https://en.wikipedia.org/api/rest_v1/page/random/summary"


@pytest.fixture
def random_wikipedia_article():
    with urllib.request.urlopen(API_URL) as response:
        return json.load(response)

# test_wikipedia_viewer.py
def test_wikipedia_viewer(random_wikipedia_article):
    print(random_wikipedia_article["extract"])
    assert False
$ uv pip install .
$ py -m pytest
============================= test session starts ===
platform darwin -- Python 3.12.2, pytest-8.1.1, plugg
rootdir: ...
plugins: random-wikipedia-article-0.1
collected 1 item
test_wikipedia_viewer.py F
____________________________ test_wikipedia_viewer __
def test_wikipedia_viewer(random_wikipedia_articl
print(random_wikipedia_article["extract"])
> assert False
E assert False
test_wikipedia_viewer.py:4: AssertionError
----------------------------- Captured stdout call --
Halgerda stricklandi is a species of sea slug, a dori
marine gastropod mollusk in the family Discodorididae
=========================== short test summary info =================
FAILED test_wikipedia_viewer.py::test_wikipedia_viewer - assert False
============================== 1 failed in 1.10s ====================
[project]
authors = [{ name = "Your Name", email = "you@example.com" }]
maintainers = [
    { name = "Alice", email = "alice@example.com" },
    { name = "Bob", email = "bob@example.com" },
]
[project]
description = "Display extracts from random Wikipedia articles"
[project]
readme = "README.md"
Instead of a string, you can also specify a table with file and
content-type keys:
[project]
readme = { file = "README", content-type = "text/plain" }
[project.readme]
content-type = "text/markdown"
text = """
# random-wikipedia-article
"""
[project]
keywords = ["wikipedia"]
[project]
classifiers = [
    "Development Status :: 3 - Alpha",
    "Environment :: Console",
]
[project.urls]
Homepage = "https://yourname.dev/projects/random-wikipedia-article"
Source = "https://github.com/yourname/random-wikipedia-article"
Issues = "https://github.com/yourname/random-wikipedia-article/issues"
Documentation = "https://readthedocs.io/random-wikipedia-article"
The License
[project]
license = { text = "MIT" }
classifiers = ["License :: OSI Approved :: MIT License"]
I recommend using the text key with a SPDX license identifier
such as "MIT" or "Apache-2.0". The Software Package Data
Exchange (SPDX) is an open standard backed by the Linux
Foundation for communicating software bill of materials
information, including licenses.
NOTE
[project]
license = { text = "proprietary" }
classifiers = [
"License :: Other/Proprietary License",
"Private :: Do Not Upload",
]
The Required Python Version
[project]
requires-python = ">=3.8"
Tools like Nox and tox make it easy to run checks across
multiple Python versions, helping you ensure that the field
reflects reality. As a baseline, I recommend requiring the oldest
Python version that still receives security updates. You can find
the end-of-life dates for all current and past Python versions on
the Python Developer Guide.
WARNING
Don’t specify an upper bound for the required Python version unless you know that
your package is not compatible with any higher version. Upper bounds cause
disruption in the ecosystem when a new version is released.
Summary
Packaging allows you to publish releases of your Python
projects, using source distributions (sdists) and built
distributions (wheels). These artifacts contain your Python
modules, together with project metadata, in an archive format
that end users can easily install into their environments. The
standard pyproject.toml file defines the build system for a
Python project as well as the project metadata. Build frontends
like build , pip, and uv use the build system information to
install and run the build backend in an isolated environment.
The build backend assembles an sdist and wheel from the
source tree and embeds the project metadata. You can upload
packages to the Python Package Index (PyPI) or a private
repository, using a tool like Twine. The Python project manager
Rye provides a more integrated workflow on top of these tools.
1
Even the venerable Comprehensive Perl Archive Network (CPAN) didn’t exist in
February 1991, when Guido van Rossum published the first release of Python on
Usenet.
2
By default, the build tool builds the wheel from the sdist instead of the source tree
to ensure that the sdist is valid. Build backends can request additional build
dependencies using the get_requires_for_build_wheel and
get_requires_for_build_sdist build hooks.
3
Python’s packaging ecosystem is also a great demonstration of Conway’s law. In
1967, Melvin Conway—an American computer scientist also known for developing
the concept of coroutines—observed that organizations will design systems that are
copies of their communication structure.
4
This is especially true given the existence of typosquatting—where an attacker
uploads a malicious package whose name is similar to a popular package—and
dependency confusion attacks—where a malicious package on a public server uses
the same name as a package on a private company repository.
5
This nifty technique comes courtesy of my reviewer Hynek Schlawack.
6
In case you’re wondering, the +g6b80314 suffix is a local version identifier that
designates downstream changes, in this case using output from the command git
describe .
7
Test fixtures set up objects that you need to run repeatable tests against your code.
8
You can also add Trove classifiers for each supported Python version. Some
backends backfill classifiers for you—Poetry does this out of the box for Python
versions and project licenses.
Chapter 4. Dependency
Management
Example 4-1 shows how you can use httpx to send a request to
the Wikipedia API with the header. You could also use the
standard library to send a User-Agent header with your
requests. But httpx offers a more intuitive, explicit, and
flexible interface, even when you’re not using any of its
advanced features.
import textwrap

import httpx

API_URL = "https://en.wikipedia.org/api/rest_v1/page/random/summary"
USER_AGENT = "random-wikipedia-article/0.1 (Contact: you@example.com)"


def main():
    headers = {"User-Agent": USER_AGENT}
    with httpx.Client(headers=headers, follow_redirects=True) as client:
        data = client.get(API_URL).json()

    print(data["title"], end="\n\n")
    print(textwrap.fill(data["extract"]))
This line performs two HTTP GET requests to the API. The
first one goes to the random endpoint, which responds
with a redirect to the actual article. The second one
follows the redirect.
While you’re at it, let’s improve the look and feel of the
program. Example 4-2 uses Rich, a library for console output, to
display the article title in bold. That hardly scrapes the surface
of Rich’s formatting options. Modern terminals are surprisingly
capable, and Rich lets you leverage their potential with ease.
Take a look at its official documentation for details.
from rich.console import Console


def main():
    ...
    console = Console(width=72, highlight=False)
    console.print(data["title"], style="bold", end="\n\n")
    console.print(data["extract"])
The style keyword allows you to set the title apart using
a bold font.
[project]
name = "random-wikipedia-article"
version = "0.1"
dependencies = ["httpx", "rich"]
...
[project]
dependencies = ["httpx>=0.27.0", "rich>=13.7.1"]
[project]
dependencies = ["awesome>=1.2,!=1.3.1"]
[project]
dependencies = ["awesome>=1.2,<2"]
WARNING
Excluding versions after the fact has a pitfall that you need to be aware of.
Dependency resolvers can decide to downgrade your project to a version without the
exclusion and upgrade the dependency anyway. Lock files can help with this.
UPPER VERSION BOUNDS IN PYTHON
Extras
def main():
    headers = {"User-Agent": USER_AGENT}
    with httpx.Client(headers=headers, http2=True) as client:
        ...
[project]
dependencies = ["httpx[http2]>=0.27.0", "rich>=13.7.1
Optional dependencies
Let’s look at this situation from the point of view of httpx . The
h2 and brotli dependencies are optional, so httpx declares
them under optional-dependencies instead of dependencies
(Example 4-3).
[project]
name = "httpx"
[project.optional-dependencies]
http2 = ["h2>=3,<5"]
brotli = ["brotli"]
try:
    import h2
except ImportError:
    h2 = None
Environment Markers
from importlib.metadata import metadata


def build_user_agent():
    fields = metadata("random-wikipedia-article")
    return USER_AGENT.format_map(fields)
def main():
    headers = {"User-Agent": build_user_agent()}
    ...
Environment marker | Description | Standard library | Examplesᵃ
ᵃ The python_version and implementation_version markers apply
transformations. See PEP 508 for details.
[project]
requires-python = ">=3.7"
dependencies = [
    "httpx[http2]>=0.24.1",
    "rich>=13.7.1",
    "importlib-metadata>=6.7.0; python_version < '3.8'",
]
[project]
dependencies = ["""
    awesome-package; python_full_version <= '3.8.1'
      and (implementation_name == 'cpython' or implementation_name == 'pypy')
      and sys_platform == 'darwin'
      and 'arm' in platform_version
"""]
Development Dependencies
Development dependencies are third-party packages that you
require during development. As a developer, you might use the
pytest testing framework to run the test suite for your project,
the Sphinx documentation system to build its docs, or a number
of other tools to help with project maintenance. Your users, on
the other hand, don’t need to install any of these packages to
run your code.
from random_wikipedia_article import build_user_agent

def test_build_user_agent():
    assert "random-wikipedia-article" in build_user_agent()
Let’s run the test with pytest. I’m assuming you already have an
active virtual environment with an editable install of your
project. Enter the commands below to install and run pytest in
that environment:
$ uv pip install pytest
$ py -m pytest
========================= test session starts =======
platform darwin -- Python 3.12.2, pytest-8.1.1, plugg
rootdir: ...
plugins: anyio-4.3.0
collected 1 item
tests/test_random_wikipedia_article.py .
For now, things look great. Tests help your project evolve
without breaking things. The test for build_user_agent is a
first step in that direction. Installing and running pytest is a
small infrastructure cost compared to these long-term benefits.
Optional Dependencies
[project.optional-dependencies]
tests = ["pytest>=7.4.4", "pytest-sugar>=1.0.0"]
docs = ["sphinx>=5.3.0"]
You can now install the test dependencies using the tests
extra:
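$ uv pip install -e ".[tests]"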
You can also define a dev extra with all the development
dependencies. This lets you set up a development environment
in one go, with your project and every tool it uses:
$ uv pip install -e ".[dev]"
[project.optional-dependencies]
tests = ["pytest>=7.4.4", "pytest-sugar>=1.0.0"]
docs = ["sphinx>=5.3.0"]
dev = ["random-wikipedia-article[tests,docs]"]
Requirements Files
pytest>=7.4.4
pytest-sugar>=1.0.0
sphinx>=5.3.0
# requirements/docs.txt
sphinx>=5.3.0
# requirements/dev.txt
-r tests.txt
-r docs.txt
If you include other requirements files using -r , their paths are evaluated relative
to the including file. By contrast, paths to dependencies are evaluated relative to your
current directory, which is typically the project directory.
Locking Dependencies
You’ve installed your dependencies in a local environment or in
continuous integration (CI), and you’ve run your test suite and
any other checks you have in place. Everything looks good, and
you’re ready to deploy your code. But how do you install the
same packages in production that you used when you ran your
checks?
WARNING
Supply chain attacks infiltrate a system by targeting its third-party dependencies. For
example, in 2022, a threat actor dubbed “JuiceLedger” uploaded malicious packages
to legitimate PyPI projects after compromising them with a phishing campaign.⁶
[project]
dependencies = ["httpx[http2]==0.27.0", "rich==13.7.1
$ uv pip install .
$ uv pip freeze
anyio==4.3.0
certifi==2024.2.2
h11==0.14.0
h2==4.1.0
hpack==4.0.0
httpcore==1.0.4
httpx==0.27.0
hyperframe==6.0.1
idna==3.6
markdown-it-py==3.0.0
mdurl==0.1.2
pygments==2.17.2
random-wikipedia-article @ file:///Users/user/random-
rich==13.7.1
sniffio==1.3.1
httpx==0.27.0 \
--hash=sha256:71d5465162c13681bff01ad59b2cc68dd838ea1
--hash=sha256:a0cb88a46f32dc874e04ee956e4c2764aba2aa2
Hashes also have the side effect that pip refuses to install
packages without them: either all packages have hashes, or
none do. As a consequence, hashes protect you from installing
files that aren’t listed in the requirements file.
$ py -m venv .venv
$ pipx run --spec=pip-tools pip-sync --python-executable=.venv/bin/python
The command removes the project itself since it’s not listed in
the requirements file. Reinstall it after synchronizing:
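$ .venv/bin/python -m pip install --no-deps -e .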
# requirements/tests.in
pytest>=7.4.4
pytest-sugar>=1.0.0
# requirements/docs.in
sphinx>=5.3.0
# requirements/dev.in
-r tests.in
-r docs.in
Summary
In this chapter, you’ve learned how to declare project
dependencies using pyproject.toml and how to declare
development dependencies using either extras or requirements
files. You’ve also learned how to lock dependencies for reliable
deployments and reproducible checks using pip-tools and uv. In
the next chapter, you’ll see how the project manager Poetry
helps with dependency management using dependency groups
and lock files.
1
In a wider sense, the dependencies of a project consist of all software packages that
users require to run its code—including the interpreter, the standard library, third-
party packages, and system libraries. Conda and distro package managers like APT,
DNF, and Homebrew support this generalized notion of dependencies.
2
Henry Schreiner, “Should You Use Upper Bound Version Constraints?”, December 9,
2021.
3
For simplicity, the code doesn’t handle multiple authors—which one ends up in the
header is undefined.
4
Robert Collins, “PEP 508 – Dependency specification for Python Software Packages”,
November 11, 2015.
5
Stephen Rosen, “PEP 735 – Dependency Groups in pyproject.toml”, November 20,
2023.
6
Dan Goodin, “Actors Behind PyPI Supply Chain Attack Have Been Active Since Late
2021”, September 2, 2022.
7
Natalie Weizenbaum, “PubGrub: Next-Generation Version Solving”, April 2, 2018.
8
Brett Cannon, “Lock Files, Again (But This Time w/ Sdists!)”, February 22, 2024.
9
Uninstalling the package isn’t enough: the installation can have side effects on your
dependency tree. For example, it may upgrade or downgrade other packages or pull
in additional dependencies.
Chapter 5. Managing
Projects with Poetry
Installing Poetry
Install Poetry globally using pipx to keep its dependencies
isolated from the rest of the system:
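$ pipx install --python=python3.12 poetry

(The version is illustrative; use the latest stable Python on your system.)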
You can omit the --python option if pipx already uses the new
Python version (see “Configuring pipx”).
NOTE
Poetry also comes with an official installer, which you can download and run with
Python. It’s not as flexible as pipx, but it provides a readily available alternative:
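$ curl -sSL https://install.python-poetry.org | python3 -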
$ poetry
Creating a Project
You can create a new project using the command poetry new .
As an example, I’ll use the random-wikipedia-article project
from previous chapters. Run the following command in the
parent directory where you want to keep your new project:
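$ poetry new --src random-wikipedia-article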
random-wikipedia-article
├── README.md
├── pyproject.toml
├── src
│ └── random_wikipedia_article
│ └── __init__.py
└── tests
└── __init__.py
[tool.poetry]
name = "random-wikipedia-article"
version = "0.1.0"
description = ""
authors = ["Your Name <you@example.com>"]
readme = "README.md"
packages = [{include = "random_wikipedia_article", from = "src"}]

[tool.poetry.dependencies]
python = "^3.12"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
[tool.poetry]
name = "random-wikipedia-article"
version = "0.1.0"
description = "Display extracts from random Wikipedia articles"
keywords = ["wikipedia"]
license = "MIT"
classifiers = [
    "License :: OSI Approved :: MIT License",
    "Development Status :: 3 - Alpha",
    "Environment :: Console",
    "Topic :: Games/Entertainment :: Fortune Cookies",
]
authors = ["Your Name <you@example.com>"]
readme = "README.md"
homepage = "https://yourname.dev/projects/random-wikipedia-article"
repository = "https://github.com/yourname/random-wikipedia-article"
documentation = "https://readthedocs.io/random-wikipedia-article"
packages = [{include = "random_wikipedia_article", from = "src"}]

[tool.poetry.dependencies]
python = ">=3.10"

[tool.poetry.urls]
Issues = "https://github.com/yourname/random-wikipedia-article/issues"

[tool.poetry.scripts]
random-wikipedia-article = "random_wikipedia_article:main"
The license field is a string with a SPDX identifier, not a
table.
The readme field is a string with the file path. You can
also specify multiple files as an array of strings, such as
README.md and CHANGELOG.md. Poetry concatenates
them with a blank line in between.
$ poetry check
All set!
The include and exclude fields allow you to list other files to
include in, or exclude from, the distribution. Poetry seeds the
exclude field using the .gitignore file, if present. By default,
Poetry includes these additional files in source distributions
only. Instead of a string, you can use a table with path and
format keys to specify the distribution formats that should
include the files. Example 5-3 shows how to include the test
suite in source distributions.
Copy the contents of Example 5-4 into the __init__.py file in the
new project.
from importlib.metadata import metadata

import httpx
from rich.console import Console

API_URL = "https://en.wikipedia.org/api/rest_v1/page/random/summary"
USER_AGENT = "{Name}/{Version} (Contact: {Author-email})"


def main():
    fields = metadata("random-wikipedia-article")
    headers = {"User-Agent": USER_AGENT.format_map(fields)}
    ...
Managing Dependencies
Let’s add the dependencies for random-wikipedia-article ,
starting with Rich, the console output library:
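$ poetry add rich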
[tool.poetry.dependencies]
python = ">=3.10"
rich = "^13.7.1"
Caret Constraints
rich = "^13.7.1"
rich = ">=13.7.1,<14"
[tool.poetry.dependencies]
python = ">=3.10"
rich = ">=13.7.1"
httpx = {version = ">=0.27.0", extras = ["http2"]}
[tool.poetry.dependencies]
awesome = {version = ">=1", markers = "implementation_name == 'pypy'"}
[[package]]
name = "rich"
version = "13.7.1"
python-versions = ">=3.7.0"
dependencies = {markdown-it-py = ">=2.2.0", pygments
files = [
{file = "rich-13.7.1-py3-none-any.whl", hash = "s
{file = "rich-13.7.1.tar.gz", hash = "sha256:9be3
]
$ poetry show
markdown-it-py 3.0.0 Python port of markdown-it. Mar
mdurl 0.1.2 Markdown URL utilities
pygments 2.17.2 Pygments is a syntax highlighti
rich 13.7.1 Render rich text, tables, progr
Updating Dependencies
You can update all dependencies in the lock file to their latest
versions using a single command:
$ poetry update
Managing Environments
Poetry’s add , update , and remove commands don’t just
update dependencies in the pyproject.toml and poetry.lock files.
They also synchronize the project environment with the lock
file by installing, updating, or removing packages. Poetry
creates the virtual environment for the project on demand.
Before you use the environment, you should install the project.
Poetry performs editable installs, so the environment reflects
any code changes immediately:
$ poetry install
$ poetry shell
(random-wikipedia-article-py3.12) $ random-wikipedia-article
(random-wikipedia-article-py3.12) $ exit
You can also run the application in your current shell session,
using the command poetry run :
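$ poetry run random-wikipedia-article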
TIP
Just type py to get a Python session for your Poetry project on Linux and macOS.
This requires the Python Launcher for Unix, and you must configure Poetry to use in-
project environments.
Dependency Groups
Poetry allows you to declare development dependencies,
organized in dependency groups. Dependency groups aren’t
part of the project metadata and are invisible to end users. Let’s
add the dependency groups from “Development Dependencies”:
[tool.poetry.group.tests.dependencies]
pytest = "^8.1.1"
pytest-sugar = "^1.0.0"
[tool.poetry.group.docs.dependencies]
sphinx = "^7.2.6"
WARNING
Don’t specify the --optional flag when you add a dependency group with poetry
add —it doesn’t mark the group as optional. The option designates optional
dependencies that are behind an extra; it has no valid use in the context of
dependency groups.
NOTE
If you’re following along in this section, please don’t upload the example project to
PyPI. Use the TestPyPI repository instead—it’s a playground for testing, learning, and
experimentation.
$ poetry build
Building random-wikipedia-article (0.1.0)
- Building sdist
- Built random_wikipedia_article-0.1.0.tar.gz
- Building wheel
- Built random_wikipedia_article-0.1.0-py3-none-any.whl
$ poetry publish
$ export POETRY_REPOSITORIES_<REPO>_URL=<url>
$ export POETRY_PYPI_TOKEN_<REPO>=<token>
$ export POETRY_HTTP_BASIC_<REPO>_USERNAME=<username>
$ export POETRY_HTTP_BASIC_<REPO>_PASSWORD=<password>
Specify the source when adding packages from supplemental sources. Otherwise,
Poetry searches all sources when looking up a package. An attacker could upload a
malicious package to PyPI with the same name as your internal package (dependency
confusion attack).
If the plugin affects the build stage of your project, add it to the
build dependencies in pyproject.toml, as well. See “The Dynamic
Versioning Plugin” for an example.
poetry-plugin-export
poetry-plugin-bundle
poetry-dynamic-versioning
Distribute the requirements file to the target system and use pip
to install the dependencies (typically followed by installing a
wheel of your project):
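A sketch, assuming you exported the locked dependencies to
requirements.txt with the export plugin:

$ pip install -r requirements.txt
$ pip install random_wikipedia_article-0.1.0-py3-none-any.whl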
The bundle plugin allows you to deploy your project and locked
dependencies to a virtual environment of your choosing. It
creates the environment, installs the dependencies from the
lock file, then builds and installs a wheel of your project.
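For example, to bundle the project into a virtual environment in a
directory named app:

$ poetry bundle venv app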
$ app/bin/random-wikipedia-article
In Example 5-7, the first stage installs Poetry and the bundle
plugin, copies the Poetry project, and bundles it into a self-
contained virtual environment. The second stage copies the
virtual environment into a minimal Python image.
FROM gcr.io/distroless/python3-debian12
COPY --from=builder /venv /venv
ENTRYPOINT ["/venv/bin/random-wikipedia-article"]
If you have Docker installed, you can try this out. First, create a
Dockerfile in your project with the contents from Example 5-7.
Next, build and run the Docker image:
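A sketch (the image tag is illustrative):

$ docker build -t random-wikipedia-article .
$ docker run --rm random-wikipedia-article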
Install the plugin with pipx and enable it for your project:
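If you installed Poetry with pipx, inject the plugin into Poetry’s
environment:

$ pipx inject poetry poetry-dynamic-versioning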
[tool.poetry-dynamic-versioning]
enable = true
Build frontends like pip and build need the plugin when they
build your project. For this reason, enabling the plugin also
adds it to the build dependencies in pyproject.toml. The plugin
brings its own build backend, which wraps the one provided by
Poetry:
[build-system]
requires = ["poetry-core>=1.0.0", "poetry-dynamic-versioning"]
build-backend = "poetry_dynamic_versioning.backend"
Poetry still requires the version field in its own section. Set the
field to "0.0.0" to indicate that it’s unused:
[tool.poetry]
version = "0.0.0"
You can now add a Git tag to set your project version:
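$ git tag v1.0.0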
[tool.poetry-dynamic-versioning]
enable = true
substitution.folders = [{path = "src"}]
import argparse

__version__ = "0.0.0"

def main():
    parser = argparse.ArgumentParser(prog="random-wikipedia-article")
    parser.add_argument(
        "--version", action="version", version=f"%(prog)s {__version__}"
    )
    parser.parse_args()
    ...
$ uv venv
$ uv pip install .
$ py -m random_wikipedia_article --version
random-wikipedia-article 1.0.0.post1.dev0+51c266e
1
Sébastien Eustace, “Support for PEP 621”, November 6, 2020.
2
The command also keeps your lock file and project environment up-to-date. If you
edit the constraint in pyproject.toml, you’ll need to do this yourself. Read on to learn
more about lock files and environments.
3
Apart from Poetry’s own poetry.lock and the closely related PDM lock file format,
there’s pipenv’s Pipfile.lock and the conda-lock format for Conda environments.
4
Replace bin with Scripts if you’re on Windows.
Part III. Testing and Static Analysis
Chapter 6. Testing with
pytest
If you think back to when you wrote your first programs, you
may recall a common experience: you had an idea for how a
program could help with a real-life task and spent a sizable
amount of time coding it from top to bottom, only to be
confronted with screens full of disheartening error messages
when you finally ran it. Or, worse, it gave you results that were
subtly wrong.
There are a few lessons we’ve all learned from experiences like
this. One is to start simple and keep it simple as you iterate on
the program. Another lesson is to test early and repeatedly.
Initially, this may just mean to run the program manually and
validate that it does what it should. Later on, if you break the
program into smaller parts, you can test those parts in isolation
and automatically. As a side effect, the program gets easier to
read and work on, too.
In this chapter, I’ll talk about how testing can help you produce
value early and consistently. Good tests amount to an
executable specification of the code you own. They set you free
from institutional knowledge in a team or company, and they
speed up your development by giving you immediate feedback
on changes.
NOTE
Pytest originated in the PyPy project, a Python interpreter written in Python. Early
on, the PyPy developers worked on a separate standard library called std , later
renamed to py . Its testing module py.test became an independent project under
the name pytest .
Writing a Test
Example 6-1 revisits the Wikipedia example from Chapter 3.
The program is as simple as it gets—yet it’s far from obvious
how you’d write tests for it. The main function has no inputs
and no outputs—only side effects, such as writing to the
standard output stream. How would you test a function like
this?
def main():
    with urllib.request.urlopen(API_URL) as response:
        data = json.load(response)
    print(data["title"], end="\n\n")
    print(textwrap.fill(data["extract"]))
import subprocess
import sys

def test_output():
    args = [sys.executable, "-m", "random_wikipedia_article"]
    process = subprocess.run(args, capture_output=True, check=True)
    assert process.stdout
TIP
Tests written using pytest are functions whose names start with test . Use the built-
in assert statement to check for expected behavior. Pytest rewrites the language
construct to provide rich error reporting in case of a test failure.
random-wikipedia-article
├── pyproject.toml
├── src
│ └── random_wikipedia_article
│ ├── __init__.py
│ └── __main__.py
└── tests
├── __init__.py
└── test_main.py
[project.optional-dependencies]
tests = ["pytest>=8.1.1"]
$ py -m pytest
========================= test session starts =======
platform darwin -- Python 3.12.2, pytest-8.1.1, plugg
rootdir: ...
collected 1 item
tests/test_main.py .
========================== 1 passed in 0.01s ========
TIP
Use py -m pytest even in Poetry projects. It’s both shorter and safer than poetry
run pytest . If you forget to install pytest into the environment, Poetry falls back to
your global environment. (The safe variant would be poetry run python -m
pytest .)
NOTE
The term monkey patch for replacing code at runtime originated at Zope Corporation.
Initially, people at Zope called the technique “guerilla patching,” since it didn’t abide
by the usual rules of patch submission. People heard that as “gorilla patch”—and the
more refined versions soon became known as “monkey patches.”
import json
import sys
import urllib.request
from dataclasses import dataclass

@dataclass
class Article:
    title: str = ""
    summary: str = ""

def fetch(url):
    with urllib.request.urlopen(url) as response:
        data = json.load(response)
    return Article(data["title"], data["extract"])
import io
from random_wikipedia_article import Article, show

def test_final_newline():
    article = Article("Lorem Ipsum", "Lorem ipsum dolor sit amet.")
    file = io.StringIO()
    show(article, file)
    assert file.getvalue().endswith("\n")
@pytest.fixture
def file():
    return io.StringIO()

def test_final_newline(file):
    article = Article("Lorem Ipsum", "Lorem ipsum dolor sit amet.")
    show(article, file)
    assert file.getvalue().endswith("\n")
WARNING
If you forget to add the parameter file to the test function, you get a confusing
error: 'function' object has no attribute 'write'. This happens because the
name file now refers to the fixture function in the same module.
If every test used the same article, you’d likely miss some edge
cases. For example, you don’t want your program to crash if an
article comes with an empty title. Example 6-6 runs the test for
a number of articles with the @pytest.mark.parametrize
decorator.
articles = [
    Article(),
    Article("test"),
    Article("Lorem Ipsum", "Lorem ipsum dolor sit amet."),
    Article(
        "Lorem ipsum dolor sit amet, consectetur adipiscing elit.",
        "Nulla mattis volutpat sapien, at dapibus ipsum.",
    ),
]
@pytest.mark.parametrize("article", articles)
def test_final_newline(article, file):
    show(article, file)
    assert file.getvalue().endswith("\n")
If you parameterize many tests in the same way, you can create
a parameterized fixture, a fixture with multiple values
(Example 6-7). As before, pytest runs the test once for each
article in articles .
@pytest.fixture(params=articles)
def article(request):
    return request.param

def parametrized_fixture(*params):
    return pytest.fixture(params=params)(lambda request: request.param)
Use the helper to simplify the fixture from Example 6-7. You can
also inline the articles variable from Example 6-6:
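A sketch of the result:

article = parametrized_fixture(
    Article(),
    Article("test"),
    Article("Lorem Ipsum", "Lorem ipsum dolor sit amet."),
)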
import unittest

class TestShow(unittest.TestCase):
    def setUp(self):
        self.article = Article("Lorem Ipsum", "Lorem ipsum dolor sit amet.")
        self.file = io.StringIO()

    def test_final_newline(self):
        show(self.article, self.file)
        self.assertEqual("\n", self.file.getvalue()[-1])
$ py -m unittest
.
-----------------------------------------------------
Ran 1 test in 0.000s
OK
TIP
If you have a test suite written with unittest , there’s no need to rewrite it to start
using pytest—pytest “speaks” unittest , too. Use pytest as a test runner right away
and you can rewrite your test suite incrementally later.
def test_fetch(article):
    with serve(article) as url:
        assert article == fetch(url)

@contextmanager
def serve(article):
    ...  # start the server
    yield f"http://localhost:{server.server_port}"
    ...  # shut down the server
import http.server
import json
import threading

@contextmanager
def serve(article):
    data = {"title": article.title, "extract": article.summary}
    body = json.dumps(data).encode()

    class Handler(http.server.BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
@pytest.fixture(scope="session")
def httpserver():
    ...
That looks more promising, but how do you shut down the
server when the tests are done with it? Up to now, your fixtures
have only prepared a test object and returned it. You can’t run
code after a return statement. However, you can run code
after a yield statement—so pytest allows you to define a
fixture as a generator.
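A sketch of such a generator fixture, assuming the Handler class
from before and a background thread for the server (names are
illustrative):

@pytest.fixture(scope="session")
def httpserver():
    # Set up: start the server on a random free port in a thread.
    server = http.server.HTTPServer(("localhost", 0), Handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    yield server
    # Tear down: everything after the yield runs when tests are done.
    server.shutdown()
    thread.join()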
@pytest.fixture(scope="session")
def httpserver():
    class Handler(http.server.BaseHTTPRequestHandler):
        def do_GET(self):
            article = self.server.article
            data = {"title": article.title, "extract": article.summary}
            body = json.dumps(data).encode()
            ...  # as before

@pytest.fixture
def serve(httpserver):
    def f(article):
        httpserver.article = article
        return f"http://localhost:{httpserver.server_port}"
    return f
import httpx

def fetch(url):
    fields = metadata("random-wikipedia-article")
    headers = {"User-Agent": USER_AGENT.format_map(fields)}
    ...
@pytest.fixture
def serve(httpserver):
    def f(article):
        json = {"title": article.title, "extract": article.summary}
        httpserver.expect_request("/").respond_with_json(json)
        return httpserver.url_for("/")
    return f
$ py -m pytest -n auto
from factory import Factory, Faker

class ArticleFactory(Factory):
    class Meta:
        model = Article

    title = Faker("sentence")
    summary = Faker("paragraph")

article = parametrized_fixture(*ArticleFactory.build_batch(10))
Other Plugins
Summary
In this chapter, you’ve learned how to test your Python projects
with pytest:
Tests are functions that exercise your code and check for
expected behavior using the assert built-in. Prefix their
names—and the names of the containing modules—with
test_ , and pytest will discover them automatically.
Fixtures are functions or generators that set up and tear
down test objects; declare them with the @pytest.fixture
decorator. You can use a fixture in a test by including a
parameter named like the fixture.
Plugins for pytest can provide useful fixtures, as well as
modify test execution, enhance reporting, and much more.
If you want to know all about how to test with pytest, read
Brian Okken’s book Python Testing with pytest (Pragmatic Bookshelf).
1
Large packages can have modules with the same name—say, gizmo.foo.registry
and gizmo.bar.registry . Under pytest’s default import mode, test modules must
have unique fully qualified names—so you must place the test_registry modules
in separate tests.foo and tests.bar packages.
2
Remember to add Rich to your project as described in “Specifying Dependencies for
a Project”. If you use Poetry, refer to “Managing Dependencies”.
3
My reviewer Hynek recommends a technique to avoid this pitfall and get an
idiomatic NameError instead. The trick is to name the fixture explicitly with
@pytest.fixture(name="file") . This lets you use a private name for the function,
such as _file , that doesn’t collide with the parameter.
4
Note the somewhat uncommon spelling variant parametrize instead of
parameterize.
5
Remember to add a dependency on httpx[http2] to your project.
6
The cookiecutter-pytest-plugin template gives you a solid project structure for
writing your own plugin.
7
Test double is the umbrella term for the various kinds of objects tests use in lieu of
the real objects used in production code. A good overview is “Mocks Aren’t Stubs” by
Martin Fowler, January 2, 2007.
Chapter 7. Measuring
Coverage with Coverage.py
How confident in a code change are you when your tests pass?
The specificity of your tests is the probability that they will pass
if the code is free of defects. If your tests are flaky (they fail
intermittently) or brittle (they fail when you change
implementation details), then you have low specificity.
Invariably, people stop paying attention to failing tests. This
chapter isn’t about specificity, though.
In short, coverage tools record each line in your code when you
run it. After completion, they report the overall percentage of
executed lines with respect to the entire codebase.
tests/test_main.py .....................
Using Coverage.py
Coverage.py is a mature and widely used code coverage tool for
Python. Created over two decades ago—predating PyPI and
setuptools—and actively maintained ever since, it has
measured coverage on every interpreter since Python 2.1.
Add coverage[toml] to your test dependencies (see “Managing
Test Dependencies”). The toml extra allows Coverage.py to
read its configuration from pyproject.toml on older interpreters.
Since Python 3.11, the standard library includes the tomllib
module for parsing TOML files.
[tool.coverage.run]
source = ["random_wikipedia_article", "tests"]
TIP
Measuring code coverage for your test suite may seem strange—but you should
always do it. It alerts you when tests don’t run and helps you identify unreachable
code within them. Treat your tests the same way you would treat any other code.
[tool.coverage.report]
show_missing = true
Run coverage report to show the coverage report in the
terminal:
$ py -m coverage report
Name Stmts Mi
-----------------------------------------------------
src/random_wikipedia_article/__init__.py 26
src/random_wikipedia_article/__main__.py 2
tests/__init__.py 0
tests/test_main.py 33
-----------------------------------------------------
TOTAL 61
36 def main():
37 article = fetch(API_URL) # missing
38 show(article, sys.stdout) # missing
[tool.coverage.run]
omit = ["*/__main__.py"]
If you run both steps again, Coverage.py will report full code
coverage. Let’s make sure you’ll notice any lines that aren’t
exercised by your tests. Configure Coverage.py to fail if the
percentage drops below 100% again:
[tool.coverage.report]
fail_under = 100
Branch Coverage
If an article has an empty summary, random-wikipedia-
article prints a trailing blank line (yikes). Those empty
summaries are rare, but they exist, and this should be a quick
fix. Example 7-1 modifies show to print only nonempty
summaries.
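Example 7-1 isn’t reproduced here; a minimal sketch of the idea—printing
the summary only when it’s nonempty—might look like this (assuming a
Rich-based show function; this is illustrative, not necessarily the
author’s exact code):

def show(article, file):
    console = Console(file=file, width=72, highlight=False)
    console.print(article.title, style="bold", end="\n\n")
    if article.summary:  # skip empty summaries to avoid a blank line
        console.print(article.summary)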
On the other hand, the tests exercised only one of two code
paths through the function—they never skipped the if body.
Coverage.py also supports branch coverage, which looks at all
the transitions between statements in your code and measures
the percentage of those traversed during the tests. You should
always enable it, as it’s more precise than statement coverage:
[tool.coverage.run]
branch = true
Re-run the tests, and you’ll see Coverage.py flag the missing
transition from the if statement on line 33 to the exit of the
function:
TOTAL 57 0 12 1 9
Coverage failure: total of 99 is less than fail-under=100
article = parametrized_fixture(
    Article("test"), *ArticleFactory.build_batch(10)
)
Run the tests again—and they fail! Can you spot the bug in
Example 7-1?
[project]
requires-python = ">=3.7"
$ uv venv -p 3.7
$ uv pip compile --extra=tests pyproject.toml -o py37-dev-requirements.txt
× No solution found when resolving dependencies: ..
Parallel Coverage
If you now re-run Coverage.py under Python 3.7, it reports the
first branch of the if statement as missing. This makes sense:
your code executes the else branch and imports the backport
instead of the standard library.
$ uv venv -p 3.12
$ uv pip sync dev-requirements.txt
$ uv pip install -e . --no-deps
[tool.coverage.run]
parallel = true
Let’s put all of this together. For each Python version, set up the
environment and run the tests, as shown here for Python 3.7:
$ uv venv -p 3.7
$ uv pip sync py37-dev-requirements.txt
$ uv pip install -e . --no-deps
$ py -m coverage run -m pytest
$ py -m coverage combine
Combined data file .coverage.somehost.26719.001909
Combined data file .coverage.somehost.26766.146311
$ py -m coverage report
Measuring in Subprocesses
At the end of “Using Coverage.py”, you had to disable coverage
for the main function and the __main__ module. But the end-
to-end test certainly exercises this code. Let’s remove the #
pragma comment and the omit setting and figure this out.
It turns out you don’t need to. You can place a .pth file in the
environment that calls the function during interpreter startup.
This leverages a little-known Python feature (see “Site
Packages”): the interpreter executes lines starting with an
import statement in a .pth file.
Install a _coverage.pth file into the site-packages directory of
your environment, with the following contents:
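import coverage; coverage.process_startup()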
$ export COVERAGE_PROCESS_START=pyproject.toml
Re-run the test suite, combine the data files, and display the
coverage report. Thanks to measuring coverage in the
subprocess, the program should have full coverage again.
NOTE
That doesn’t imply you should test every single line of code.
Consider a log statement for debugging a rare situation. The
statement may be difficult to exercise from a test. At the same
time, it’s probably low-risk, trivial code. Writing that test won’t
increase your confidence in the code significantly. Exclude the
line from coverage using a pragma comment:
if rare_condition:
    print("got rare condition")  # pragma: no cover
[tool.coverage.run]
source = ["random_wikipedia_article", "tests"]
branch = true
parallel = true
omit = ["*/__main__.py"] # avoid this if you can
[tool.coverage.report]
show_missing = true
fail_under = 100
Summary
You can measure the extent to which the test suite exercises
your project using Coverage.py. Coverage reports are useful for
discovering untested lines. Branch coverage captures the
control flow of your program instead of isolated lines of source
code. Parallel coverage lets you measure coverage across
multiple environments. You need to combine the data files
before reporting. Measuring coverage in subprocesses requires
setting up a .pth file and an environment variable.
1
Ned Batchelder, “You Should Include Your Tests in Coverage”, August 11, 2020.
2
Under the hood, the .coverage file is just a SQLite database. Feel free to poke around
if you have the sqlite3 command-line utility ready on your system.
3
The name parallel is somewhat misleading; the setting has nothing to do with
parallel execution.
4
Martin Fowler, “Legacy Seam”, January 4, 2024.
Chapter 8. Automation with
Nox
import nox

@nox.session
def tests(session):
    session.install(".", "pytest")
    session.run("pytest")
You can try the session with the example project from previous
chapters. For now, add your test dependencies to the
session.install arguments:
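For example, assuming the test dependencies from earlier chapters:

session.install(".", "pytest", "pytest-httpserver", "factory-boy")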
$ nox
nox > Running session tests
nox > Creating virtual environment (virtualenv) using
nox > python -m pip install . pytest
nox > pytest
========================= test session starts ======
...
========================== 21 passed in 0.94s =======
nox > Session tests was successful.
As you can see from the output, Nox starts by creating a virtual
environment for the tests session using virtualenv . If
you’re curious, you can find this environment under the .nox
directory in your project.
NOTE
By default, environments use the same interpreter as Nox itself. In “Working with
Multiple Python Interpreters”, you’ll learn how to run sessions on another
interpreter, and even across multiple ones.
First, the session installs the project and pytest into its
environment. The function session.install is just pip
install underneath. You can pass any appropriate options and
arguments to pip. For example, you can install your
dependencies from a requirements file:
session.install("-r", "dev-requirements.txt")
session.install(".", "--no-deps")
session.install(".[tests]")
Above, you’ve used session.install(".") to install your
project. Behind the scenes, pip builds a wheel using the build
backend you’ve specified in pyproject.toml. Nox runs in the
directory containing noxfile.py, so this command assumes both
files are in the same directory.
Nox lets you use uv instead of virtualenv and pip for creating
environments and installing packages. You can switch the
backend to uv by setting an environment variable:
$ export NOX_DEFAULT_VENV_BACKEND=uv
import shutil
from pathlib import Path
@nox.session
def build(session):
    session.install("build", "twine")
    distdir = Path("dist")
    if distdir.exists():
        shutil.rmtree(distdir)
    session.run("python", "-m", "build")
    session.run("twine", "check", *distdir.glob("*"))
Example 8-2 relies on the standard library for clearing out stale
packages and locating the freshly built ones: Path.glob
matches files against wildcards, and shutil.rmtree removes a
directory and its contents.
TIP
Nox doesn’t implicitly run commands in a shell, unlike tools such as make . Shells
differ widely between platforms, so they’d make Nox sessions less portable. For the
same reason, avoid Unix utilities like rm or find in your sessions—use Python’s
standard library instead!
nox.options.error_on_external_run = True

@nox.session
def build(session):
    session.install("twine")
    session.run("poetry", "build", external=True)
    session.run("twine", "check", *Path().glob("dist/*"))
You’re trading off reliability for speed here. Example 8-2 works
with any build backend declared in pyproject.toml and installs it
in an isolated environment on each run. Example 8-4 assumes
that contributors have a recent version of Poetry on their
system and breaks if they don’t. Prefer the first method unless
every developer environment has a well-known Poetry version.
$ nox --list
Run the checks and tasks for this project.
nox.options.sessions = ["tests"]
Now, when you run nox without arguments, only the tests
session runs. You can still select the build session using the
--session option. Command-line options override values
specified in nox.options in noxfile.py.
TIP
Keep your default sessions aligned with the mandatory checks for your project.
Contributors should be able to run nox without arguments to check if their code
changes are acceptable.
Every time a session runs, Nox creates a fresh virtual
environment and installs the dependencies. This is a good
default, because it makes the checks strict, deterministic, and
repeatable. You won’t miss problems with your code due to
stale packages in the session environment.
$ nox -R
nox > Running session tests
nox > Re-using existing virtual environment at .nox/t
nox > pytest
...
nox > Session tests was successful.
$ nox
nox > Running session tests-3.12
nox > Creating virtual environment (virtualenv) using
nox > python -m pip install '.[tests]'
nox > pytest
...
nox > Session tests-3.12 was successful.
nox > Running session tests-3.11
...
nox > Running session tests-3.10
...
nox > Ran multiple sessions:
nox > * tests-3.12: success
TIP
Did you get errors from pip when you ran Nox just now? Don’t use the same
compiled requirements file for every Python version. You need to lock dependencies
separately for each environment (see “Session Dependencies”).
Session Arguments
So far, the tests session runs pytest without arguments:
session.run("pytest")
session.run("pytest", "--verbose")
But you don’t always want the same options for pytest. For
example, the --pdb option launches the Python debugger on
test failures. The debug prompt can be a lifesaver when you
investigate a mysterious bug. But it’s worse than useless in a CI
context: it would hang forever since there’s nobody to enter
commands. Similarly, when you work on a feature, the -k
option lets you run tests with a specific keyword in their name
—but you wouldn’t want to hardcode it in noxfile.py either.
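Nox makes this configurable: anything you pass after a -- separator
on the command line is available inside the session as
session.posargs. A sketch:

@nox.session
def tests(session):
    session.install(".[tests]")
    session.run("pytest", *session.posargs)  # forward extra arguments

Now you can run nox --session=tests -- --pdb to forward the option
to pytest.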
Automating Coverage
Coverage tools give you a sense of how much your tests exercise
the codebase (see Chapter 7). In a nutshell, you install the
coverage package and invoke pytest via coverage run .
Example 8-7 shows how to automate this process with Nox.
[tool.coverage.paths]
source = ["src", "*/site-packages"]
@nox.session
def coverage(session):
    session.install("coverage[toml]")
    if any(Path().glob(".coverage.*")):
        session.run("coverage", "combine")
    session.run("coverage", "report")
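The tests session from Example 8-7 isn’t shown above; a sketch of
how it might invoke pytest via coverage run:

@nox.session
def tests(session):
    session.install(".[tests]")
    session.run("coverage", "run", "-m", "pytest", *session.posargs)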
NOTE
The coverage session still reports missing coverage for the main function and the
__main__ module. You’ll take care of that in “Automating Coverage in Subprocesses”.
Session Notification
As it stands, this noxfile.py has a subtle problem. Until you run
the coverage session, your project will be littered with data
files waiting to be processed. And if you haven’t run the tests
session recently, the data in those files may be stale—so your
coverage report won’t reflect the latest state of the codebase.
Example 8-9 triggers the coverage session to run automatically
after the test suite. Nox supports this with the session.notify
method. If the notified session isn’t already selected, it runs
after the other sessions have completed.
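A sketch of Example 8-9, combining the tests session with the
notification:

@nox.session
def tests(session):
    session.install(".[tests]")
    try:
        session.run("coverage", "run", "-m", "pytest", *session.posargs)
    finally:
        # Run the coverage session even if the test suite fails.
        session.notify("coverage")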
Automating Coverage in
Subprocesses
Alan Kay, a pioneer in object-oriented programming and
graphical user interface design, once said, “Simple things
should be simple; complex things should be possible.” Many
Nox sessions will be two-liners: a line to install dependencies
and a line to run a command. Yet some automations require
more complex logic, and Nox excels at those, too—primarily by
staying out of your way and deferring to Python as a general-
purpose programming language.
First, you need to determine the location for the .pth file. The
directory is named site-packages, but the exact path depends on
your platform and the Python version. Instead of guessing, you
can query the sysconfig module for it:
sysconfig.get_path("purelib")
If you called the function directly in your session, it would
return a location in the environment where you’ve installed
Nox. Instead, you need to query the interpreter in the session
environment. You can do this by running python with
session.run :
output = session.run(
    "python",
    "-c",
    "import sysconfig; print(sysconfig.get_path('purelib'))",
    silent=True,
)
purelib = Path(output.strip())
(purelib / "_coverage.pth").write_text(
    "import coverage; coverage.process_startup()"
)
def install_coverage_pth(session):
    output = session.run(...)  # see above
    purelib = Path(output.strip())
    (purelib / "_coverage.pth").write_text(...)  # see above

@nox.session
def tests(session):
    session.install(".[tests]")
    install_coverage_pth(session)
    try:
        args = ["coverage", "run", "-m", "pytest", *session.posargs]
        session.run(*args, env={"COVERAGE_PROCESS_START": "pyproject.toml"})
    finally:
        session.notify("coverage")
Parameterizing Sessions
The phrase “works for me” describes a common story: a user
reports an issue with your code, but you can’t reproduce the
bug in your environment. Runtime environments in the real
world differ in a myriad of ways. Testing across Python
versions covers one important variable. Another common cause
of surprise is the packages that your project uses directly or
indirectly—its dependency tree.
@nox.session
@nox.parametrize("django", ["5.*", "4.*", "3.*"])
def tests(session, django):
    session.install(".", "pytest-django", f"django=={django}")
    session.run("pytest")
@nox.session
@nox.parametrize("a", ["1.0", "0.9"])
@nox.parametrize("b", ["2.2", "2.1"])
def tests(session, a, b):
    print(a, b)  # all combinations of a and b

@nox.session
@nox.parametrize(["a", "b"], [("1.0", "2.2"), ("0.9", "2.1")])
def tests(session, a, b):
    print(a, b)  # only the combinations listed above
When running a session across Python versions, you’re
effectively parameterizing the session by the interpreter. In
fact, Nox lets you write the following instead of passing the
versions to @nox.session :

@nox.session
@nox.parametrize("python", ["3.12", "3.11", "3.10"])
def tests(session):
    ...
@nox.session
@nox.parametrize(
    ["python", "django"],
    [
        (python, django)
        for python in ["3.12", "3.11", "3.10"]
        for django in ["3.2.*", "4.2.*"]
        if (python, django) not in [("3.12", "3.2.*")]
    ],
)
def tests(session, django):
    ...
Session Dependencies
If you followed Chapter 4 closely, you may see some problems
with the way Examples 8-8 and 8-11 install packages. Here are
the relevant parts again:
@nox.session
def tests(session):
    session.install(".[tests]")
    ...

@nox.session
def coverage(session):
    session.install("coverage[toml]")
    ...
On the other hand, lock file updates are a constant churn, and
they clutter your Git history. Reducing their frequency comes at
the price of running checks with stale dependencies. If you
don’t require locking for other reasons, such as secure
deployments—and you’re happy to quickly fix a build when an
incompatible release wreaks havoc on your CI—you may prefer
to keep your dependencies unlocked. There’s no such thing as a
free lunch.
In “Development Dependencies”, you grouped dependencies in
extras and compiled requirements files from each. In this
section, I’ll show you a lighter-weight method for locking:
constraints files. You need only a single extra for it. It also
doesn’t require installing the project itself, as extras usually do
—which helps with the coverage session.
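A sketch of how you might compile such a constraints file with uv
(the extra name dev is an assumption):

$ uv pip compile --extra=dev pyproject.toml -o constraints.txt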
Don’t forget to commit the constraints file to source control. You need to share this
file with every contributor, and it needs to be available in CI.
@nox.session
def coverage(session):
    session.install("-c", "constraints.txt", "coverage[toml]")
    ...
def constraints(session):
    filename = f"python{session.python}-{sys.platform}.txt"
    return Path("constraints") / filename

@nox.session(python="3.12")
def coverage(session):
    session.install("-c", constraints(session), "coverage[toml]")
    ...
Before I show you how to use Poetry in Nox sessions, let me call
out a couple of differences between Poetry environments and
Nox environments.
Example 8-19 puts this logic into a helper function that you can
share across your session functions.
def install(session, *, groups, root=True):
    session.run_install(
        "poetry",
        "install",
        "--no-root",
        "--sync",
        f"--only={','.join(groups)}",
        external=True,
    )
    if root:
        session.install(".")
[tool.poetry.group.tests.dependencies]
pytest = ">=7.4.4"
And here’s what the coverage session looks like with the
helper function:
@nox.session
def coverage(session):
    install(session, groups=["coverage"], root=False)
    ...
TIP
How does Poetry know to use a Nox environment instead of the Poetry environment?
Poetry installs packages into the active environment, if one exists. When Nox runs
Poetry, it activates the session environment by exporting the VIRTUAL_ENV
environment variable (see “Virtual Environments”).
Summary
Nox lets you automate checks and tasks for a project. Its Python
configuration file noxfile.py organizes them into one or more
sessions. Sessions are functions decorated with @nox.session .
They receive a single argument session providing the session
API (Table 8-1). Every session runs in an isolated virtual
environment. If you pass a list of Python versions to
@nox.session , Nox runs the session across all of them.
There’s a lot more to Nox that this chapter didn’t cover. For
example, you can use Conda or Mamba to create environments
and install packages. You can organize sessions using keywords
and tags, and assign friendly identifiers using nox.param . Last
but not least, Nox comes with a GitHub Action that makes it
easy to run Nox sessions in CI. Take a look at the official
documentation to learn more.
1
In case you’re wondering, always use the plural form nox.options.sessions in
noxfile.py. On the command line, both --session and --sessions work. You can
specify any number of sessions with these options.
2
Alan Kay, “What Is the Story Behind Alan Kay’s Adage ‘Simple Things Should Be
Simple, Complex Things Should Be Possible’?”, Quora Answer, June 19, 2020.
3
Like pytest, Nox uses the alternate spelling “parametrize” to protect your “E” keycap
from excessive wear.
4
The eagle-eyed reader may notice that python is not a function parameter here. If
you do need it in the session function, use session.python instead.
5
Here, Semantic Versioning constraints harm more than help. Bugs occur in all
releases, and your upstream’s definition of a breaking change may be narrower than
you like. See Hynek Schlawack, “Semantic Versioning Will Not Save You”, March 2,
2021.
6
Run poetry lock --no-update after editing pyproject.toml to update the
poetry.lock file.
Chapter 9. Linting with Ruff
and pre-commit
Linters don’t run a program to discover issues with it; they read
and analyze its source code. This process is known as static
analysis, as opposed to runtime (or dynamic) analysis. It makes
linters both fast and safe—you needn’t worry about side effects,
such as requests to production systems. Static checks can be
smart and also fairly complete—you needn’t hit the right
combination of edge cases to dig up a latent bug.
NOTE
Static analysis is powerful, but you should still write tests for your programs. Where
static checks use deduction, tests use observation. Linters verify a limited set of
generic code properties, while tests can validate that a program satisfies its
requirements.
Linters are also great at enforcing a readable and consistent
style, with a preference for idiomatic and modern constructs
over obscure and deprecated syntax. Organizations have
adopted style guides for years, such as the recommendations in
PEP 8 or the Google Style Guide for Python. Linters can function
as executable style guides: by flagging offending constructs
automatically, they keep code review focused on the meaning of
a change rather than stylistic nitpicks.
But first, let’s look at a typical problem that linters help you
solve.
Linting Basics
The constructs flagged by linters may not be outright illegal.
More often, they just trigger your spider-sense that something
might be wrong. Consider the Python code in Example 9-1.
import subprocess
Linters can detect pitfalls like this, warn you about them, and
even fix them for you. Let’s use a linter named Ruff on the
function—you’ll hear a lot more about it in this chapter. For
now, just note its error message, which identifies the bug:
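A message along these lines (the exact location and rule selection
are illustrative):

$ ruff check --select=B006 example.py
example.py:1:24: B006 Do not use mutable data structures for argument defaults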
TIP
If you manage your project with Rye, the Ruff linter and code formatter are available
under the commands rye lint and rye fmt , respectively.
But wait—how can pipx install a Rust program? The Ruff binary
is available as a wheel on PyPI, so Python folks like you and me
can install it with good old pip and pipx. You could even run it
with py -m ruff .
Run the command ruff check —the front-end for Ruff’s linter.
Without arguments, the command lints every Python file under
your current directory, unless it’s listed in a .gitignore file:
$ ruff check
example.py:2:12: F541 [*] f-string without any placeholders
Found 1 error.
[*] 1 fixable with the `--fix` option.
What it does
Checks for f-strings that do not contain any placeholders.
$ ruff linter
F Pyflakes
E/W pycodestyle
C90 mccabe
I isort
N pep8-naming
D pydocstyle
UP pyupgrade
... (50+ more lines)
You can activate linters and individual rules for your project in
its pyproject.toml file. The setting tool.ruff.lint.select
enables any rules whose code starts with one of the given
prefixes. Out of the box, Ruff enables some basic all-around
checks from Pyflakes and pycodestyle:
[tool.ruff.lint]
select = ["E4", "E7", "E9", "F"]
If you aren’t using an opinionated code formatter, consider enabling the entire E
and W blocks. Their automatic fixes help ensure minimal PEP 8 compliance. They’re
similar to, but not yet as feature-complete as, the autopep8 formatter (see
“Approaches to Code Formatting: autopep8”).
Ruff has too many rules to describe in this book, and more are
being added all the time. How do you find the good ones for
your project? Try them out! Depending on your project, you
may want to enable individual rules ( "B006" ), groups of rules
( "E4" ), entire plugins ( "B" ), or even every existing plugin at
the same time ( "ALL" ).
WARNING
Reserve the special ALL code for experimentation: it will implicitly enable new
linters whenever you upgrade Ruff. Beware: some plugins require configuration to
produce useful results, and some rules conflict with other rules.
TIP
Work toward enforcing the same set of linters for all your projects, with the same
configurations, and prefer default configurations over customizations. This will make
your codebase more consistent and accessible across the entire organization.
The select setting is flexible but purely additive: it lets you opt
into rules whose code starts with a given prefix. The ignore
setting lets you fine-tune in the other direction: it disables
individual rules and rule groups. Like select , it matches rule
codes by their prefixes.
The subtractive method is handy when you need most, but not
all, of a linter’s rules, and when you’re adopting a linter
gradually. The pydocstyle plugin ( D ) checks that every
module, class, and function has a well-formed docstring. Your
project may be almost there, with the exception of module
docstrings ( D100 ). Use the ignore setting to disable all
warnings about missing module docstrings until you’ve fully
onboarded your project:
[tool.ruff.lint]
select = ["D", "E", "F"]
ignore = ["D100"] # Don't require module docstrings
[tool.ruff.lint.per-file-ignores]
"tests/*" = ["S101"] # Tests can use assertions.
Disabling rules should be a last resort. It’s usually better to
suppress individual warnings by adding a special comment to
offending lines. This comment has the form # noqa: followed
by one or more rule codes.
WARNING
Always include rule codes in your noqa comments. Blanket noqa comments can
hide unrelated issues. Marking violations also makes them easier to find when you’re
ready to fix them. Use the rule PGH004 from the pygrep-hooks linter to require rule
codes.
Here’s a Nox session that runs Ruff on every Python file in the
current directory:
@nox.session
def lint(session):
    session.install("ruff")
    session.run("ruff", "check")
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.4.4
    hooks:
      - id: ruff
When you run a hook for the first time, pre-commit clones the
hook repository and installs the linter into an isolated
environment. This can take some time, but you don’t have to do
it often: pre-commit caches the linter environments across
multiple projects.
A Hook Up Close
If you’re curious how a pre-commit hook works under the
hood, take a peek at Ruff’s hook repository. The file .pre-
commit-hooks.yaml in the repository defines the hooks.
Example 9-3 shows an excerpt from the file.
- id: ruff
  name: ruff
  language: python
  entry: ruff check --force-exclude
  args: []
  types_or: [python, pyi]
The hook definition tells pre-commit how to install and run the
linter by specifying its implementation language ( language )
and its command and command-line arguments ( entry and
args ). The Ruff hook is a Python package, so it specifies Python
as the language. The --force-exclude option ensures that you
can exclude files from linting. It tells Ruff to honor its exclude
setting even when pre-commit passes excluded source files
explicitly.
TIP
You can override the args key in your .pre-commit-config.yaml file to pass custom
command-line options to a hook. By contrast, command-line arguments in the entry
key are mandatory—you can’t override them.
Figure 9-2. Three projects with pre-commit hooks for Ruff, Black, and Flake8.
Automatic Fixes
WARNING
Automatic fixes bring tremendous benefits, but they assume some basic Git hygiene:
don’t pile up uncommitted changes in your repository (or stash them before linting).
Pre-commit saves and restores your local modifications in some contexts, but not all.
Let’s try this out. When Ruff detects the mutable argument
default, it indicates that you can enable a “hidden” fix. (Ruff
asks you to opt into the fix because people might conceivably
depend on mutable defaults, say, for caching.) First, enable the
linter rule and the fix in pyproject.toml:
[tool.ruff.lint]
extend-select = ["B006"]
extend-safe-fixes = ["B006"]
Ruff’s pre-commit hook requires you to opt in with the --fix
option, as shown in Example 9-4. The options --show-fixes
and --exit-non-zero-on-fix ensure that all violations are
displayed in the terminal and result in a nonzero exit status,
even if Ruff was able to fix them.
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.4.4
    hooks:
      - id: ruff
        args: ["--fix", "--show-fixes", "--exit-non-zero-on-fix"]
Save Example 9-1 in a file called bad.py, commit the file, and
run pre-commit:
Fixed 1 error:
- bad.py:
    1 × B006 (mutable-argument-default)
If you inspect the modified file, you’ll see that Ruff has replaced
the argument default with None . The empty list is now assigned
inside the function, giving every call its own instance of args :
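A sketch of the transformation (the function is illustrative, not
Example 9-1 verbatim):

# Before the fix:
def frobnicate(args=[]):
    ...

# After the fix:
def frobnicate(args=None):
    if args is None:
        args = []  # every call gets a fresh list
    ...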
Instead of inspecting the modified files, you can also run git
diff to see the changes applied to your code. Alternatively, you
can tell pre-commit to show you a diff of the fixes right away,
using the option --show-diff-on-fail .
That said, you still should include a Nox session for pre-commit
itself. This ensures that you can run all the checks for your
project with a single command, nox . Example 9-5 shows how to
define the session. If your noxfile.py sets nox.options.sessions ,
add the session to that list, as well.
@nox.session
def lint(session):
    options = ["--all-files", "--show-diff-on-fail"]
    session.install("pre-commit")
    session.run("pre-commit", "run", *options, *session.posargs)
But passing this option is easy to forget. If you use Git hooks
other than pre-commit, list them in the .pre-commit-config.yaml
file instead:
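A sketch, assuming a commit message hook in addition to the default:

default_install_hook_types: [pre-commit, commit-msg]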
$ git commit -n
Git hooks control which changes enter your local repository, but
they’re voluntary—they don’t replace CI checks as a gatekeeper
for the default branch in your shared repository. If you already
run Nox in CI, the session in Example 9-5 takes care of that.
def create_frobnicator_factory(the_factory_name,
          interval_in_secs=100,      dbg=False,
        use_singleton=None,frobnicate_factor=4.5):
    if dbg:print('creating frobnication factory '+the_factory_name)
    if(use_singleton): return _frob_sngltn     #we're done
    return FrobnicationFactory( the_factory_name,

          intrvl=interval_in_secs  ,f=frobnicate_factor )

After running autopep8, the function looks like this:

def create_frobnicator_factory(the_factory_name,
                               interval_in_secs=100,
                               dbg=False,
                               use_singleton=None, frobnicate_factor=4.5):
    if dbg:
        print('creating frobnication factory '+the_factory_name)
    if (use_singleton):
        return _frob_sngltn  # we're done
    return FrobnicationFactory(the_factory_name,

                               intrvl=interval_in_secs, f=frobnicate_factor)
You’ll likely find this easier on the eye. For better or worse,
autopep8 didn’t touch some other questionable stylistic choices,
such as the rogue blank line in the return statement and the
inconsistent quote characters. Autopep8 uses pycodestyle to
detect issues, and pycodestyle had no complaint here.
TIP
Unlike most code formatters, autopep8 lets you apply selected fixes by passing --
select with appropriate rule codes. For example, you can run autopep8 --
select=E111 to enforce four-space indentation.
def create_frobnicator_factory(the_factory_name,
                               interval_in_secs=100,
                               dbg=False,
                               use_singleton=None,
                               frobnicate_factor=4.5):
    if dbg: print('creating frobnication factory ' + the_factory_name)
    if (use_singleton): return _frob_sngltn  #we're done
    return FrobnicationFactory(the_factory_name,

                               intrvl=interval_in_secs,
                               f=frobnicate_factor)
def create_frobnicator_factory(
    the_factory_name,
    interval_in_secs=100,
    dbg=False,
    use_singleton=None,
    frobnicate_factor=4.5,
):
    if dbg:
        print("creating frobnication factory " + the_factory_name)
    if use_singleton:
        return _frob_sngltn  # we're done
    return FrobnicationFactory(
        the_factory_name, intrvl=interval_in_secs, f=frobnicate_factor
    )
Black took the Python world by storm, with project after project
deciding to “blacken” their source files.
ONBOARDING A CODEBASE TO BLACK
Second, are the changes safe? Black guarantees that the abstract
syntax tree (AST) of the source code—that is, the parsed
representation of the program, as seen by the interpreter—
doesn’t change, except for some well-known divergences that
preserve semantic equivalence.
Third, when you commit the changes, how do you prevent them
from cluttering up the output of git blame ? It turns out that
you can configure Git to ignore the commit when annotating
files. Store the full 40-character commit hash in a file named
.git-blame-ignore-revs in the root of the repository. Then run the
following command:
$ git config blame.ignoreRevsFile .git-blame-igno
NOTE
Reducing dependencies between edits helps different people work on the same code.
But it also lets you separate or reorder drive-by bugfixes or refactorings and back out
tentative commits before submitting your changes for code review.
Black takes some cues from the formatted source code besides
comments. One example is the blank lines that divide a
function body into logical partitions. Likely the most powerful
way of affecting Black’s output, however, is the magic trailing
comma: if a sequence contains a trailing comma, Black splits its
elements across multiple lines, even if they would fit on a single
line.
@pytest.mark.parametrize(
    ("value", "expected"),
    [
        ("first test value", "61df19525cf97aa38..."),
        ("another test value", "5768979c48c30998c..."),
        ("and here's another one", "e766977069039d83f..."),
    ],
)  # fmt: skip
Ruff aims for full compatibility with the Black code style. Unlike
Black, Ruff lets you opt into single quotes and indentation using
tabs. However, I’d recommend adhering to Black’s widely
adopted style.
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.4.4
    hooks:
      - id: ruff
        args: ["--fix", "--show-fixes", "--exit-non-zero-on-fix"]
      - id: ruff-format
The pre-commit hook for the code formatter comes last. This
gives it an opportunity to reformat any automatic fixes made by
linters.
Summary
In this chapter, you’ve seen how to improve and preserve the
code quality in your projects using linters and code formatters.
Ruff is an efficient reimplementation of many Python code-
quality tools in Rust, including Flake8 and Black. While it’s
possible to run Ruff and other tools manually, you should
automate this process and include it as a mandatory check in
CI. One of the best options is pre-commit, a cross-language
linter framework with Git integration. Invoke pre-commit from
a Nox session to keep a single entry point for your suite of
checks.
1
The B short code activates a group of checks pioneered by flake8-bugbear , a
plugin for the Flake8 linter.
2
Charlie Marsh, “Python Tooling Could Be Much, Much Faster”, August 30, 2022.
3
As of this writing, you’ll also need to enable Ruff’s preview mode. Set
tool.ruff.lint.preview to true .
4
My reviewer Hynek points out that setting your project to ALL can be a great way to
learn about antipatterns from experienced people. You can still opt out of a rule after
understanding its rationale. Enabling ALL requires a bit more work on your side, but
it ensures you don’t miss new rules in a Ruff release.
5
What’s wrong with assertions? Nothing, but Python skips them when run with -O
for optimizations— a common way to speed up production environments. So don’t
use assert to validate untrusted input!
6
Kent Beck and Martin Fowler describe code smells as “certain structures in the code
that suggest—sometimes, scream for—the possibility of refactoring.” Martin Fowler,
Refactoring: Improving the Design of Existing Code, second edition (Boston: Addison-
Wesley, 2019).
7
Running pre-commit from Git is the safest way to run linters with automatic fixes:
pre-commit saves and restores any changes you haven’t staged, and it rolls back the
fixes if they conflict with your changes.
8
Charlie Marsh, “The Ruff Formatter”, October 24, 2023.
9
Stephen C. Johnson, the author of Lint, also established this infamous naming
convention by writing Yacc (Yet Another Compiler-Compiler) in the early 1970s at
Bell Labs.
10
“AST Before and After Formatting”, Black documentation. Last accessed: March 22,
2024.
11
You can inspect the AST of a source file with the standard ast module, using py -m
ast example.py .
Chapter 10. Using Types for
Safety and Inspection
>>> import math
>>> math.sqrt("1.21")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: must be real number, not str
Python isn’t as forgiving as some of its contemporaries, but
consider two limitations of this type check. First, you won’t see
a TypeError until you run the offending code. Second, the
Python interpreter doesn’t raise the error—the library function
checks explicitly if something other than an integer or floating-
point number was passed.
def join_all(joinables):
    for task in joinables:
        task.join()
You can use this function with Thread or Process from the
standard threading or multiprocessing modules—or with
any other object that has a join method with the correct
signature. (You can’t use it with strings because str.join takes
an argument—an iterable of strings.) Duck typing means that
these classes don’t need a common base class to benefit from
reuse. All the types need is a join method with the correct
signature.
Duck typing is great because the function and its callers can
evolve fairly independently—a property known as loose
coupling. Without duck typing, a function argument has to
implement an explicit interface that specifies its behavior.
Python gives you loose coupling for free: you can pass literally
anything, as long as it satisfies the expected behavior.
If you use type annotations in your own code, you reap more
benefits. First, you’re also a user of your own functions, classes,
and modules—so all the benefits previously mentioned apply,
like auto-completion and type checking. Additionally, you’ll find
it easier to reason about your code, refactor it without
introducing subtle bugs, and build a clean software
architecture. When you are a library author, typing lets you
specify an interface contract your users can rely on, while
you’re free to evolve the implementation.
Even a decade after their introduction, type annotations aren’t
free of controversy—maybe understandably so, given Python’s
proud stance as a dynamically typed language. Adding types to
existing code poses similar challenges as introducing unit tests
to a codebase that wasn’t written with testing in mind. Just as
you may need to refactor for testability, you may need to
refactor for “typability”—replacing deeply nested primitive
types and highly dynamic objects with simpler and more
predictable types. You’ll likely find it worth the effort.
In this chapter, you’ll learn how to verify the type safety of your
Python programs using the static type checker mypy and the
runtime type checker Typeguard. You’ll also see how runtime
inspection of type annotations can greatly enhance the
functionality of your programs. But first, let’s take a look at the
typing language that has evolved within Python over the past
decade.
mypy Playground
Pyright Playground
Pyre Playground
Variable Annotations
You can annotate a variable with the type of values that it may
be assigned during the course of the program. The syntax for
such type annotations consists of the variable name, a colon,
and a type:
answer: int = 42
Besides the simple built-in types like bool , int , float , str ,
or bytes , you can also use standard container types in type
annotations, such as list , tuple , set , or dict . For example,
here’s how you might initialize a variable used to store a list of
lines read from a file:
lines: list[str] = []
Any class you define in your own Python code is also a type:
class Parrot:
    pass

class NorwegianBlue(Parrot):
    pass
TIP
Typing rules also permit assignments if the type on the right is consistent with that
on the left. This lets you assign an int to a float , even though int isn’t derived
from float . The Any type is consistent with any other type (see “Gradual Typing”).
Union Types
if readme.exists():
    description = readme.read_text()
The good news is that type checkers can warn you when you’re
using a variable that’s potentially None . This can greatly reduce
the risk of crashes in production systems.
How do you tell the type checker that your use of description
is fine? Generally, you should just check that the variable isn’t
None . The type checker will pick up on this and allow you to
use the variable:
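A sketch, continuing the readme example:

description: str | None = None
if readme.exists():
    description = readme.read_text()

if description is not None:
    print(description)  # ok: description is narrowed to str here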
If you already know that the value has the right type, you can
help out the type checker using the cast function from the
typing module:
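For example (get_description is a hypothetical helper that’s known
to return a string):

from typing import cast

description = cast(str, get_description())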
Gradual Typing
number: object = 2
print(number + number)  # error: Unsupported left operand type for + ("object")
There’s another type in Python that, like object , can hold any
value. It’s called Any (for obvious reasons), and it’s available
from the standard typing module. When it comes to behavior,
Any is object ’s polar opposite. You can invoke any operation
on a value of type Any —conceptually, it behaves like the
intersection of all possible types. Any serves as an escape hatch
that lets you opt out of type checking for a piece of code:
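For example (a sketch; parse is a hypothetical function):

from typing import Any

def parse(line: str) -> Any:
    ...

data = parse(line)  # data has type Any
data.frobnicate()   # the type checker won't complain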
WARNING
When you’re working in typed Python code, watch out for Any . It can disable type
checking to a surprising degree. For example, if you access attributes or invoke
operations on Any values, you’ll end up with more Any values.
The Any type is Python’s hat trick that lets you restrict type
checking to portions of a codebase—formally known as gradual
typing. In variable assignments and function calls, Any is
consistent with every other type, and every type is consistent
with it.
Function Annotations
import subprocess
from typing import Any
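For example, here’s how you might annotate a small wrapper around
subprocess.run (a sketch, not necessarily the book’s exact example):

def run(*args: str) -> subprocess.CompletedProcess[str]:
    return subprocess.run(args, capture_output=True, check=True, text=True)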
Annotating Classes
class Swallow:
    def __init__(self, velocity: float) -> None:
        self.velocity = velocity
import math
from dataclasses import dataclass

@dataclass
class Point:
    x: float
    y: float

@dataclass
class Point:
    def distance(self, other: "Point") -> float:
        ...

@dataclass
class Point:
    def distance(self, other: Point) -> float:
        ...
The third method does not help with all forward references, but
it does here. You can use the special Self type to refer to the
current class:
from typing import Self

@dataclass
class Point:
    def distance(self, other: Self) -> float:
        ...
Type Aliases
You can use the type keyword to introduce an alias for a type:
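For example (the alias name is illustrative):

type UserID = int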
Generics
Here’s how you might use the generic function in your code:
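The generic function itself isn’t shown above; a sketch, using the
name first mentioned in the note below:

from collections.abc import Iterable

def first[T](values: Iterable[T]) -> T:
    for value in values:
        return value
    raise ValueError("values is empty")

fruits = ["banana", "orange", "apple"]
fruit: str = first(fruits)  # the type checker infers first(fruits) as str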
NOTE
Generics with the [T] syntax are supported in Python 3.12+ and the Pyright type
checker. If you get an error, omit the [T] suffix from first and use TypeVar from
the typing module:
T = TypeVar("T")
Protocols
from typing import Protocol

class Joinable(Protocol):
    def join(self) -> None: ...
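With this protocol, you can annotate the join_all function from
earlier (a sketch):

from collections.abc import Iterable

def join_all(joinables: Iterable[Joinable]) -> None:
    for task in joinables:
        task.join()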
If you use Poetry, add mypy to your project using poetry add :
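A sketch (the typing group name is an assumption):

$ poetry add --group=typing mypy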
$ py -m mypy src
Success: no issues found in 2 source files
import textwrap
$ py -m mypy example.py
example.py:5: error: Argument 1 to "fill" has incompatible type "str | None";
expected "str" [arg-type]
Found 1 error in 1 file (checked 1 source file)
You could store an empty string when this happens. But let’s be
principled: an empty summary isn’t the same as no summary at
all. Let’s store None when the response omits the field.
@dataclass
class Article:
    title: str = ""
    summary: str | None = None
Presumably, mypy will flag this, just like it did above. Yet
when you run it on the file, it's all sunshine. Can you guess
why? (Hint: by default, mypy doesn't check the body of a
function that has no type annotations.)
$ py -m mypy src
Success: no issues found in 2 source files
Strict Mode
$ py -m mypy src
__init__.py:16: error: Function is missing a type annotation
__init__.py:22: error: Function is missing a type annotation
__init__.py:27: error: Function is missing a return type annotation
__init__.py:27: note: Use "-> None" if function does not return a value
__init__.py:28: error: Call to untyped function "fetch" in typed context
__init__.py:29: error: Call to untyped function "show" in typed context
__main__.py:3: error: Call to untyped function "main" in typed context
Found 6 errors in 2 files (checked 2 source files)
import json
import sys
import textwrap
import urllib.request
from dataclasses import dataclass
from typing import Final, TextIO

@dataclass
class Article:
    title: str = ""
    summary: str = ""
TIP
My other favorite mypy setting in pyproject.toml is the pretty flag. It displays source
snippets and indicates where the error occurred:
[tool.mypy]
pretty = true
Let mypy’s strict mode be your North Star when adding types to
an existing Python codebase. Mypy gives you an arsenal of
finer- and coarser-grained ways to relax type checking when
you’re not ready to fix a type error.
Your first line of defense is a special comment of the form
# type: ignore. Always follow it with the error code in square
brackets. For example, here's a line from mypy's previous
output with the error code included:
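Taking the last error from the strict-mode run above, with its code (no-untyped-call) included:

__main__.py:3: error: Call to untyped function "main" in typed context  [no-untyped-call]

You can silence just this diagnostic on the offending line:

main()  # type: ignore[no-untyped-call]

If a single comment isn't enough, you can relax checks for entire modules in pyproject.toml. For example, to allow calls to untyped functions from one module: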
[tool.mypy."<module>"]
allow_untyped_calls = true
Or allow them across the entire codebase:

[tool.mypy]
allow_untyped_calls = true
You can even disable all type errors for a given module:
[tool.mypy."<module>"]
ignore_errors = true
import nox

@nox.session(python=["3.12", "3.11", "3.10"])
def mypy(session: nox.Session) -> None:
    session.install(".[typing]")
    session.run("mypy", "src")
Just like you run the test suite across all supported Python
versions, you should also type-check your project on every
Python version. This practice is fairly effective at ensuring that
your project is compatible with those versions, even when your
test suite doesn’t exercise that one code path where you forgot
about backward compatibility.
NOTE
You can also pass the target version using mypy’s --python-version option.
However, installing the project on each version ensures that mypy checks your
project against the correct dependencies. These may not be the same on all Python
versions.
You may wonder why the Nox session in Example 10-5 installs
the project into mypy’s virtual environment. By nature, a static
type checker operates on source code; it doesn’t run your code.
So why install anything but the type checker itself?
The answer lies in your project's dependencies. Rich and httpx
are, in fact, fully type annotated: they include
an empty marker file named py.typed next to their source files.
When you install the packages into a virtual environment, the
marker file allows static type checkers to locate their types.
If a dependency doesn't distribute type information, you can
silence mypy's import errors for it:

[[tool.mypy.overrides]]
module = "<package>"
ignore_missing_imports = true
NOTE
Python’s standard library doesn’t include type annotations. Type checkers vendor the
third-party package typeshed for standard library types, so you don’t have to worry
about supplying them.
Treat your tests like you would treat any other code. Type
checking your tests helps you detect when they use your
project, pytest, or testing libraries incorrectly.
TIP
Running mypy on your test suite also type-checks the public API of your project. This
can be a good fallback when you’re unable to fully type your implementation code
for every supported Python version.
The test suite imports your package from the environment. The
type checker therefore expects your package to distribute type
information. Add an empty py.typed marker file to your import
package, next to the __init__ and __main__ modules (see
“Distributing Types with Python Packages”).
There isn’t anything inherently special about typing a test suite.
Recent versions of pytest come with high-quality type
annotations. These help when your tests use one of pytest’s
built-in fixtures. Many test functions don’t have arguments and
return None . Here’s a slightly more involved example using a
fixture and test from Chapter 6:
import io

import pytest

from random_wikipedia_article import Article, show

@pytest.fixture
def file() -> io.StringIO:
    return io.StringIO()
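The test itself isn't included in this excerpt; a sketch of what it might look like (the article contents are illustrative):

def test_show_title(file: io.StringIO) -> None:
    article = Article("Lorem Ipsum", "Lorem ipsum dolor sit amet.")
    show(article, file)
    assert "Lorem Ipsum" in file.getvalue()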
Recall that the fetch function instantiates the class like this:
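The line isn't reproduced in this excerpt; given the pattern-matching example later in the chapter, it presumably looks like this:

return Article(data["title"], data["extract"])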
The Zen of Python says, "Special cases aren't special enough to
break the rules."⁸ Dataclasses are no exception to this
principle: they're plain Python classes without any secret sauce.
Given that the class doesn’t define the method itself, there’s
only one possible origin for it: the @dataclass class decorator.
In fact, the decorator synthesizes the __init__ method on the
fly, along with several other methods, using your type
annotations! Don’t take my word for it, though. In this section,
you’re going to write your own miniature @dataclass
decorator.
WARNING
Don't use this in production! Use the standard dataclasses module, or better: the
attrs library. Attrs is an actively maintained, industrial-strength implementation
with better performance, a clean API, and additional features, and it directly inspired
dataclasses.
The typing language allows you to refer to, say, the str class by
writing type[str]. You can read this aloud as "the type of a
string." (You can't use str on its own here. In a type
annotation, str just refers to an individual string.) A class
decorator should work for any class object, though—it should
be generic. Therefore, you'll use a type variable instead of an
actual class like str:⁹
from typing import dataclass_transform

@dataclass_transform()
def dataclass[T](cls: type[T]) -> type[T]:
    ...
With the function signature out of the way, let’s think about
how to implement the decorator. You can break this into two
steps. First, you’ll need to assemble a string with the source
code of the __init__ method, using the type annotations on
the dataclass. Second, you can use Python’s built-in exec
function to evaluate that source code in the running program.
Let’s tackle the first step: assembling the source code from the
annotations (Example 10-8). Don’t fret too much about the
parameter types at this point—just use the __name__ attribute
of each parameter type, which will work in many cases.
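Example 10-8 isn't reproduced in this excerpt; a minimal sketch of such a function, with names and details as assumptions, might look like this:

def build_dataclass_init(cls: type) -> str:
    """Assemble source code for __init__ from the class's annotations."""
    annotations = cls.__dict__.get("__annotations__", {})
    parameters = ", ".join(
        f"{name}: {type_.__name__}" for name, type_ in annotations.items()
    )
    body = "\n".join(f"    self.{name} = {name}" for name in annotations)
    return f"def __init__(self, {parameters}) -> None:\n{body or '    pass'}"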
You can now pass the source code to the exec built-in. Apart
from the source code, this function accepts dictionaries for the
global and local variables.
The canonical way to retrieve the global variables is the
globals() built-in. However, you need to evaluate the source
code in the context of the module where the class is defined,
rather than the context of your decorator. Python stores the
name of that module in the __module__ attribute of the class,
so you can look up the module object in sys.modules and
retrieve the variables from its __dict__ attribute (see “The
Module Cache”):
globals = sys.modules[cls.__module__].__dict__
import sys

@dataclass_transform()
def dataclass[T](cls: type[T]) -> type[T]:
    sourcecode = build_dataclass_init(cls)
    globals = sys.modules[cls.__module__].__dict__
    locals = {}
    exec(sourcecode, globals, locals)
    cls.__init__ = locals["__init__"]
    return cls
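To see the decorator in action, apply it the way you'd apply the standard one (a quick sketch):

@dataclass
class Point:
    x: float
    y: float

point = Point(1.0, 2.0)  # invokes the synthesized __init__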
As you may have guessed, mypy silently passes over this issue,
because json.load returns Any. How can we make the
function type-safe? As a first step, let's replace Any with the
JSON type you defined in "Type Aliases":
$ py -m mypy src
error: Value of type "..." is not indexable
error: No overload variant of "__getitem__" matches argument type "str"
error: Argument 1 to "Article" has incompatible type "..."; expected "str"
error: Invalid index type "str" for "JSON"; expected "..."
error: Argument 2 to "Article" has incompatible type "..."; expected "str | None"
Found 5 errors in 1 file (checked 1 source file)
match data:
    case {"title": str(title), "extract": str(extract)}:
        return Article(title, extract)
[project]
dependencies = ["cattrs>=23.2.3"]
import cattrs
import cattrs.gen

converter = cattrs.Converter()
converter.register_structure_hook(
    Article,
    cattrs.gen.make_dict_structure_fn(
        Article,
        converter,
        summary=cattrs.gen.override(rename="extract"),
    ),
)
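With the hook registered, structuring the raw object into an Article is a one-liner (a sketch; converter.structure is cattrs's standard entry point):

article = converter.structure(data, Article)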
Dynamic code
External systems
[project]
dependencies = ["typeguard>=4.2.1"]
The checks can also be more elaborate. For example, you can
use the TypedDict construct to specify the precise shape of a
JSON object you've fetched from some external service, such as
the keys you expect to find and which types their associated
values should have:¹²
from typing import Any, TypedDict

from typeguard import check_type

class Person(TypedDict):
    name: str
    age: int

# TypedDict bodies may contain only field annotations, so the check
# lives in a module-level helper rather than a method.
def check_person(data: Any) -> Person:
    return check_type(data, Person)
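Calling the helper on untrusted data either returns the typed dictionary or raises typeguard's TypeCheckError (a sketch):

person = check_person({"name": "Brian", "age": 33})  # ok
check_person({"name": "Brian"})  # raises TypeCheckError (missing "age")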
import typeguard
from typeguard import CollectionCheckStrategy

typeguard.config.collection_check_strategy = CollectionCheckStrategy.ALL_ITEMS
import nox

package = "random_wikipedia_article"

@nox.session
def typeguard(session: nox.Session) -> None:
    session.install(".[tests]", "typeguard")
    session.run("pytest", f"--typeguard-packages={package}")
Summary
Type annotations let you specify the types of variables and
functions in your source code. You can use built-in types and
user-defined classes, as well as many higher-level constructs,
such as union types, Any for gradual typing, generics, and
protocols. Stringized annotations and Self are useful for
handling forward references. The type keyword lets you
introduce type aliases.
Static type checkers like mypy leverage type annotations and
type inference to verify the type safety of your program without
running it. Mypy facilitates gradual typing by defaulting to Any
for unannotated code. You can and should enable strict mode
where possible to allow for more thorough checks. Run mypy as
part of your mandatory checks, using a Nox session for
automation.
The earlier you identify a software defect, the smaller the cost
of fixing it. In the best case, you discover issues while they're
still in your editor—their cost is near zero. In the worst case,
you ship the bug to production. Before even starting to track
down the issue in the code, you may have to roll back the bad
deployment and contain its impact. For this reason, shift all
your checks as far to the left on that timeline as possible, away
from production and toward your editor.
Thank you for reading this book! While the book ends here,
your journey through the ever-shifting landscape of modern
Python developer tooling continues. Hopefully, the lessons from
this book will remain valid and helpful, as Python continues to
reinvent itself.
1
Jukka Lehtosalo, “Our Journey to Type Checking 4 Million Lines of Python”,
September 5, 2019.
2
“Specification for the Python Type System”, last accessed January 22, 2024.
3
Tin Tvrtković, “Python Is Two Languages Now, and That’s Actually Great”, February
27, 2023.
4
In a future Python version, this will work out of the box. See Larry Hastings, “PEP
649 – Deferred Evaluation of Annotations Using Descriptors”, January 11, 2021.
5
If you see an error message like “PEP 695 type aliases are not yet supported,” just
omit the type keyword for now. The type checker still interprets the assignment as a
type alias. If you want to be more explicit, you can use the typing.TypeAlias
annotation from Python 3.10 upward.
6
For brevity, I’ve removed error codes and leading directories from mypy’s output.
7
As of this writing, the upcoming release of factory-boy is expected to distribute
types inline.
8
Tim Peters, “PEP 20 – The Zen of Python”, August 19, 2004.
9
As of this writing, mypy hasn’t yet added support for PEP 695 type variables. If you
get a mypy error, type-check the code in the Pyright playground instead or use the
older TypeVar syntax.
10
In fact, the cattrs library is format-agnostic, so it doesn’t matter if you read the
raw object from JSON, YAML, TOML, or another data format.
11
If you’re interested in this topic, you absolutely should read Architecture Patterns in
Python by Harry Percival and Bob Gregory (Sebastopol: O’Reilly, 2020).
12
This is less useful than it may seem. TypedDict classes must list every field even if
you use only a subset.
13
If you call check_type directly, you’ll need to pass the
collection_check_strategy argument explicitly.
Index

M

macOS
    Anaconda/Conda support, Other Linux Distributions, Installing Python from Anaconda-Installing Python from Anaconda
    Coverage.py and, Using Coverage.py, Measuring in Subprocesses
    environments, Python modules-A look under the hood, Wheels and Sdists
    modules, Python modules-Python modules, The standard library
    Poetry and, The Lock File, Managing Environments
    Python installation, Locating Python Interpreters-Locating Python Interpreters, Installing Python on macOS-The python.org Installers, Other Linux Distributions, Installing Python with pyenv, An Overview of Installers-Summary, A Tour of Python Environments-Python Installations
    Python Launcher, The Python Launcher for Unix-The Python Launcher for Unix, Finding Python Modules
Maturin, Building Packages with build
metadata, Python modules
    (see also importlib-metadata; packages)
    core, Wheels and Sdists
    project, Python modules, Project Metadata-Dependencies and Optional Dependencies, The Project Metadata-The Project Metadata
MicroPython, Installing Python
Ming, Frost, Managing Projects with Poetry
Miniconda, Installing Python from Anaconda
    (see also Anaconda)
Miniforge, Installing Python from Anaconda
modules, Python Environments-Python Environments, Python modules-Python modules, Module Objects-Module Specs, Why Packaging?-Why Packaging?
    (see also specific module names)
monkey patching, Designing for Testability
mutable argument default, Linting Basics
mypy, The pyproject.toml File, Environment Markers, Linting Basics, Using Types for Safety and Inspection, Static Type Checking with mypy-Type Checking the Tests

O

overlay distribution, Homebrew Python