Incrementally download wheels at workspace time. by hrfuller · Pull Request #432 · bazel-contrib/rules_python · GitHub
Incrementally download wheels at workspace time. #432

Merged: 14 commits, Mar 22, 2021
4 changes: 2 additions & 2 deletions .bazelrc
@@ -3,7 +3,7 @@
# This lets us glob() up all the files inside the examples to make them inputs to tests
# (Note, we cannot use `common --deleted_packages` because the bazel version command doesn't support it)
# To update these lines, run tools/bazel_integration_test/update_deleted_packages.sh
-build --deleted_packages=examples/legacy_pip_import/boto,examples/legacy_pip_import/extras,examples/legacy_pip_import/helloworld,examples/pip_install
-query --deleted_packages=examples/legacy_pip_import/boto,examples/legacy_pip_import/extras,examples/legacy_pip_import/helloworld,examples/pip_install
+build --deleted_packages=examples/legacy_pip_import/boto,examples/legacy_pip_import/extras,examples/legacy_pip_import/helloworld,examples/pip_install,examples/pip_parse
+query --deleted_packages=examples/legacy_pip_import/boto,examples/legacy_pip_import/extras,examples/legacy_pip_import/helloworld,examples/pip_install,examples/pip_parse

test --test_output=errors
4 changes: 4 additions & 0 deletions .gitignore
@@ -37,3 +37,7 @@ bazel-bin
bazel-genfiles
bazel-out
bazel-testlogs

# vim swap files
*.swp
*.swo
36 changes: 35 additions & 1 deletion README.md
@@ -105,7 +105,7 @@ target in the appropriate wheel repo.

### Importing `pip` dependencies

-To add pip dependencies to your `WORKSPACE` is you load
+To add pip dependencies to your `WORKSPACE` load
the `pip_install` function, and call it to create the
individual wheel repos.

@@ -136,6 +136,40 @@ re-executed in order to pick up a non-hermetic change to your environment (e.g.,
updating your system `python` interpreter), you can completely flush out your
repo cache with `bazel clean --expunge`.

### Fetch `pip` dependencies lazily (experimental)

One pain point with `pip_install` is that every dependency resolved from your
requirements.txt must be downloaded before the Bazel analysis phase can start. For large Python monorepos
this can take a long time, especially on slow connections.

`pip_parse` addresses this problem. If you provide a lock file of all your Python
dependencies, `pip_parse` translates each requirement into its own external repository.
Bazel then fetches and builds wheels only for the requirements in the subgraph of your build target.

There are API differences between `pip_parse` and `pip_install`:
1. `pip_parse` requires a fully resolved lock file of your Python dependencies. You can generate this using
`pip-compile`, or a virtualenv and `pip freeze`. `pip_parse` uses a label argument called `requirements_lock` instead of `requirements`
to make this distinction clear.
2. `pip_parse` translates your requirements into a Starlark macro called `install_deps`. You must call this macro in your WORKSPACE to
declare your dependencies.


```python
load("@rules_python//python:pip.bzl", "pip_parse")

# Create a central repo that knows about the dependencies needed from
# requirements_lock.txt.
pip_parse(
    name = "my_deps",
    requirements_lock = "//path/to:requirements_lock.txt",
)

# Load the starlark macro which will define your dependencies.
load("@my_deps//:requirements.bzl", "install_deps")
# Call it to define repos for your requirements.
install_deps()
```
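
To make the "one external repository per requirement" idea concrete, here is a minimal Python sketch. It is not the rules_python implementation: the function names and the `<prefix>_<package>` repository naming scheme are assumptions made for illustration, and the parser only handles plain `name==version` pins (no hashes, extras, or environment markers).

```python
import re
from typing import List, Tuple


def parse_lock_lines(lines: List[str]) -> List[Tuple[str, str]]:
    """Return (name, version) pairs for every pinned `name==version` line."""
    pinned = []
    for raw in lines:
        # Drop comments such as "# via requests" emitted by pip-compile.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        name, sep, version = line.partition("==")
        if not sep:
            # A lock file is expected to pin every requirement exactly.
            raise ValueError("requirement is not pinned: %r" % raw)
        pinned.append((name.strip(), version.strip()))
    return pinned


def repo_name(prefix: str, requirement: str) -> str:
    # Hypothetical naming scheme; the real rule may name repositories differently.
    return "%s_%s" % (prefix, re.sub(r"[-_.]+", "_", requirement).lower())


LOCK = """\
certifi==2020.12.5
    # via requests
requests==2.24.0
    # via -r requirements.txt
"""

# One (hypothetical) external repository per pinned requirement.
repos = {repo_name("my_deps", n): v for n, v in parse_lock_lines(LOCK.splitlines())}
```

Because each requirement maps to its own repository, Bazel can fetch only the ones a target actually depends on, which is what the section above describes.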

### Importing `pip` dependencies with `pip_import` (legacy)

The deprecated `pip_import` can still be used if needed.
5 changes: 5 additions & 0 deletions examples/BUILD
@@ -26,3 +26,8 @@ bazel_integration_test(
    name = "pip_install_example",
    timeout = "long",
)

bazel_integration_test(
    name = "pip_parse_example",
    timeout = "long",
)
42 changes: 42 additions & 0 deletions examples/pip_parse/BUILD
@@ -0,0 +1,42 @@
load("@pip_parsed_deps//:requirements.bzl", "requirement")
load("@rules_python//python:defs.bzl", "py_binary", "py_test")

# Toolchain setup, this is optional.
# Demonstrate that we can use the same python interpreter for the toolchain and executing pip in `pip_parse` (see WORKSPACE).
#
#load("@rules_python//python:defs.bzl", "py_runtime_pair")
#
#py_runtime(
#    name = "python3_runtime",
#    files = ["@python_interpreter//:files"],
#    interpreter = "@python_interpreter//:python_bin",
#    python_version = "PY3",
#    visibility = ["//visibility:public"],
#)
#
#py_runtime_pair(
#    name = "my_py_runtime_pair",
#    py2_runtime = None,
#    py3_runtime = ":python3_runtime",
#)
#
#toolchain(
#    name = "my_py_toolchain",
#    toolchain = ":my_py_runtime_pair",
#    toolchain_type = "@bazel_tools//tools/python:toolchain_type",
#)
# End of toolchain setup.

py_binary(
    name = "main",
    srcs = ["main.py"],
    deps = [
        requirement("requests"),
    ],
)

py_test(
    name = "test",
    srcs = ["test.py"],
    deps = [":main"],
)
39 changes: 39 additions & 0 deletions examples/pip_parse/WORKSPACE
@@ -0,0 +1,39 @@
workspace(name = "example_repo")

load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "rules_python",
    url = "https://github.com/bazelbuild/rules_python/releases/download/0.1.0/rules_python-0.1.0.tar.gz",
    sha256 = "b6d46438523a3ec0f3cead544190ee13223a52f6a6765a29eae7b7cc24cc83a0",
)

load("@rules_python//python:pip.bzl", "pip_parse")

pip_parse(
    # (Optional) You can provide extra parameters to pip.
    # Here, make pip output verbose (this is usable with `quiet = False`).
    # extra_pip_args = ["-v"],

    # (Optional) You can exclude custom elements in the data section of the generated BUILD files for pip packages.
    # Exclude directories with spaces in their names in this example (avoids build errors if there are such directories).
    #pip_data_exclude = ["**/* */**"],

    # (Optional) You can provide a python_interpreter (path) or a python_interpreter_target (a Bazel target, that
    # acts as an executable). The latter can be anything that could be used as Python interpreter. E.g.:
    # 1. Python interpreter that you compile in the build file (as above in @python_interpreter).
    # 2. Pre-compiled python interpreter included with http_archive
    # 3. Wrapper script, like in the autodetecting python toolchain.
    #python_interpreter_target = "@python_interpreter//:python_bin",

    # (Optional) You can set quiet to False if you want to see pip output.
    #quiet = False,

    # Uses the default repository name "pip_parsed_deps"
    requirements_lock = "//:requirements_lock.txt",
)

load("@pip_parsed_deps//:requirements.bzl", "install_deps")

# Initialize repositories for all packages in requirements_lock.txt.
install_deps()
5 changes: 5 additions & 0 deletions examples/pip_parse/main.py
@@ -0,0 +1,5 @@
import requests


def version():
    return requests.__version__
1 change: 1 addition & 0 deletions examples/pip_parse/requirements.txt
@@ -0,0 +1 @@
requests==2.24.0
16 changes: 16 additions & 0 deletions examples/pip_parse/requirements_lock.txt
@@ -0,0 +1,16 @@
#
# This file is autogenerated by pip-compile
# To update, run:
#
# pip-compile --output-file=requirements_lock.txt requirements.txt
#
certifi==2020.12.5
# via requests
chardet==3.0.4
# via requests
idna==2.10
# via requests
requests==2.24.0
# via -r requirements.txt
urllib3==1.25.11
# via requests
11 changes: 11 additions & 0 deletions examples/pip_parse/test.py
@@ -0,0 +1,11 @@
import unittest
import main


class ExampleTest(unittest.TestCase):
    def test_main(self):
        self.assertEqual("2.24.0", main.version())


if __name__ == '__main__':
    unittest.main()
11 changes: 11 additions & 0 deletions python/pip.bzl
@@ -56,6 +56,17 @@ def pip_install(requirements, name = "pip", **kwargs):
        **kwargs
    )

def pip_parse(requirements_lock, name = "pip_parsed_deps", **kwargs):
    # Just in case our dependencies weren't already fetched
    pip_install_dependencies()

    pip_repository(
        name = name,
        requirements_lock = requirements_lock,
        incremental = True,
        **kwargs
    )

def pip_repositories():
    # buildifier: disable=print
    print("DEPRECATED: the pip_repositories rule has been replaced with pip_install, please see rules_python 0.1 release notes")
1 change: 1 addition & 0 deletions python/pip_install/BUILD
@@ -3,6 +3,7 @@ filegroup(
    srcs = glob(["*.bzl"]) + [
        "BUILD",
        "//python/pip_install/extract_wheels:distribution",
        "//python/pip_install/parse_requirements_to_bzl:distribution",
    ],
    visibility = ["//:__pkg__"],
)
Expand Down
28 changes: 6 additions & 22 deletions python/pip_install/extract_wheels/__init__.py
@@ -12,7 +12,7 @@
import sys
import json

-from python.pip_install.extract_wheels.lib import bazel, requirements
+from python.pip_install.extract_wheels.lib import bazel, requirements, arguments


def configure_reproducible_wheels() -> None:
@@ -58,25 +58,7 @@ def main() -> None:
         required=True,
         help="Path to requirements.txt from where to install dependencies",
     )
-    parser.add_argument(
-        "--repo",
-        action="store",
-        required=True,
-        help="The external repo name to install dependencies. In the format '@{REPO_NAME}'",
-    )
-    parser.add_argument(
-        "--extra_pip_args", action="store", help="Extra arguments to pass down to pip.",
-    )
-    parser.add_argument(
-        "--pip_data_exclude",
-        action="store",
-        help="Additional data exclusion parameters to add to the pip packages BUILD file.",
-    )
-    parser.add_argument(
-        "--enable_implicit_namespace_pkgs",
-        action="store_true",
-        help="Disables conversion of implicit namespace packages into pkg-util style packages.",
-    )
+    arguments.parse_common_args(parser)
     args = parser.parse_args()

     pip_args = [sys.executable, "-m", "pip", "--isolated", "wheel", "-r", args.requirements]
@@ -93,10 +75,12 @@ def main() -> None:
     else:
         pip_data_exclude = []

+    repo_label = "@%s" % args.repo
+
     targets = [
         '"%s%s"'
         % (
-            args.repo,
+            repo_label,
             bazel.extract_wheel(
                 whl, extras, pip_data_exclude, args.enable_implicit_namespace_pkgs
             ),
@@ -106,5 +90,5 @@ def main() -> None:

     with open("requirements.bzl", "w") as requirement_file:
         requirement_file.write(
-            bazel.generate_requirements_file_contents(args.repo, targets)
+            bazel.generate_requirements_file_contents(repo_label, targets)
         )
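
The effect of the `repo_label` change can be sketched in isolation. After this diff, `--repo` carries a bare repository name and the `@` is prepended in one place. The snippet below is an illustrative stand-in, not code from the PR: `quoted_targets` is a hypothetical helper, and the `//pypi__<name>` package paths are an assumption about what `bazel.extract_wheel` returns.

```python
from typing import List


def quoted_targets(repo: str, package_paths: List[str]) -> List[str]:
    """Build quoted Bazel labels from a bare repo name and package paths."""
    # Mirrors `repo_label = "@%s" % args.repo` from the diff above.
    repo_label = "@%s" % repo
    # Mirrors the '"%s%s"' % (repo_label, ...) formatting of each target.
    return ['"%s%s"' % (repo_label, path) for path in package_paths]


# Assumed package paths; the real values come from bazel.extract_wheel.
targets = quoted_targets("pip", ["//pypi__requests", "//pypi__idna"])
```

Callers therefore pass `--repo pip` rather than `--repo @pip`, which matches how the new test in this PR uses `repo_name = "foo"`.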
19 changes: 18 additions & 1 deletion python/pip_install/extract_wheels/lib/BUILD
@@ -9,8 +9,12 @@ py_library(
         "purelib.py",
         "requirements.py",
         "wheel.py",
+        "arguments.py",
+    ],
+    visibility = [
+        "//python/pip_install/extract_wheels:__subpackages__",
+        "//python/pip_install/parse_requirements_to_bzl:__subpackages__",
     ],
-    visibility = ["//python/pip_install/extract_wheels:__subpackages__"],
     deps = [
         requirement("pkginfo"),
         requirement("setuptools"),
@@ -41,6 +45,19 @@ py_test(
    ],
)

py_test(
    name = "arguments_test",
    size = "small",
    srcs = [
        "arguments_test.py",
    ],
    tags = ["unit"],
    deps = [
        ":lib",
        "//python/pip_install/parse_requirements_to_bzl:lib",
    ],
)

py_test(
    name = "whl_filegroup_test",
    size = "small",
24 changes: 24 additions & 0 deletions python/pip_install/extract_wheels/lib/arguments.py
@@ -0,0 +1,24 @@
from argparse import ArgumentParser


def parse_common_args(parser: ArgumentParser) -> ArgumentParser:
    parser.add_argument(
        "--repo",
        action="store",
        required=True,
        help="The external repo name to install dependencies. In the format '@{REPO_NAME}'",
    )
    parser.add_argument(
        "--extra_pip_args", action="store", help="Extra arguments to pass down to pip.",
    )
    parser.add_argument(
        "--pip_data_exclude",
        action="store",
        help="Additional data exclusion parameters to add to the pip packages BUILD file.",
    )
    parser.add_argument(
        "--enable_implicit_namespace_pkgs",
        action="store_true",
        help="Disables conversion of implicit namespace packages into pkg-util style packages.",
    )
    return parser
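
The point of factoring these flags out is that more than one entry point can share them. The sketch below shows that pattern in a self-contained form. The `parse_common_args` here is an abbreviated copy (two of the four flags) so the snippet runs on its own, and `build_parser` and its `--requirements_lock` flag are hypothetical stand-ins for a second tool, not code from this PR.

```python
from argparse import ArgumentParser


def parse_common_args(parser: ArgumentParser) -> ArgumentParser:
    # Abbreviated copy of the shared helper: only two of its flags.
    parser.add_argument("--repo", action="store", required=True)
    parser.add_argument("--enable_implicit_namespace_pkgs", action="store_true")
    return parser


def build_parser() -> ArgumentParser:
    # Hypothetical second entry point with one tool-specific flag.
    parser = ArgumentParser(description="illustrative second entry point")
    parser.add_argument("--requirements_lock", action="store", required=True)
    # The shared flags are layered on top of the tool-specific ones.
    return parse_common_args(parser)


args = build_parser().parse_args(
    ["--requirements_lock", "requirements_lock.txt", "--repo", "pip_parsed_deps"]
)
```

Each tool declares only what is unique to it, and a change to a shared flag happens in a single place.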
27 changes: 27 additions & 0 deletions python/pip_install/extract_wheels/lib/arguments_test.py
@@ -0,0 +1,27 @@
import argparse
import json
import unittest

from python.pip_install.extract_wheels.lib import arguments
from python.pip_install.parse_requirements_to_bzl import deserialize_structured_args


class ArgumentsTestCase(unittest.TestCase):
    def test_arguments(self) -> None:
        parser = argparse.ArgumentParser()
        parser = arguments.parse_common_args(parser)
        repo_name = "foo"
        index_url = "--index_url=pypi.org/simple"
        args_dict = vars(parser.parse_args(
            args=["--repo", repo_name, "--extra_pip_args={index_url}".format(index_url=json.dumps({"args": index_url}))]))
        args_dict = deserialize_structured_args(args_dict)
        self.assertIn("repo", args_dict)
        self.assertIn("extra_pip_args", args_dict)
        self.assertEqual(args_dict["pip_data_exclude"], None)
        self.assertEqual(args_dict["enable_implicit_namespace_pkgs"], False)
        self.assertEqual(args_dict["repo"], repo_name)
        self.assertEqual(args_dict["extra_pip_args"], index_url)


if __name__ == "__main__":
    unittest.main()
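
`deserialize_structured_args` is imported by this test, but its body is not part of the diff shown here. Judging only from the assertions above, it appears to JSON-decode selected string arguments and unwrap their `"args"` key, leaving unset values as `None`. The following is a sketch under that assumption: `decode_structured_args` and `STRUCTURED_ARGS` are hypothetical names, and the real function may differ.

```python
import json
from typing import Any, Dict

# Assumed set of arguments that arrive JSON-encoded as {"args": ...}.
STRUCTURED_ARGS = ("extra_pip_args", "pip_data_exclude")


def decode_structured_args(args: Dict[str, Any]) -> Dict[str, Any]:
    """Unwrap JSON-encoded arguments in place and return the same dict."""
    for name in STRUCTURED_ARGS:
        if args.get(name) is not None:
            # '{"args": <value>}' becomes <value>.
            args[name] = json.loads(args[name])["args"]
    return args


raw = {
    "repo": "foo",
    "extra_pip_args": json.dumps({"args": "--index_url=pypi.org/simple"}),
    "pip_data_exclude": None,
}
decoded = decode_structured_args(raw)
```

Wrapping values in a JSON object lets a single command-line string carry a list or a nested structure without any escaping rules beyond JSON itself.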