-
-
Notifications
You must be signed in to change notification settings - Fork 598
feat(bzlmod)!: Calling pip multiple times allowing for multiple Python versions #1254
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
048ad62
to
2c58e08
Compare
couple quick comments (only skimmed half of it thus far): We need to keep a non-versioned compile_pip_requirements() call around. It's testing that the non-versioned variant of the rule, the one that uses the default python version, is working. Similarly, we'll want a non-versioned usage in the module config. I see the pip extension is loading the hub? Originally we decided to avoid doing that, but I'm not strongly tied to that idea. Loading it will probably make life easier and we could mostly avoid having the interpreter extension as an intermediary if do, I think. |
@rickeylev we have a non versioned requirement call here: https://github.com/bazelbuild/rules_python/blob/28e15c2092a9512064e4b2a97a0fa3d1a83bc5ae/examples/bzlmod_build_file_generation/BUILD.bazel#L18 In regards to the hub dependency. We need to pass through the python default version, so I put that in the hub. I can create a different repo rule and put it in there if you like. Also, any idea for getting the python version from the interpreter? I could not figure out how to map that. |
I think we can leave it in the hub? I don't see a downside. The hub being loaded is an internal detail, so it doesn't change our API surface. The hub is small, so loading it is cheap. While the hub has some label references to the platform repos, loading the hub doesn't cause the platform repos to load, so its still cheap.
Why does that need to be looked up? It'd be simple to put a label->version mapping in the hub, but I don't see why we'd use it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something I didn't see (it must be hidden in the pip repo rules somewhere?) is where the python-version-specific-ness of the whl_library
repos are. How is that being done?
I was expecting to see something that generated a repo named e.g. "pip_tabulate_3_10", and it would have all the tabulate files in it for Python 3.10. And then "pip_tabuleate_3_9" for 3.9 etc.
But i don't see code that puts the version in any repo name. But the pip alias repo is using select() to dispatch to other repos (right?). Any idea how things are wired together in this regard?
python/pip_install/multiple_pip_repository_requirements_bzlmod.bzl.tmpl
Outdated
Show resolved
Hide resolved
Ah I think I see it now: in extensions/pip.bzl, the version gets put into the |
if we could look it up the user would not need to pass in a version. We need the version for creating the wheel aliases so we can do the select statement that joins hub -> pip alias -> pip I thought it was easy, but I need to turn a canonical label back to its original label. I could not figure it out. |
584927d
to
5a96521
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, I wonder if part of the solution to the multi-pip issue is to not load e.g. @pip_39//:requirements.bzl
, but instead load @pip//3.9:requirements.bzl
. Similar to what is done with @python_aliases
.
I was going to ask to remove the other bzlmod template because it's unused...but no, I don't think it actually is? So, just to make sure I understand what's going on here. Please correct me where I'm wrong.
Is the whl alias intermediary really necessary? I don't see why the bzlmod pip hub can't point directly to the underlying wheels. I'm fine with addressing this in a separate PR. I think the regular pip hubs, e.g. |
So because I am reusing how pip and multiple work, we are getting those repositories. We have a pip-hub -> pip-whl-alias (which has the selects) -> pip versioned whl. cc: @rickeylev |
We can probably improve the code more, but I would like to do that by iterating on the code. This is how, currently, multiple Python versions work, with the addition of a hub. So I wanted to keep the code as close as possible. We need a hub -> alias (with the selects) -> Python versioned pip. I'm guessing that we could combine these some to reduce the number of repos, but I wanted to get something out that I can guarantee works. Just getting this working took a bunch of time. Take a read of the documentation that I updated here: https://github.com/bazelbuild/rules_python/blob/76aab0f91bd614ee1b2ade310baf28c38695c522/python/extensions/pip.bzl#L88 cc: @rickeylev |
python/pip_install/multiple_pip_repository_requirements_bzlmod.bzl.tmpl
Outdated
Show resolved
Hide resolved
00e7ef8
to
a60f26c
Compare
Implementing the capability to call pip bzlmod extension multiple times. We can now call the pip extension with the same hub name multiple times. This allows a user to have multiple different requirement files based on the Python version. feat(bzlmod): removing incompatible aliases flag for bzlmod With workspace, we need incompible aliases, because you get @pip//tabulate or @pip_tabulate//:pkg. The requirement() macro hides this, but if you're manually referencing things, then changing the flag ends up being a breaking change With bzlmod, though, the @pip_tabulate style isn't a realistic option (you'd have to use_repo everything, a huge pain). So we have choosen to always have `@pip//tabulate`. feat(bzlmod): Implement automatic python version detection This commit implements that capability for pip.parse to programtically determine the python version from the provided interpreter.
For what @rickeylev? For entrypoint? |
@aignas do you mind taking a look? This is another big PR and I would love another set of eyes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my first pass, everything looks reasonable, will take a look later once more but feel free to merge before I do so.
python/extensions/pip.bzl
Outdated
- @@rules_python~override~pip~pip_310 | ||
|
||
The different spoke names are a combination of the hub_name and the Python version. | ||
In the future we may remove this repository, but we do not support endpoints |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is endpoints?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo, it should be "entry points"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/bazelbuild/rules_python/blob/main/examples/bzlmod/entry_point/test_entry_point.py See that. I am not real familiar with them.
requirements_lock = attr.requirements_lock, | ||
incompatible_generate_aliases = attr.incompatible_generate_aliases, | ||
) | ||
for pip_attr in mod.tags.parse: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be great to move some of this code into a separate file and keep your long docstring next to it. The user of this extension does not necessarily ned to know all of the info about these complexities.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's fine. The doc string is on the impl function, no the doc-arg of the extension_module() call, so it won't show up to users unless they go looking through the source.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, another reason for splitting could be the indentation level of the code below, as it is getting a quite left-padded, but it's fine to merge as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
impl mostly looks good, but a couple questions still
def entry_point(pkg, script = None): | ||
"""entry_point returns the target of the canonical label of the package entrypoints. | ||
""" | ||
print("not implemented") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
print("not implemented") | |
# TODO: https://github.com/bazelbuild/rules_python/issues/1262 | |
print("not implemented") |
python/extensions/pip.bzl
Outdated
- @@rules_python~override~pip~pip_310 | ||
|
||
The different spoke names are a combination of the hub_name and the Python version. | ||
In the future we may remove this repository, but we do not support endpoints |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo, it should be "entry points"
requirements_lock = attr.requirements_lock, | ||
incompatible_generate_aliases = attr.incompatible_generate_aliases, | ||
) | ||
for pip_attr in mod.tags.parse: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's fine. The doc string is on the impl function, no the doc-arg of the extension_module() call, so it won't show up to users unless they go looking through the source.
python/extensions/pip.bzl
Outdated
# interpreter, because of the standard naming that we use when | ||
# downloading Python binaries. | ||
python_version = "" | ||
if pip_attr.python_version == "" and pip_attr.python_interpreter_target != "": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The pip_attr.python_interpreter_target != ""
expression will always be true -- the input is a label, which is either a Label object or None, neither of which will be equal to "".
I think you meant != None
?
if DEFAULT_PYTHON_VERSION not in version_map: | ||
fail( | ||
""" | ||
Unable to find the default python version in the version map, please update your requirements files |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check here will be problematic, but we can deal with it in a followup PR.
The two cases I'm think of are:
If the root module sets the default to X, and a submodule is only configured for Y, then everything breaks.
When rules_python upgrades the default version, everyone's pip.parse calls will break.
What we'll need is a way to have pip.parse use the default python. I've filed this with more thoughts: #1267
python/extensions/pip.bzl
Outdated
incompatible_generate_aliases = attr.incompatible_generate_aliases, | ||
) | ||
for pip_attr in mod.tags.parse: | ||
# We can automatically dertermine the Python version of an |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not against inferring the version from the name per se (we're matching our internal names, so its fine), and I'm OK with this right now.
That said, I'm leaning more and more towards the idea of using python_version
to act as the "driver" of things.
- (Ignas's idea) We can easily map the version to the backing repository and interpreter. Now users don't need to specify the interpreter target or use the interpreter extension.
- With Phil's work to uncouple the pip dependency processing from the interpreter used, we would no longer need the interpreter, but would still need the python version.
- Allowing multi-version requirements files using env markers requires us to know the intended python version; knowing the interpreter doesn't really help.
Anyways, fine for now. I filed #1268 for more thoughts here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not against inferring the version from the name per se (we're matching our internal names, so its fine), and I'm OK with this right now.
That said, I'm leaning more and more towards the idea of using
python_version
to act as the "driver" of things.
- (Ignas's idea) We can easily map the version to the backing repository and interpreter. Now users don't need to specify the interpreter target or use the interpreter extension.
- With Phil's work to uncouple the pip dependency processing from the interpreter used, we would no longer need the interpreter, but would still need the python version.
- Allowing multi-version requirements files using env markers requires us to know the intended python version; knowing the interpreter doesn't really help.
Anyways, fine for now. I filed #1268 for more thoughts here.
Not having the interpreter repo may cause some challenges with the dependencies. We would need to test it. We could also write out the sym link in the python_aliases repo.
There are use cases where the user won’t want an interpreter, so I am wondering how we do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not having the interpreter repo may cause some challenges with the dependencies.
How so?
There are use cases where the user won’t want an interpreter, so I am wondering how we do that.
I'm guessing you mean a user wanting to use a local system python? The solution for that is for the user to specify the interpreter using the pip.parse python_interpreter
arg (this accepts a path to an arbitrary interpreter).
If you mean something else, I'm not sure what you mean -- an interpreter is required in some fashion because we have to eventually run various commands using it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interfere with some of the repositories loading properly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't follow, but lets continue discussion on #1268
python/extensions/pip.bzl
Outdated
for version in MINOR_MAPPING.keys(): | ||
# This will create a value like | ||
# _3_10_x86_64-unknown-linux-gnu | ||
python_platform_version = "_{version}_{platform}".format( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We control the repo name, so lets just use its full apparent name when matching.
python_platform_version = "_{version}_{platform}".format( | |
python_platform_version = "python_{version}_{platform}".format( |
We are removing the interpreter extension and adding the funcationality into pip. A user still can provide the interpreter_target label to override this funcationality.
Implementing the capability to call pip bzlmod extension multiple times. We can now call the pip extension with the same hub name multiple times. This allows a user to have multiple different requirement files based on the Python version.
With workspace, we need incompatible aliases because you get @pip//tabulate or @pip_tabulate//:pkg.
The requirement() macro hides this, but changing the flag becomes a breaking change if you've manually referenced things. With bzlmod, though, the @pip_tabulate style isn't a realistic option (you'd have to use_repo everything, which is a huge pain).
So we have chosen to have
@pip//tabulate
.This commit implements that capability for pip.parse to determine the Python version from programmatically
the provided interpreter.
See
https://github.com/bazelbuild/rules_python/blob/76aab0f91bd614ee1b2ade310baf28c38695c522/python/extensions/pip.bzl#L88
For more in-depth implementation details.
INTERFACE CHANGE::
inferred.
BREAKING CHANGE:
pip.parse()
extension: thename
attribute is renamedhub_name
. This is to reflectthat the user-provided name isn't unique to each
pip.parse()
call. We now have a hub that is created, and eachpip.parse call creates a spoke.
pip.parse()
extension: theincompatible_generate_aliases
arg is removed; behaviorhas changed to assume it is always True.
pip.parse()
calls are collected under the samehub_name
to support multiple Python versions under the same resulting repo name (the hub name0.Close: #1229