8000 feat(bzlmod)!: Calling pip multiple times allowing for multiple Python versions by chrislovecnm · Pull Request #1254 · bazel-contrib/rules_python · GitHub
[go: up one dir, main page]

Skip to content

feat(bzlmod)!: Calling pip multiple times allowing for multiple Python versions #1254

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jun 15, 2023

Conversation

chrislovecnm
Copy link
Contributor
@chrislovecnm chrislovecnm commented Jun 6, 2023

Implementing the capability to call pip bzlmod extension multiple times. We can now call the pip extension with the same hub name multiple times. This allows a user to have multiple different requirement files based on the Python version.

With workspace, we need incompatible aliases because you get @pip//tabulate or @pip_tabulate//:pkg.
The requirement() macro hides this, but changing the flag becomes a breaking change if you've manually referenced things. With bzlmod, though, the @pip_tabulate style isn't a realistic option (you'd have to use_repo everything, which is a huge pain).

So we have chosen to have @pip//tabulate.

This commit implements that capability for pip.parse to determine the Python version from programmatically
the provided interpreter.

See

https://github.com/bazelbuild/rules_python/blob/76aab0f91bd614ee1b2ade310baf28c38695c522/python/extensions/pip.bzl#L88

For more in-depth implementation details.

INTERFACE CHANGE::

  • pip.parse argument python_version is introduced but not required. When possible, it is
    inferred.

BREAKING CHANGE:

  • pip.parse() extension: the name attribute is renamed hub_name. This is to reflect
    that the user-provided name isn't unique to each pip.parse() call. We now have a hub that is created, and each
    pip.parse call creates a spoke.
  • pip.parse() extension: the incompatible_generate_aliases arg is removed; behavior
    has changed to assume it is always True.
  • pip.parse() calls are collected under the same hub_name to support multiple Python versions under the same resulting repo name (the hub name0.

Close: #1229

@chrislovecnm chrislovecnm force-pushed the pip branch 4 times, most recently from 048ad62 to 2c58e08 Compare June 6, 2023 22:11
@chrislovecnm chrislovecnm marked this pull request as draft June 6, 2023 22:12
@rickeylev
Copy link
Collaborator

couple quick comments (only skimmed half of it thus far):

We need to keep a non-versioned compile_pip_requirements() call around. It's testing that the non-versioned variant of the rule, the one that uses the default python version, is working. Similarly, we'll want a non-versioned usage in the module config.

I see the pip extension is loading the hub? Originally we decided to avoid doing that, but I'm not strongly tied to that idea. Loading it will probably make life easier and we could mostly avoid having the interpreter extension as an intermediary if do, I think.

@rickeylev rickeylev removed the request for review from hrfuller June 6, 2023 23:07
@rickeylev rickeylev changed the title feat(bzlmod): Calling pip multiple times feat(bzlmod)!: Calling pip multiple times Jun 6, 2023
@chrislovecnm
Copy link
Contributor Author

@rickeylev we have a non versioned requirement call here: https://github.com/bazelbuild/rules_python/blob/28e15c2092a9512064e4b2a97a0fa3d1a83bc5ae/examples/bzlmod_build_file_generation/BUILD.bazel#L18

In regards to the hub dependency. We need to pass through the python default version, so I put that in the hub. I can create a different repo rule and put it in there if you like.

Also, any idea for getting the python version from the interpreter? I could not figure out how to map that.

@rickeylev
Copy link
Collaborator

In regards to the hub dependency. We need to pass through the python default version, so I put that in the hub. I can create a different repo rule and put it in there if you like.

I think we can leave it in the hub? I don't see a downside. The hub being loaded is an internal detail, so it doesn't change our API surface. The hub is small, so loading it is cheap. While the hub has some label references to the platform repos, loading the hub doesn't cause the platform repos to load, so its still cheap.

Also, any idea for getting the python version from the interpreter? I could not figure out how to map that.

Why does that need to be looked up? It'd be simple to put a label->version mapping in the hub, but I don't see why we'd use it.

@rickeylev rickeylev marked this pull request as ready for review June 7, 2023 03:52
Copy link
Collaborator
@rickeylev rickeylev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something I didn't see (it must be hidden in the pip repo rules somewhere?) is where the python-version-specific-ness of the whl_library repos are. How is that being done?

I was expecting to see something that generated a repo named e.g. "pip_tabulate_3_10", and it would have all the tabulate files in it for Python 3.10. And then "pip_tabuleate_3_9" for 3.9 etc.

But i don't see code that puts the version in any repo name. But the pip alias repo is using select() to dispatch to other repos (right?). Any idea how things are wired together in this regard?

@rickeylev
Copy link
Collaborator

[rickeylev] how is the python version-ness of whl_library working?

Ah I think I see it now: in extensions/pip.bzl, the version gets put into the pip_name variable.

@chrislovecnm
Copy link
Contributor Author

Why does that need to be looked up? It'd be simple to put a label->version mapping in the hub, but I don't see why we'd use it.

if we could look it up the user would not need to pass in a version. We need the version for creating the wheel aliases so we can do the select statement that joins hub -> pip alias -> pip

I thought it was easy, but I need to turn a canonical label back to its original label. I could not figure it out.

Copy link
Collaborator
@rickeylev rickeylev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I wonder if part of the solution to the multi-pip issue is to not load e.g. @pip_39//:requirements.bzl, but instead load @pip//3.9:requirements.bzl. Similar to what is done with @python_aliases.

@rickeylev
Copy link
Collaborator

I was going to ask to remove the other bzlmod template because it's unused...but no, I don't think it actually is?

So, just to make sure I understand what's going on here. Please correct me where I'm wrong.

  • @pip is the bzlmod pip hub. It has aliases to the whl alias hubs (@pip_pylint).
  • @pip_{38,39} are the regular pip "aliases" hubs (i.e. the incompatible_generate_aliases=True thing). Each points to their particular @pip_{ver}_pylint repo. I think these are only of interest because they have a working entry point function?
  • @pip_pylint is the whl alias hub. It has select-based aliases pointing to the @pip_{ver}_pylint repos.
  • @pip_38_pylint is a whl_library. Finally, something concrete.

Is the whl alias intermediary really necessary? I don't see why the bzlmod pip hub can't point directly to the underlying wheels. I'm fine with addressing this in a separate PR.

I think the regular pip hubs, e.g. @pip_38, are only around because they have a working entry point function, right? That's fine if so; just making sure I understand why they're here.

@chrislovecnm
Copy link
Contributor Author
chrislovecnm commented Jun 13, 2023

Something I didn't see (it must be hidden in the pip repo rules somewhere?) is where the python-version-specific-ness of the whl_library repos are. How is that being done?

I was expecting to see something that generated a repo named e.g. "pip_tabulate_3_10", and it would have all the tabulate files in it for Python 3.10. And then "pip_tabuleate_3_9" for 3.9 etc.

But i don't see code that puts the version in any repo name. But the pip alias repo is using select() to dispatch to other repos (right?). Any idea how things are wired together in this regard?

So because I am reusing how pip and multiple work, we are getting those repositories. We have a pip-hub -> pip-whl-alias (which has the selects) -> pip versioned whl.

cc: @rickeylev

@chrislovecnm
Copy link
Contributor Author
chrislovecnm commented Jun 13, 2023

I was going to ask to remove the other bzlmod template because it's unused...but no, I don't think it actually is?

So, just to make sure I understand what's going on here. Please correct me where I'm wrong.

  • @pip is the bzlmod pip hub. It has aliases to the whl alias hubs (@pip_pylint).
  • @pip_{38,39} are the regular pip "aliases" hubs (i.e. the incompatible_generate_aliases=True thing). Each points to their particular @pip_{ver}_pylint repo. I think these are only of interest because they have a working entry point function?
  • @pip_pylint is the whl alias hub. It has select-based aliases pointing to the @pip_{ver}_pylint repos.
  • @pip_38_pylint is a whl_library. Finally, something concrete.

Is the whl alias intermediary really necessary? I don't see why the bzlmod pip hub can't point directly to the underlying wheels. I'm fine with addressing this in a separate PR.

I think the regular pip hubs, e.g. @pip_38, are only around because they have a working entry point function, right? That's fine if so; just making sure I understand why they're here

We can probably improve the code more, but I would like to do that by iterating on the code. This is how, currently, multiple Python versions work, with the addition of a hub. So I wanted to keep the code as close as possible.

We need a hub -> alias (with the selects) -> Python versioned pip. I'm guessing that we could combine these some to reduce the number of repos, but I wanted to get something out that I can guarantee works. Just getting this working took a bunch of time.

Take a read of the documentation that I updated here: https://github.com/bazelbuild/rules_python/blob/76aab0f91bd614ee1b2ade310baf28c38695c522/python/extensions/pip.bzl#L88

cc: @rickeylev

@chrislovecnm chrislovecnm force-pushed the pip branch 5 times, most recently from 00e7ef8 to a60f26c Compare June 13, 2023 19:50
Implementing the capability to call pip bzlmod extension multiple times.
We can now call the pip extension with the same hub name multiple times.
This allows a user to have multiple different requirement files based on the Python version.

feat(bzlmod): removing incompatible aliases flag for bzlmod

With workspace, we need incompible aliases, because you get @pip//tabulate or @pip_tabulate//:pkg.
The requirement() macro hides this, but if you're manually referencing things, then changing the
flag ends up being a breaking change With bzlmod, though, the @pip_tabulate style isn't a
realistic option (you'd have to use_repo everything, a huge pain).

So we have choosen to always have `@pip//tabulate`.

feat(bzlmod): Implement automatic python version detection

This commit implements that capability for pip.parse to
programtically determine the python version from
the provided interpreter.
@chrislovecnm chrislovecnm changed the title feat(bzlmod)!: Calling pip multiple times feat(bzlmod)!: Calling pip multiple times allowing for multiple Python versions Jun 13, 2023
@chrislovecnm chrislovecnm added the type: bzlmod bzlmod work label Jun 13, 2023
@chrislovecnm
Copy link
Contributor Author

Hm, I wonder if part of the solution to the multi-pip issue is to not load e.g. @pip_39//:requirements.bzl, but instead load @pip//3.9:requirements.bzl. Similar to what is done with @python_aliases.

For what @rickeylev? For entrypoint?

@chrislovecnm
Copy link
Contributor Author

@aignas do you mind taking a look? This is another big PR and I would love another set of eyes.

Copy link
Collaborator
@aignas aignas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my first pass, everything looks reasonable, will take a look later once more but feel free to merge before I do so.

- @@rules_python~override~pip~pip_310

The different spoke names are a combination of the hub_name and the Python version.
In the future we may remove this repository, but we do not support endpoints
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is endpoints?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo, it should be "entry points"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

requirements_lock = attr.requirements_lock,
incompatible_generate_aliases = attr.incompatible_generate_aliases,
)
for pip_attr in mod.tags.parse:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great to move some of this code into a separate file and keep your long docstring next to it. The user of this extension does not necessarily ned to know all of the info about these complexities.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine. The doc string is on the impl function, no the doc-arg of the extension_module() call, so it won't show up to users unless they go looking through the source.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, another reason for splitting could be the indentation level of the code below, as it is getting a quite left-padded, but it's fine to merge as is.

Copy link
Collaborator
@rickeylev rickeylev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

impl mostly looks good, but a couple questions still

def entry_point(pkg, script = None):
"""entry_point returns the target of the canonical label of the package entrypoints.
"""
print("not implemented")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
print("not implemented")
# TODO: https://github.com/bazelbuild/rules_python/issues/1262
print("not implemented")

- @@rules_python~override~pip~pip_310

The different spoke names are a combination of the hub_name and the Python version.
In the future we may remove this repository, but we do not support endpoints
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo, it should be "entry points"

requirements_lock = attr.requirements_lock,
incompatible_generate_aliases = attr.incompatible_generate_aliases,
)
for pip_attr in mod.tags.parse:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine. The doc string is on the impl function, no the doc-arg of the extension_module() call, so it won't show up to users unless they go looking through the source.

# interpreter, because of the standard naming that we use when
# downloading Python binaries.
python_version = ""
if pip_attr.python_version == "" and pip_attr.python_interpreter_target != "":
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pip_attr.python_interpreter_target != "" expression will always be true -- the input is a label, which is either a Label object or None, neither of which will be equal to "".

I think you meant != None?

if DEFAULT_PYTHON_VERSION not in version_map:
fail(
"""
Unable to find the default python version in the version map, please update your requirements files
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check here will be problematic, but we can deal with it in a followup PR.

The two cases I'm think of are:

If the root module sets the default to X, and a submodule is only configured for Y, then everything breaks.

When rules_python upgrades the default version, everyone's pip.parse calls will break.

What we'll need is a way to have pip.parse use the default python. I've filed this with more thoughts: #1267

incompatible_generate_aliases = attr.incompatible_generate_aliases,
)
for pip_attr in mod.tags.parse:
# We can automatically dertermine the Python version of an
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not against inferring the version from the name per se (we're matching our internal names, so its fine), and I'm OK with this right now.

That said, I'm leaning more and more towards the idea of using python_version to act as the "driver" of things.

  • (Ignas's idea) We can easily map the version to the backing repository and interpreter. Now users don't need to specify the interpreter target or use the interpreter extension.
  • With Phil's work to uncouple the pip dependency processing from the interpreter used, we would no longer need the interpreter, but would still need the python version.
  • Allowing multi-version requirements files using env markers requires us to know the intended python version; knowing the interpreter doesn't really help.

Anyways, fine for now. I filed #1268 for more thoughts here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not against inferring the version from the name per se (we're matching our internal names, so its fine), and I'm OK with this right now.

That said, I'm leaning more and more towards the idea of using python_version to act as the "driver" of things.

  • (Ignas's idea) We can easily map the version to the backing repository and interpreter. Now users don't need to specify the interpreter target or use the interpreter extension.
  • With Phil's work to uncouple the pip dependency processing from the interpreter used, we would no longer need the interpreter, but would still need the python version.
  • Allowing multi-version requirements files using env markers requires us to know the intended python version; knowing the interpreter doesn't really help.

Anyways, fine for now. I filed #1268 for more thoughts here.

Not having the interpreter repo may cause some challenges with the dependencies. We would need to test it. We could also write out the sym link in the python_aliases repo.

There are use cases where the user won’t want an interpreter, so I am wondering how we do that.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not having the interpreter repo may cause some challenges with the dependencies.

How so?

There are use cases where the user won’t want an interpreter, so I am wondering how we do that.

I'm guessing you mean a user wanting to use a local system python? The solution for that is for the user to specify the interpreter using the pip.parse python_interpreter arg (this accepts a path to an arbitrary interpreter).

If you mean something else, I'm not sure what you mean -- an interpreter is required in some fashion because we have to eventually run various commands using it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interfere with some of the repositories loading properly.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow, but lets continue discussion on #1268

for version in MINOR_MAPPING.keys():
# This will create a value like
# _3_10_x86_64-unknown-linux-gnu
python_platform_version = "_{version}_{platform}".format(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We control the repo name, so lets just use its full apparent name when matching.

Suggested change
python_platform_version = "_{version}_{platform}".format(
python_platform_version = "python_{version}_{platform}".format(

We are removing the interpreter extension and adding the
funcationality into pip.  A user still can provide the
interpreter_target label to override this funcationality.
@chrislovecnm chrislovecnm added this pull request to the merge queue Jun 15, 2023
Merged via the queue into bazel-contrib:main with commit 1c58124 Jun 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: bzlmod bzlmod work
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bzlmod Multiple Python Versions
3 participants
0