Bazel breaks python import with __init__.py · Issue #55 · bazel-contrib/rules_python · GitHub
Bazel breaks python import with __init__.py #55
Closed
@mouadino

System information:

  • Bazel version: 0.9.0
  • Python 3.6
  • Operating System: Debian 9.0
  • Context Runner: Docker

Description

I have a few examples of how Bazel introduces __init__.py files in places where there were none, and by doing so breaks Python imports, e.g.

Python dependencies that are namespace packages.

An example is the zope.interface package, which is a dependency of the pyramid library (the same goes for zope.deprecation). Both when running our tests and when running the code itself, it fails with ModuleNotFoundError: No module named 'zope.interface'.

I spent some time investigating this issue, and the investigation led to two conclusions:

  • Bazel introduces __init__.py files and breaks namespacing
  • Python namespace wheels installed in non-site directories do not work

Let's look at each of them in detail:

Bazel introduces __init__.py files and breaks namespacing

Looking at what the wheel contains (using unzip -l ....whl) and at how the directory is laid out in .cache/bazel/<...>/external/pypi__zope_interface_4_4_0/, we see the same structure:

+ zope
    | + interface
          | - __init__.py
          | - _compat.py
          | ....
- zope.interface-4.4.0-py3.6-nspkg.pth
+ zope.interface-4.4.0.dist-info

However, if you look inside the .runfiles of a py_binary or py_test rule, you will see some __init__.py files being introduced, e.g. in ls bazel-out/k8-fastbuild/bin/<some-repository-path>/default_test.runfiles/pypi__zope_interface_4_4_0/:

- __init__.py    <<<<<<<<<<<<<<<<<<
+ zope
    | - __init__.py    <<<<<<<<<<<<<<<<<<
    | + interface
          | - __init__.py
          | - _compat.py
          | ....
- zope.interface-4.4.0-py3.6-nspkg.pth
+ zope.interface-4.4.0.dist-info

Those __init__.py files break the logic inside the .pth file that makes sure a namespace package is created.
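To make the failure mode concrete, here is a minimal sketch in Python. It is an assumption-laden illustration, not what Bazel or the wheel actually runs: it relies on implicit namespace packages (PEP 420) instead of the nspkg.pth mechanism, and the directory names and the helpers make_wheel_dir and can_import_both are hypothetical. It lays out two wheel-like directories that both contribute to zope, then checks whether both subpackages import with and without a zope/__init__.py.

```python
import os
import subprocess
import sys
import tempfile


def make_wheel_dir(root, name, subpackage, with_init):
    """Create <root>/<name>/zope/<subpackage>/__init__.py (hypothetical layout)."""
    pkg_dir = os.path.join(root, name, "zope", subpackage)
    os.makedirs(pkg_dir)
    open(os.path.join(pkg_dir, "__init__.py"), "w").close()
    if with_init:
        # Simulates the extra empty file that appears in the runfiles tree.
        open(os.path.join(root, name, "zope", "__init__.py"), "w").close()
    return os.path.join(root, name)


def can_import_both(with_init):
    """Return True if zope.interface and zope.deprecation both import."""
    with tempfile.TemporaryDirectory() as root:
        dirs = [
            make_wheel_dir(root, "pypi__zope_interface", "interface", with_init),
            make_wheel_dir(root, "pypi__zope_deprecation", "deprecation", with_init),
        ]
        env = dict(os.environ, PYTHONPATH=os.pathsep.join(dirs))
        # -S skips site processing so an installed zope cannot interfere.
        result = subprocess.run(
            [sys.executable, "-S", "-c", "import zope.interface, zope.deprecation"],
            env=env,
            stderr=subprocess.DEVNULL,
        )
        return result.returncode == 0


if __name__ == "__main__":
    print("without zope/__init__.py:", can_import_both(False))
    print("with zope/__init__.py:   ", can_import_both(True))
```

With the extra file present, the first zope directory on the path becomes a regular package, so the zope portion in the second directory is never consulted.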

Python namespace wheels installed in non-site directories do not work

That is not the whole story: even if you remove the __init__.py files, namespace packages like zope.interface still will not work as they should. They rely on the .pth files being executed, which patches the __path__ of the given package's module (and on no __init__.py file existing in the subdirectory, hence the first issue). However, .pth files (like zope.interface-4.4.0-py3.6-nspkg.pth) are only parsed if the enclosing directory is considered a site directory, as mentioned in the documentation here:

A path configuration file is a file whose name has the form name.pth and exists in one of the four directories mentioned above; ...

This refers to the preceding paragraph, which says:

It starts by constructing up to four directories from a head and a tail part. For the head part, it uses sys.prefix and sys.exec_prefix; empty heads are skipped. For the tail part, it uses the empty string and then lib/site-packages (on Windows) or lib/pythonX.Y/site-packages (on Unix and Macintosh)

Long story short, .pth files are not parsed just because a directory is added to PYTHONPATH, which is what the runner does here. Instead, you would want to add something like:

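import os
import site

# python_imports is the colon-separated import path string built by the runner.
for import_dir in python_imports.split(':'):
    if any(fname.endswith('.pth') for fname in os.listdir(import_dir)):
        site.addsitedir(import_dir)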
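The difference between a plain path entry and a site directory can be checked with a small self-contained sketch. The file name demo.pth, the directory names, and the helper pth_effect are made up for illustration. The .pth file here only lists an extra directory instead of running namespace setup code, but it goes through the same site.addsitedir processing.

```python
import os
import site
import sys
import tempfile


def pth_effect():
    """Return (seen after sys.path.append, seen after site.addsitedir)."""
    with tempfile.TemporaryDirectory() as root:
        wheel_dir = os.path.join(root, "wheel_dir")
        extra_dir = os.path.join(root, "extra")
        os.makedirs(wheel_dir)
        os.makedirs(extra_dir)
        # A .pth file whose single line names another directory to add.
        with open(os.path.join(wheel_dir, "demo.pth"), "w") as f:
            f.write(extra_dir + "\n")

        saved = list(sys.path)
        try:
            # Equivalent to listing the directory in PYTHONPATH: no .pth handling.
            sys.path.append(wheel_dir)
            after_append = extra_dir in sys.path
            # Treating it as a site directory makes Python read demo.pth.
            site.addsitedir(wheel_dir)
            after_addsitedir = extra_dir in sys.path
        finally:
            sys.path[:] = saved
        return after_append, after_addsitedir


if __name__ == "__main__":
    print(pth_effect())
```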

Running tests with pytest

We are using pytest as a test runner, but for some of our existing code, switching to Bazel breaks the tests. The reason is again the introduced __init__.py files.

To give more details, our monorepo has the following structure:

+ services
   | + some_api
        | - BUILD.bazel
          - setup.py
          + services
             | - __init__.py
               - some_code.py
         
+ libraries
    | + python
        | + some_library
           | - setup.py
             - BUILD.bazel
             + shared

Inside BUILD.bazel we have rules that run the tests by calling a Python script that uses pytest. Again, imports are broken in some of our code that relies on import side effects (not good, but hard to change), because the same Python modules get imported twice with different __name__ values: once as libraries.python.some_library... and once as some_library.... This is partly due to pytest's autodiscovery magic, but mostly because Bazel introduces __init__.py files in the libraries/ and libraries/python/ folders, which sadly confuses pytest.
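The double import can be reproduced without pytest or Bazel. The sketch below is only an approximation of the situation described above: it rebuilds the libraries/python/some_library layout in a temporary directory with an __init__.py at every level, puts both the root and libraries/python on sys.path, and imports the same file under both names. The module shared.py, its REGISTRY variable, and the helper import_twice are hypothetical.

```python
import importlib
import os
import sys
import tempfile


def import_twice():
    """Import one file under two names; return both names and an identity check."""
    with tempfile.TemporaryDirectory() as root:
        pkg_dir = os.path.join(root, "libraries", "python", "some_library")
        os.makedirs(pkg_dir)
        # __init__.py at every level, as in the runfiles tree.
        for parts in (("libraries",), ("libraries", "python"),
                      ("libraries", "python", "some_library")):
            open(os.path.join(root, *parts, "__init__.py"), "w").close()
        with open(os.path.join(pkg_dir, "shared.py"), "w") as f:
            f.write("REGISTRY = []\n")

        saved_path = list(sys.path)
        saved_modules = set(sys.modules)
        try:
            sys.path[:0] = [root, os.path.join(root, "libraries", "python")]
            importlib.invalidate_caches()
            long_mod = importlib.import_module("libraries.python.some_library.shared")
            short_mod = importlib.import_module("some_library.shared")
            return long_mod.__name__, short_mod.__name__, long_mod is short_mod
        finally:
            # Undo the global changes so the sketch leaves no trace.
            sys.path[:] = saved_path
            for name in set(sys.modules) - saved_modules:
                del sys.modules[name]


if __name__ == "__main__":
    print(import_twice())
```

Because the two module objects are distinct, any import-time side effect (such as filling REGISTRY) runs once per name.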

Workarounds

So far, we are working around the latter issue by misusing the --run_under Bazel flag to patch .runfiles and delete the __init__.py files in the places where there shouldn't be any.

To give an idea, here is the snippet we have in the script passed to --run_under:

for entry_path in "$RUNFILES_DIR"/pypi__*; do
  [ -d "$entry_path" ] || continue
  entry=$(basename "$entry_path")

  # A top-level *.pth file marks the wheel as a namespace package.
  python_ns_package='no'
  for pth in "$entry_path"/*.pth; do
    if [ -e "$pth" ]; then
      python_ns_package='yes'
      break
    fi
  done

  if [[ "$python_ns_package" == 'yes' ]]; then
    # Remove the __init__.py files that were added at the top two levels.
    rm -f "$entry_path/__init__.py"
    for subentry in "$entry_path"/*; do
      if [ -f "$subentry/__init__.py" ]; then
        rm -f "$subentry/__init__.py"
      fi
    done

    # Make the test entry point treat the wheel directory as a site directory.
    if [ -n "$SRV_NAME" ]; then
      sed -i "\|#!/usr/bin/env python|a import site; site.addsitedir('$entry')" "$RUNFILES_DIR/__main__/services/$SRV_NAME/service_test"
    elif [ -n "$LIB_NAME" ]; then
      sed -i "\|#!/usr/bin/env python|a import site; site.addsitedir('$entry')" "$RUNFILES_DIR/__main__/libraries/python/$LIB_NAME/default_test"
    fi
  fi
done
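For comparison, the cleanup half of that script could be sketched in Python as follows. This is an illustrative rewrite rather than the code we run: the function name prune_namespace_inits is hypothetical, and it covers only the deletion of __init__.py files, not the sed patching of the test entry point.

```python
import os


def prune_namespace_inits(runfiles_dir):
    """Delete __init__.py at the top two levels of pypi__* dirs that ship a .pth file.

    Returns the removed paths, relative to runfiles_dir.
    """
    removed = []
    for entry in sorted(os.listdir(runfiles_dir)):
        entry_path = os.path.join(runfiles_dir, entry)
        if not entry.startswith("pypi__") or not os.path.isdir(entry_path):
            continue
        children = sorted(os.listdir(entry_path))
        # Only wheels that ship a .pth file are treated as namespace packages.
        if not any(name.endswith(".pth") for name in children):
            continue
        candidates = [os.path.join(entry_path, "__init__.py")]
        candidates += [os.path.join(entry_path, child, "__init__.py")
                       for child in children]
        for path in candidates:
            if os.path.isfile(path):
                os.remove(path)
                removed.append(os.path.relpath(path, runfiles_dir))
    return removed
```

Deeper __init__.py files, such as zope/interface/__init__.py, belong to the real package and are left alone.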

It's really ugly, but it fixes the tests. However, the workaround is harder to implement when we integrate with the py_image rule from rules_docker.

Proposed solution

I still don't understand why inserting __init__.py is needed, and in which cases, but IMHO such changes interfere with the way Python works and, worse, create a different environment than the one you see. My proposal is to add a new flag to Bazel test and build to skip the __init__.py creation or, maybe better (read: local control), add a flag to the py_binary and py_test rules to skip automatic __init__.py creation. Thoughts?

N.B. While Bazel has been a great addition to our stack and I am very grateful for the effort the community puts in to make it better, I am a bit unhappy that the Python rules are written in Java rather than as Bazel extensions (using Skylark), because this way fixing issues like the above is not as easy as forking the repository, changing the code, and pointing WORKSPACE at the fork :(
