8000 use `sys.platform == "cygwin"` to figure out when we are using cygwin by gcmarx · Pull Request #2027 · gitpython-developers/GitPython · GitHub
[go: up one dir, main page]

Skip to content

Conversation

@gcmarx
Copy link
Contributor
@gcmarx gcmarx commented May 22, 2025

The current logic for detecting cygwin is:

  • if sys.platform == "win32", then it's not cygwin
  • if the passed-in git executable is None, then it's not cygwin git
  • finally, is there a uname executable in the same folder as the git executable, and if so, does the output of that command include "CYGWIN"? if so, then it's cygwin

In the python 3.7 docs for sys.platform, Cygwin systems have sys.platform == "cygwin". Since Python 3.7 is the oldest version still supported by GitPython, it stands to reason that we can rely on that being true for all supported Python versions.

The logic I propose is:

  • if the git executable you passed in is None, it's not cygwin git
  • if sys.platform == 'cygwin', then it's cygwin git

This is simple enough for a single expression, so I replaced the body of the is_cygwin function with that expression and removed is_cygwin_git and friends completely.

I don't know if there's such a thing as sys.platform == 'cygwin' and the git not being cygwin git, but if there is, I don't think the existing code deals with that anyway, so this seems to return the same result and would fail in the same way.

@gcmarx
Copy link
Contributor Author
gcmarx commented May 22, 2025

only this PR or #2026 should be merged

Copy link
Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR simplifies the detection of Cygwin Git by replacing the former multi-step logic with a concise check that relies on sys.platform.

  • Removed deprecated functions and caching logic for Cygwin detection.
  • Updated the Git command interface to directly use sys.platform and the Git executable's existence.

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
git/util.py Removed the py_where and _is_cygwin_git functions along with unused imports.
git/cmd.py Updated the is_cygwin method to use a simplified sys.platform check.

@gcmarx gcmarx changed the title use 'sys.platform == "cygwin"' to figure out when we are using cygwin use sys.platform == "cygwin" to figure out when we are using cygwin May 23, 2025
@Byron Byron requested a review from EliahKagan May 26, 2025 03:12
Copy link
Member
@EliahKagan EliahKagan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is preliminary and I'll try to say a little more in a forthcoming review of #2026, which I hope to do tomorrow.

I think we'll probably want to prefer the approach in #2026 over this, at least in the short term. There is a deliberate and long-standing (though not always perfectly applied) distinction throughout the code of GitPython between whether the platform is Cygwin (i.e. whether the python interpreter running the code of GitPython is a Cygwin program) and whether the git executable that GitPython calls is a Cygwin program.

This appears to be relevant to the treatment of absolute paths in some operations that involve cloning to an absolute path or adding a submodule. Most use of is_cygwin_git is through Git.is_cygwin, and searching for Git.is_cygwin reveals uses where it looks like this distinction can matter.

GitPython uses whatever git executable it is configured to use via environment variables or refresh or, if it is not configured to use a particular git executable, whatever executable it finds in a path search. In either case, this executable need not be built for the same target as GitPython.

I admittedly would not recommend most such mismatches between python and git if they can be avoided. I also think we should be willing to run the risk of breaking things in what I believe to be the unusual case of using Cygwin git from outside Cygwin, if doing so is necessary in order to avoid or fix breakages when Cygwin is not used--especially for bugs that affect users on Unix-like operating systems that aren't related to Windows. This is one of the reasons I anticipate that the approach in #2026 is probably okay.

But distinguishing Cygwin python using 10000 Cygwin git from Cygwin python using Git for Windows git seems like it is at least partially working. It's also fairly easy to produce the situation of Cygwin python using Git for Windows git. This can plausibly occur intentionally or by accident. Most Cygwin environments preserve Windows PATH entries, allowing them to appear after Cygwin-specific bin directories. If GitPython is used in Cygwin python, on a system where Cygwin git is not installed but Git for Windows is installed, then usually the non-Cygwin git will be found in the PATH, and GitPython will use it.

@gcmarx
Copy link
Contributor Author
gcmarx commented May 26, 2025

Thanks for the thorough review!

Neither the value of sys.platform nor “uname in the same directory as git” seem like a bulletproof way to tell whether the git in question is Cygwin git or not. I’m wondering if the git executable itself can be convinced to tell us about itself, but don’t currently have a way to test that. I’ll try to poke around at that later this week.

@EliahKagan
Copy link
Member
EliahKagan commented May 28, 2025

I've looked into this further.

Keeping uname can be temporary, I hope

I’m wondering if the git executable itself can be convinced to tell us about itself, but don’t currently have a way to test that. I’ll try to poke around at that later this week.

Thanks! Yes, using uname for this in the way we have been doing (or in any way at all, really) doesn't facilitate high-quality detection of whether git is a Cygwin program. Nonetheless, I think the changes in #2026 up to cffa264 (i.e. the tip of the branch prior to recent changes) are strictly improvements compared to what is on our main now, and that they can probably be integrated as an initial mitigation of--though still not a complete fix for--#1979.

I agree that, ultimately, we will want to do something else. I do think that will involve inspecting the contents of the git executable; though it might imaginably use some other technique to figure it out; or it might turn out not to be worthwhile for GitPython to try to figure this out, in which case it could still be changed to just check sys.platform == "cygwin", as in this PR.

Why I advocate starting that way

The reason I would favor the (original) approach taken in #2026 as a first step is that it is maximally consistent with the previous behavior. This is valuable because GitPython is widely used, and it's easy to break existing usage by accident, including in Cygwin-related ways (e.g. #1646).

I think that makes it a good initial change, and since it addresses #1979 in the cases where people seem usually to have encountered it--where the environment has nothing to do with Windows or Cygwin--we could even maybe do a patch release shortly afterwards. Even if not, it would be in place so that, if a patch release is done for some other reason, it will have it.

Also, as detailed below (at the very end), I am not sure if always returning True when sys.platform == "cygwin" does the right thing on MSYS2.

Counterpoints/caveats

Although my position is that we should not rush to replace the implementation of is_cygwin_git with a simple check to sys.platform == "cygwin", my analysis would be incomplete if I did not acknowledge three arguments for doing so:

  1. It's very simple.
  2. It's not necessarily less accurate than what we've been doing.
  3. Code from before 2022 that it would break has probably already broken.

The first point is self-evident, while the second and third touch on some things that are potentially of interest to continuing work on is_cygwin_git no matter how we proceed, and are thus detailed below.

2. The uname way isn't very accurate

Neither the value of sys.platform nor “uname in the same directory as git” seem like a bulletproof way to tell whether the git in question is Cygwin git or not.

Yes, the "uname in the same directory as git" way we've been doing is admittedly not good, and very brittle.

In terms of accuracy, it may be worse than the approach here of just checking sys.platform == "cygwin" and assuming git is a Cygwin program if the Python interpreter is a Cygwin program. This may be so even when changed to include such a check (though that change, in 3f5a942, is good and definitely one of the benefits of #2026). Here are a few examples of why this may be so:

  • GitPython could be used in a Python script run by (or otherwise through) git, such as a custom git command, a transport command, or a hook. Then the git executable GitPython finds will usually be the one in git's libexec/git-core directory--more precisely, the one in the directory git --exec-path reports. This is because git currently unconditionally prepends that directory to PATH when performing most subprocess invocations. That directory does not also contain a uname executable.

  • Checking if the git executable is in the same directory as a uname executable that reports the operating system as Cygwin is--when it works--effectively a way to check if git is in a Cygwin /usr/bin directory where the Cygwin package manager places both binaries.

    is_cygwin_git and supporting py_where functions are written as they are for historical reasons. As detailed below, when they were introduced, the check had different semantics, and the standard library had fewer facilities. But if we were to introduce wholly new code today that guesses, based on the location of git, whether git is from Cygwin, then (ignoring the MSYS2 issue) we would probably do something like this instead:

    sys.platform == "cygwin" and git_executable and shutil.which(git_executable) == "/usr/bin/git"

    The problem is that one can have Cygwin git elsewhere--even when using the "primary" git executable rather than the one in libexec/git-core. This happens if one builds it from source oneself to test a patch or use a different version, or if one runs it through a symlink.

  • Eliminating the existing implementation, as done in this PR, would solve other problems unrelated to #1979.

    When is_cygwin_git is used, the value of its git_executable argument is most often "git". To avoid an excessive slowdown, is_cygwin_git caches results per-argument in _is_cygwin_cache. This cache entry is never invalidated. But which executable "git" runs--and even where that executable is found--can change. This can happen by changing the contents of the filesystem or changing the value of the PATH environment variable. Such a change has other notable effects, including that the cached git version number may no longer be correct. We intend that programs that use GitPython can adjust to such a change by calling its top-level refresh function. For the most part, this works properly (including invalidating the version). But it does not modify _is_cygwin_cache.

    This can, of course, be fixed. But I acknowledge that if we no longer try to figure out if the git executable is a Cygwin program, then that would eliminate this bug automatically. (To be clear, the onus is not on #2026 to fix this bug, since nothing in that PR makes it any worse.)

So it may be that such cases--of which I suspect there are various others--make it so that the simpler change here of just using sys.platform == "cygwin" is more accurate not only than the current implementation but than anything resembling it. However, even if improving overall accuracy, we would run the risk of introducing new inaccuracies in particular unanticipated use cases.

3. The semantics of is_cygwin_git changed a few years ago

In #2027 (review) I noted that the distinction between the Python interpreter being a Cygwin program and the git executable being a Cygwin program is long-standing, and that it can be important. That's true. However, the current behavior of is_cygwin_git only really dates back to 2022. Anything in software using GitPython that a major change to is_cygwin_git would break now is probably not older than that, or it would've been broken then.

There have been two big pushes to improve Cygwin compatibility in GitPython:

  • The first was Support Cygwin's Git on Windows #533 in 2016, which introduced is_cygwin_git in e6e23ed. At that time, it began:

    GitPython/git/util.py

    Lines 287 to 288 in e6e23ed

    if not is_win:
    return False

    Where:

    is_win = (os.name == 'nt')

    I believe that, even at that time and even in Python 2, os.name on Cygwin evaluated to "posix" as it does today. Thus, the scenario where is_cygwin_git would return True was when the Python interpreter was a native Windows executable and git was a Cygwin executable! This makes sense, since the title of that PR was "Support Cygwin's Git on Windows".

    At that time, AppVeyor was used for CI. The .appveyor.yml file from that time confirms that this is what was going on--the original purpose of is_cygwin_git was to allow native Windows Python programs to use GitPython with either Git for Windows or Cygwin git. In relevant part:

    GitPython/.appveyor.yml

    Lines 2 to 38 in cc77e6b

    environment:
    GIT_DAEMON_PATH: "C:\\Program Files\\Git\\mingw64\\libexec\\git-core"
    CYGWIN_GIT_PATH: "C:\\cygwin\\bin;%GIT_DAEMON_PATH%"
    CYGWIN64_GIT_PATH: "C:\\cygwin64\\bin;%GIT_DAEMON_PATH%"
    matrix:
    ## MINGW
    #
    - PYTHON: "C:\\Python27"
    PYTHON_VERSION: "2.7"
    GIT_PATH: "%GIT_DAEMON_PATH%"
    - PYTHON: "C:\\Python34-x64"
    PYTHON_VERSION: "3.4"
    GIT_PATH: "%GIT_DAEMON_PATH%"
    - PYTHON: "C:\\Python35-x64"
    PYTHON_VERSION: "3.5"
    GIT_PATH: "%GIT_DAEMON_PATH%"
    - PYTHON: "C:\\Miniconda35-x64"
    PYTHON_VERSION: "3.5"
    IS_CONDA: "yes"
    GIT_PATH: "%GIT_DAEMON_PATH%"
    ## Cygwin
    #
    - PYTHON: "C:\\Miniconda-x64"
    PYTHON_VERSION: "2.7"
    IS_CONDA: "yes"
    IS_CYGWIN: "yes"
    GIT_PATH: "%CYGWIN_GIT_PATH%"
    - PYTHON: "C:\\Python35-x64"
    PYTHON_VERSION: "3.5"
    GIT_PATH: "%CYGWIN64_GIT_PATH%"
    IS_CYGWIN: "yes"
    install:
    - set PATH=%PYTHON%;%PYTHON%\Scripts;%GIT_PATH%;%PATH%

  • The second was Re-enable Cygwin CI and get most tests passing #1455 in 2022, which included this very significant change to is_cygwin_git in 96fae83:

    def is_cygwin_git(git_executable: Union[None, PathLike]) -> bool:
    -    if not is_win:
    +    if is_win:
    +        # is_win seems to be true only for Windows-native pythons
    +        # cygwin has os.name = posix, I think
           return False

    Much as #533 was guided by testing on AppVeyor set up so that Cygwin git was being called only from native Windows builds of Python, #1455 was guided by testing on GitHub Actions set up so that Cygwin git was being called only from Cygwin builds of Python.

    I cannot tell from the history whether this distinction was recognized at the time! It may be that this change was made based on the belief that is_cygwin_git had already been intended to return False except in Cygwin builds of Python, I am not sure. But it may be that it was intentional. It would make sense if, over time, it has become rarer and rarer to use anything but Git for Windows for git on Windows, except when operating entirely within some other specific environment.

An important question is whether, since #1455, there is anything in GitPython that actually works because of the ability to identify a non-Cygwin git executable from Cygwin. Operations with such an executable seem either to work on main, in the original #2026 (cffa264), and here... or not to work in any of the three. An example of something that should in principle work if GitPython on Cygwin supports non-Cygwin git, but that fails and fails the same way in all three, is:

(.venv) ✔ ~/repos-cygwin/GitPython [main|⚑ 1]
09:27 $ GIT_PYTHON_GIT_EXECUTABLE=/cygdrive/c/Users/ek/scoop/shims/git python3.9 -c 'import git; r1 = git.Repo.c
8000
lone_from("https://github.com/EliahKagan/trivial.git", "trivial"); r2 = r1.clone("trivial-clone", bare=True); print(r2.git_dir)'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/ek/repos-cygwin/GitPython/git/repo/base.py", line 1482, in clone
    return self._clone(
  File "/home/ek/repos-cygwin/GitPython/git/repo/base.py", line 1414, in _clone
    finalize_process(proc, stderr=stderr)
  File "/home/ek/repos-cygwin/GitPython/git/util.py", line 504, in finalize_process
    proc.wait(**kwargs)
  File "/home/ek/repos-cygwin/GitPython/git/cmd.py", line 834, in wait
    raise GitCommandError(remove_password_if_present(self.args), status, errstr)
git.exc.GitCommandError: Cmd('/cygdrive/c/Users/ek/scoop/shims/git') failed due to: exit code(128)
  cmdline: /cygdrive/c/Users/ek/scoop/shims/git clone -v --bare -- /home/ek/repos-cygwin/GitPython/trivial/.git trivial-clone
  stderr: 'fatal: repository '/home/ek/repos-cygwin/GitPython/trivial/.git' does not exist
'

What happens there is that a decygpath operation is either not being done or not being done correctly in at least one place where it would need to happen to support Git for Windows git from Cygwin. I am not sure if this has ever worked.

So that's admittedly a reason to prefer this PR: it's not clear under what circumstances, if any, is_cygwin_git has done any better than this, since #1455.

But I think this examination of the history also shows that the original #2026 (up to cffa264) is safe and strictly improves correctness. The sys.platform == "win32" condition on main descends from the is_win check. (sys.platform == "win32" always has the same value as os.name == "nt", at least in Python 3 and I think since well before that.) When I introduced the TODO comment in 42e10c0 (#1859), I was worried that doing the check even when the platform doesn't seem to be Cygwin might've been intentional, and that checking if git seems to be a Cygwin executable only when sys.platform == "cygwin" might somehow break something. I think looking at the history confirms that this is not the case.

But another reason to start slow is MSYS2

Somewhat confusingly, Cygwin is not the only platform in which sys.platform is "cygwin".

We don't test MSYS2 on CI, though maybe we should. I don't know how often, if at all, people use GitPython in a Python interpreter that targets MSYS2. (This should not be confused with other environments that MSYS2 supplies, such as MINGW64, which is actually a native Windows target rather than a Cygwin-like target. I am talking about the MSYS environment of MSYS2, which is a non-Cygwin but Cygwin-like target.)

In MSYS2, sys.platform is "cygwin":

ek@Glub MSYS ~
$ python -c 'import sys; print(sys.platform)'
cygwin

But uname does not report CYGWIN:

ek@Glub MSYS ~
$ uname
MSYS_NT-10.0-19045

The facilities within GitPython that do Cygwin-related things, such as cygpath and decygpath, and their helpers, in git/util.py--which are used in more places when git is detected to be a Cygwin build--don't look like they will always work on MSYS2. While a MSYS2 does have /proc/cygdrive, it does not have /cygdrive; a path like /cygdrive/c in Cygwin is just /c is MSYS2. There may be other relevant differences.

As things stand now, where MSYS2 git run from MSYS2 is not considered to be a Cygwin git, MSYS2 does seem to work pretty well, though maybe not perfectly. I say that based on this test run; see #1988 (comment) for context. It may be good to look into that further before doing anything that would cause MSYS2 to be treated more like Cygwin.

This also applies to the strings approach pushed to #2026 after cffa264--strings will turn up cygwin in MSYS2 git as well:

ek@Glub MSYS ~
$ strings /usr/bin/git | grep -F cygwin
cygwin_conv_path
cygwin_internal
__imp_cygwin_internal
__imp_cygwin_conv_path

So this is part of why I'd be interested to integrate cffa264 first.

@EliahKagan
Copy link
Member

only this PR or #2026 should be merged

Since this was intended to be merged only if #2026 was not, and #2026 is merged, I've closed this.

It's plausible--though I think not most likely--that we'll eventually end up going this route and no longer attempting to determine whether the git executable is a Cygwin program. In that case, this could be reopened, or a new PR opened to propose that. (However, I think, based on #2026 (comment), that the next step may be to move to using ldd on the git binary after successfully finding it in a PATH search.)

@EliahKagan EliahKagan closed this May 30, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

2 participants

0