8000 fix codecache write_atomic path issue on Windows. by xuhancn · Pull Request #138331 · pytorch/pytorch · GitHub
[go: up one dir, main page]

Skip to content

fix codecache write_atomic path issue on Windows. #138331

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

xuhancn
Copy link
Collaborator
@xuhancn xuhancn commented Oct 18, 2024

Fixes #138211

Path.rename function has Windows OS specific behavior, that will raise FileExistsError when the target file existing.
This behavior is not happened on Linux, so I write a small repoduce code to figure out what happened.

After stepping trace the repo code:

import os
import sys
from pathlib import Path

_IS_WINDOWS = sys.platform == "win32"

def test_case():
    cwd = os.getcwd()
    path1 = os.path.join(cwd, "haha1.txt")
    path2 = Path(os.path.join(cwd, "haha2.txt"))

    try:
        path2.rename(path1)
    except FileExistsError as e_file_exist:
        if _IS_WINDOWS:
            # on Windows file exist is expected: https://docs.python.org/3/library/pathlib.html#pathlib.Path.rename
            shutil.copy2(path2, path1)
            os.remove(path2)
        else:
            raise e_file_exist
    except BaseException as e:
        raise e
    
    print("run here.")

if __name__ == "__main__":
    test_case()

We found the code path2.rename(path1) can breakdown into:

  1. copy file2's content to file1.
  2. delete file2.

So, we can implemented equal code on Windows path:

shutil.copy2(src=tmp_path, dst=path)
os.remove(tmp_path)

So, we can get current PR.

TODO: need cherry-pick to release/2.5 branch, CC: @atalman .

cc @peterjc123 @mszhanyi @skyline75489 @nbcsm @iremyux @Blackhex @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @voznesenskym @penguinwu @EikanWang @Guobing-Chen @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang @aakhundov

Copy link
pytorch-bot bot commented Oct 18, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/138331

Note: Links to docs will display an error until the docs builds have been completed.

❌ 5 New Failures, 1 Unrelated Failure

As of commit 92ee32e with merge base aa3ae50 (image):

NEW FAILURES - The following jobs have failed:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@xuhancn xuhancn added module: windows Windows support for PyTorch ciflow/trunk Trigger trunk jobs on your pull request intel This tag is for PR from Intel labels Oct 18, 2024
@xuhancn xuhancn added this to the 2.5.1 milestone Oct 18, 2024
@xuhancn xuhancn requested a review from malfet October 18, 2024 14:56
@xuhancn xuhancn added the topic: not user facing topic category label Oct 18, 2024
@xuhancn xuhancn force-pushed the xu_fix_codecache_path_issue_on_windows branch from 94b3999 to 6113e3e Compare October 18, 2024 15:08
@xuhancn xuhancn removed the request for review from malfet October 18, 2024 15:31
@xuhancn xuhancn marked this pull request as draft October 18, 2024 15:31
@xuhancn xuhancn requested a review from malfet October 18, 2024 15:54
@xuhancn xuhancn marked this pull request as ready for review October 18, 2024 15:54
@xuhancn xuhancn marked this pull request as draft October 18, 2024 17:43
@xuhancn xuhancn removed the request for review from malfet October 18, 2024 17:43
@xuhancn xuhancn force-pushed the xu_fix_codecache_path_issue_on_windows branch 2 times, most recently from 6ff4335 to ac8408a Compare October 18, 2024 18:10
@xuhancn xuhancn marked this pull request as ready for review October 18, 2024 18:10
@xuhancn xuhancn requested a review from malfet October 18, 2024 18:10
@xuhancn xuhancn force-pushed the xu_fix_codecache_path_issue_on_windows branch from ac8408a to 31470a5 Compare October 18, 2024 18:13
@xuhancn
Copy link
Collaborator Author
xuhancn commented Oct 18, 2024

@pytorchbot rebase

@pytorchmergebot
Copy link
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

1. handle FileExistsError on Windows.
2. Breakdown this code into equal Windows code.
@pytorchmergebot
Copy link
Collaborator

Successfully rebased xu_fix_codecache_path_issue_on_windows onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout xu_fix_codecache_path_issue_on_windows && git pull --rebase)

@pytorchmergebot pytorchmergebot force-pushed the xu_fix_codecache_path_issue_on_windows branch from 31470a5 to 76ff6f1 Compare October 18, 2024 19:07
Copy link
Contributor
@malfet malfet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, few comments though (also, I guess we test it in prod...)

@xuhancn
Copy link
Collaborator Author
xuhancn commented Oct 18, 2024

I tested the new code both on Windows/Linux. And confirmed they are works. @malfet

import os
import sys
from pathlib import Path
import shutil 

_IS_WINDOWS = sys.platform == "win32"

def test_case():
    cwd = os.getcwd()
    path1 = os.path.join(cwd, "haha1.txt")
    path2 = Path(os.path.join(cwd, "haha2.txt"))
    
    try:
        path2.rename(target=path1)
    except FileExistsError as e_file_exist:
        if not _IS_WINDOWS:
            raise
        # On Windows file exist is expected: https://docs.python.org/3/library/pathlib.html#pathlib.Path.rename
        # Below two lines code is equal to `tmp_path.rename(path)` on non-Windows OS.
        # 1. Copy tmp_file to Target(Dst) file.
        shutil.copy2(src=path2, dst=path1)
        # 2. Delete tmp_file.
        os.remove(path2)

    print("run here.")

if __name__ == "__main__":
    test_case()

@janeyx99 janeyx99 added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Oct 18, 2024
@xuhancn
Copy link
Collaborator Author
xuhancn commented Oct 19, 2024

@pytorchmergebot merge -f "skip unrelated error."

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@xuhancn xuhancn deleted the xu_fix_codecache_path_issue_on_windows branch October 19, 2024 01:27
@malfet
Copy link
Contributor
malfet commented Oct 23, 2024

2.5.1 is an emergency release to address large regression, so fixes to a new features do not meet the criteria to be included into it.

@kit1980 kit1980 modified the milestones: 2.5.1, 2.6.0 Oct 23, 2024
xuhancn added a commit to xuhancn/pytorch that referenced this pull request Oct 26, 2024
Fixes pytorch#138211

`Path.rename` function has Windows OS specific behavior, that will raise `FileExistsError` when the target file existing.
This behavior is not happened on Linux, so I write a small repoduce code to figure out what happened.

After stepping trace the repo code:
```python
import os
import sys
from pathlib import Path

_IS_WINDOWS = sys.platform == "win32"

def test_case():
    cwd = os.getcwd()
    path1 = os.path.join(cwd, "haha1.txt")
    path2 = Path(os.path.join(cwd, "haha2.txt"))

    try:
        path2.rename(path1)
    except FileExistsError as e_file_exist:
        if _IS_WINDOWS:
            # on Windows file exist is expected: https://docs.python.org/3/library/pathlib.html#pathlib.Path.rename
            shutil.copy2(path2, path1)
            os.remove(path2)
        else:
            raise e_file_exist
    except BaseException as e:
        raise e

    print("run here.")

if __name__ == "__main__":
    test_case()
```
We found the code `path2.rename(path1)` can breakdown into:
1. copy file2's content to file1.
2. delete file2.

So, we can implemented equal code on Windows path:
```python
shutil.copy2(src=tmp_path, dst=path)
os.remove(tmp_path)
```

So, we can get current PR.

TODO: need cherry-pick to release/2.5 branch, CC: @atalman .

Pull Request resolved: pytorch#138331
Approved by: https://github.com/malfet

Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ciflow/inductor ciflow/trunk Trigger trunk jobs on your pull request intel This tag is for PR from Intel Merged module: inductor module: windows Windows support for PyTorch open source topic: not user facing topic category triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module
Projects
None yet
Development

Successfully merging this pull request may close these issues.

tmp_path.rename in inductor code cache on Windows
6 participants
0