8000 bpo-45234: Fix FileNotFound exception raised instead of IsADirectoryError in shutil.copyfile() by akulakov · Pull Request #28421 · python/cpython · GitHub
[go: up one dir, main page]

Skip to content

bpo-45234: Fix FileNotFound exception raised instead of IsADirectoryError in shutil.copyfile() #28421

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Sep 21, 2021

Conversation

akulakov
Copy link
Contributor
@akulakov akulakov commented Sep 17, 2021

Note that os.path.isdir() correctly handles non-existent 'foo/' paths, returning False, unlike open() which on some platforms raises IsADirectoryError for such path.

Also note that I chose to raise IsADirectory error in cases when src is a dir, and dst is a 'does_not_exist/' path, when in theory either error could be raised, but IsADirectory is more backwards-compatible. (and a bit more logical because it's the first error encountered).

https://bugs.python.org/issue45234

Lib/shutil.py Outdated
@@ -279,7 +279,7 @@ def copyfile(src, dst, *, follow_symlinks=True):

# Issue 43219, raise a less confusing exception
except IsADirectoryError as e:
if os.path.exists(dst):
if os.path.isdir(src) or os.path.isdir(dst):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to move the open for the destination file to a separate with statement, and only handle the error for that case. The problem with a trailing slash is only with dst. It's opened in write mode, which is disallowed for a directory, so the EISDIR error occurs before the system even checks whether the file or directory exists. If src doesn't exist, then a trailing slash doesn't matter. For example:

>>> open('spam/', 'rb')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
FileNotFoundError: [Errno 2] No such file or directory: 'spam/'

>>> open('spam/', 'wb')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IsADirectoryError: [Errno 21] Is a directory: 'spam/'

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't completely sure if on some OSes open() may return IsADirectory error in 'rb' mode, but I think it's not very likely and we already confirmed that's not the case for Linux, MacOS and Windows, so I agree with adding a separate with statement.

With 'isdir()' check, the logic is straightforward: if the error is 'IsADir' and the path is in fact a dir, it's ok to raise the error. But this relies on the subtle difference between open() and isdir() handling of non-existent "directory".

On the other hand it may be clearer to use exists() logic (pseudocode):

...
except IsADirectoryError:
  if not pathlib.exists(path):  # if it doesn't exist, it's more accurate to raise the following
     raise FileNotFoundError()

So I'm leaning to use the latter, does that sound good?

@akulakov
Copy link
Contributor Author

@eryksun as the test failures show, on windows a permission error may also be raised, where it may be more accurate and possible to report an IsADirectory error. Do you think this can be fixed in this PR as well? Or open a separate issue, or leave it as is?

It happens if you do copyfile(dir, ...), which opens it in 'rb' mode; I'm not sure if it happens in 'wb' mode on windows, do you happen to remember?

@eryksun
Copy link
Contributor
eryksun commented Sep 17, 2021

on windows a permission error may also be raised

Opening an existing directory in Windows requires calling CreateFileW() with the flag FILE_FLAG_BACKUP_SEMANTICS. Otherwise Windows calls the NTAPI system function NtCreateFile() with the option FILE_NON_DIRECTORY_FILE, which will fail with the status code STATUS_FILE_IS_A_DIRECTORY. If the name has a trailing backslash (not a trailing forward slash), the API maps the latter status code to ERROR_PATH_NOT_FOUND (3), which is C ENOENT (2). Otherwise it maps it to ERROR_ACCESS_DENIED (5), which is C EACCES (13).

At the Python level, opening an existing directory in Windows with open() or os.open(), regardless of the requested disposition and access, will raise either PermissionError or, if there's a trailing backslash, FileNotFoundError. For example:

>>> os.path.isdir('spam')
True

>>> os.open('spam', 0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
PermissionError: [Errno 13] Permission denied: 'spam'

>>> os.open('spam', os.O_CREAT | os.O_RDWR)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
PermissionError: [Errno 13] Permission denied: 'spam'

>>> os.open('spam/', 0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
PermissionError: [Errno 13] Permission denied: 'spam/'

>>> os.open('spam\\', 0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
FileNotFoundError: [Errno 2] No such file or directory: 'spam\\'

There's an undocumented open() flag to make it call CreateFileW() with FILE_FLAG_BACKUP_SEMANTICS, but it's not generally useful. For example:

>>> os.open('spam/', 0x2000)
3
>>> os.close(3)

In the case where the last component doesn't exist, and the name ends with a trailing slash or backslash, the error will depend on the open/create disposition. If it's an open disposition, the error is ERROR_FILE_NOT_FOUND (2), which is C ENOENT (2). If it's a create disposition, the error will be ERROR_INVALID_NAME (123), which is C EINVAL (22). The latter occurs because a name with a trailing slash is a directory, but the requested file type to create is a regular file, not a directory. For example:

>>> os.path.exists('spam')
False

>>> os.open('spam/', 0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
FileNotFoundError: [Errno 2] No such file or directory: 'spam/'

>>> os.open('spam/', os.O_CREAT)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OSError: [Errno 22] Invalid argument: 'spam/'

There are a couple of ways to create a directory with CreateFileW(), but they're outside the norm and not worth discussing here.

@akulakov
Copy link
Contributor Author

@eryksun Thanks, that's a very detailed and useful explanation :)
I guess I was trying to say that if I'm doing shutil.copyfile(a,b) and a is not accessible, I would expect a Permission error; but if a is accessible dir, I would expect an IsADirectory error, like on the other platforms, and I would find Permission error to be misleading.

For open() it's of course fine, but with a more high level shutil module it's probably more appropriate to report IsADirectory error in such cases.

Unfortunately I don't see a way to do this easily that wouldn't lead to a code that's too complicated for what we get out of it.

The outer with statement would have to be wrapped in another level of try/except, or, if we went to a single with statement as before, exception would have to be examined to tell if it refers to the src or dest argument.

I searched the BPO site and didn't find any complaints about this so I think it's not worth fixing (but maybe documenting?); -- unless you see a way to do it in a simpler way than what I was suggesting above.

I will fix the unit test failures a bit later tonight..

@eryksun
Copy link
Contributor
eryksun commented Sep 18, 2021

I guess I was trying to say that if I'm doing shutil.copyfile(a,b) and a is not accessible, I would expect a Permission error; but if a is accessible dir, I would expect an IsADirectory error, like on the other platforms, and I would find Permission error to be misleading.

I've seen novices waste time trying to figure out why they're getting a PermissionError. They don't think to check for a directory, which they would know immediately if IsADirectoryError were raised. It's also strange that adding a trailing backslash (but not forward slash) changes the error to ERROR_PATH_NOT_FOUND (3), which maps to FileNotFoundError. This error implies there's a missing component in the base path, or in some cases that the path is too long. It's all very misleading given the problem is that the final path component is a directory.

If Python called CreateFileW() directly instead of via C open(), it could rely on the NT status code from RtlGetLastNtStatus() -- such as STATUS_FILE_IS_A_DIRECTORY. In particular, PyErr_SetFromWindowsErr() could set the last NT status as an ntstatus attribute on the OSError exception. In some cases, where it's known to be valid, error handling code could use the NT status code to raise a different exception, without introducing a race condition, as opposed to checking os.path.isdir() after the fact.

@akulakov
Copy link
Contributor Author

@eryksun makes sense.
I've also fixed the test failures, please take a look..

self.assertRaises(err, shutil.copyfile, src_file, src_dir)
self.assertRaises(err, shutil.copyfile, dir2, src_dir)
self.assertRaises(err, shutil.copy, dir2, src_dir)
self.assertRaises(err, shutil.copy2, dir2, src_dir)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this just test copyfile()?

Note that for Windows there's been some discussion about changing copy2(), and maybe copy(), to call the system CopyFileExW() function, which also copies alternate data streams, extended attributes, and security resource attributes. In that case the high-level copy functions will no longer use copyfile(), copymode(), and copystat() in Windows.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's worth testing copy and copy2 for consistency of handling a dir arg, because they already do some dir-related logic (for dst arg), but this may be done here or in separate tests. Looking at other tests, to follow the same naming pattern, we can do test_copy_dirs() and test_copy2_dirs().

But it makes sense to do this testing in the same function for all three because we're ensuring consistency and also it's convenient because the set up of dirs and expected errors is the same.

I can remove these lines though if you think it's not worth testing them here at this point in view of the other work you mentioned.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently what copy() and copy2() do that's related to src being a directory is implemented in copyfile(), so only the latter really needs to be tested. I see the point about checking consistency, but would separate tests maybe be more obvious for someone who's modifying copy() or copy2()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, I will extract them into separate tests a bit later today.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the tests and they're passing, please take a look..

@ambv ambv changed the title bpo-45234: fix wrong FileNotFound exception raised in shutil.copyfile() bpo-45234: Fix FileNotFound exception raised instead of IsADirectoryError in shutil.copyfile() Sep 21, 2021
@ambv ambv merged commit b7eac52 into python:main Sep 21, 2021
@ambv ambv added needs backport to 3.9 only security fixes needs backport to 3.10 only security fixes labels Sep 21, 2021
@miss-islington
Copy link
Contributor

Thanks @akulakov for the PR, and @ambv for merging it 🌮🎉.. I'm working now to backport this PR to: 3.10.
🐍🍒⛏🤖

@miss-islington
Copy link
Contributor

Thanks @akulakov for the PR, and @ambv for merging it 🌮🎉.. I'm working now to backport this PR to: 3.9.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Sep 21, 2021
…rror in shutil.copyfile() (pythonGH-28421)

This was a regression from fixing BPO-43219.
(cherry picked from commit b7eac52)

Co-authored-by: andrei kulakov <andrei.avk@gmail.com>
@bedevere-bot
Copy link

GH-28507 is a backport of this pull request to the 3.9 branch.

@bedevere-bot bedevere-bot removed the needs backport to 3.9 only security fixes label Sep 21, 2021
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Sep 21, 2021
…rror in shutil.copyfile() (pythonGH-28421)

This was a regression from fixing BPO-43219.
(cherry picked from commit b7eac52)

Co-authored-by: andrei kulakov <andrei.avk@gmail.com>
@bedevere-bot
Copy link

GH-28508 is a backport of this pull request to the 3.10 branch.

@bedevere-bot bedevere-bot removed the needs backport to 3.10 only security fixes label Sep 21, 2021
ambv pushed a commit that referenced this pull request Sep 21, 2021
…rror in shutil.copyfile() (GH-28421) (GH-28508)

This was a regression from fixing BPO-43219.
(cherry picked from commit b7eac52)

Co-authored-by: andrei kulakov <andrei.avk@gmail.com>
ambv pushed a commit that referenced this pull request Sep 21, 2021
…rror in shutil.copyfile() (GH-28421) (GH-28507)

This was a regression from fixing BPO-43219.
(cherry picked from commit b7eac52)

Co-authored-by: andrei kulakov <andrei.avk@gmail.com>
pablogsal pushed a commit that referenced this pull request Sep 22, 2021
…rror in shutil.copyfile() (GH-28421) (GH-28508)

This was a regression from fixing BPO-43219.
(cherry picked from commit b7eac52)

Co-authored-by: andrei kulakov <andrei.avk@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants
0