gh-90385: Add `pathlib.Path.walk()` method by zmievsa · Pull Request #92517 · python/cpython · GitHub

gh-90385: Add pathlib.Path.walk() method #92517

Merged
Changes from 1 commit
Commits
39 commits
ac622b7
Add Path.walk and Path.walk_bottom_up methods
zmievsa May 8, 2022
14f031a
Fix errors in Path.walk docstrings and add caching of entries
zmievsa May 9, 2022
b203517
Merge branch 'main' into bpo-46227/add-pathlib.Path.walk-method
Ovsyanka83 May 9, 2022
3ad60a9
Refactor symlink handling
zmievsa May 9, 2022
889d7fe
Merge branch 'bpo-46227/add-pathlib.Path.walk-method' of github.com:O…
zmievsa May 9, 2022
2f98823
Add Path.walk docs and unite Path.walk interfaces
zmievsa May 10, 2022
513030a
Remove Path.walk_bottom_up definition
zmievsa May 10, 2022
5fdd72e
📜🤖 Added by blurb_it.
blurb-it[bot] May 10, 2022
452f24e
Add Path.walk tests
zmievsa May 10, 2022
3702a12
Make Path.walk variable naming consistent
zmievsa May 10, 2022
fabc925
Remove redundant FIXME
zmievsa May 10, 2022
b387b54
Minor Path.walk docs and tests fixes
zmievsa May 10, 2022
097fbbf
Merge branch 'main' into bpo-46227/add-pathlib.Path.walk-method
merwok Jun 27, 2022
76fadfc
Update Doc/library/pathlib.rst
Ovsyanka83 Jun 30, 2022
0c19871
Update Doc/library/pathlib.rst
Ovsyanka83 Jun 30, 2022
50b4a2b
Update Doc/library/pathlib.rst
Ovsyanka83 Jun 30, 2022
cade3e9
Update Doc/library/pathlib.rst
Ovsyanka83 Jun 30, 2022
b32627c
Update Doc/library/pathlib.rst
Ovsyanka83 Jun 30, 2022
d1a0833
Update Doc/library/pathlib.rst
Ovsyanka83 Jun 30, 2022
e367f1f
Update Doc/library/pathlib.rst
Ovsyanka83 Jun 30, 2022
bf8b0eb
Fix 'no blank lines' error
zmievsa Jun 30, 2022
d8667c7
Apply suggestions from code review
Ovsyanka83 Jul 3, 2022
4509797
More code review fixes for Path.walk
zmievsa Jul 3, 2022
20a73ed
Merge branch 'main' into bpo-46227/add-pathlib.Path.walk-method
Ovsyanka83 Jul 3, 2022
e61d57b
Merge branch 'main' into bpo-46227/add-pathlib.Path.walk-method
brettcannon Jul 8, 2022
15d96b9
Apply suggestions from code review
Ovsyanka83 Jul 9, 2022
92e1a7a
Apply suggestions from code review
Ovsyanka83 Jul 9, 2022
c509da3
Merge branch 'main' into bpo-46227/add-pathlib.Path.walk-method
Ovsyanka83 Jul 9, 2022
cfa730d
Code review fixes
zmievsa Jul 10, 2022
7aec96d
Clarify pathlib.Path.walk() error handling
zmievsa Jul 10, 2022
38fe1e5
Apply suggestions from code review
Ovsyanka83 Jul 10, 2022
eef3ba3
Code review fixes
zmievsa Jul 10, 2022
4dfdcd7
Merge branch 'bpo-46227/add-pathlib.Path.walk-method' of github.com:O…
zmievsa Jul 10, 2022
8fe3b62
Apply suggestions from code review
Ovsyanka83 Jul 12, 2022
e8ea6ba
Code review fixes
zmievsa Jul 12, 2022
79cf8fd
Remove backticks around True and False
zmievsa Jul 13, 2022
bed850e
Apply suggestions from code review
Ovsyanka83 Jul 17, 2022
203ec3d
Apply suggestions from code review
zmievsa Jul 17, 2022
eef6054
Apply suggestions from code review
brettcannon Jul 22, 2022
Fix errors in Path.walk docstrings and add caching of entries
zmievsa committed May 9, 2022
commit 14f031af907602d56e396cd4f5ffccb3df39245f
79 changes: 48 additions & 31 deletions Lib/pathlib.py
@@ -1385,64 +1385,79 @@ def expanduser(self):

         return self

-    def walk(self, on_error=None, follow_links=False):
+    def walk(self, on_error=None, follow_symlinks=False):
         """Generate a top-down directory tree from this directory

         For each directory in the directory tree rooted at self (including
         self but excluding '.' and '..'), yields a 3-tuple

-        root_directory, child_directory_names, child_file_names
+        dirpath, dirnames, filenames

-        The caller can modify the child_directory_names list in-place
-        (e.g., via del or slice assignment), and walk will only recurse into
-        the subdirectories whose names remain in child_directory_names; this
+        The caller can modify the dirnames list in-place
+        (e.g., via del or slice assignment), and walk will only recurse
+        into the subdirectories whose names remain in dirnames; this
         can be used to prune the search, or to impose a specific order of
         visiting.

         By default errors from Path._scandir() call are ignored. If
-        optional arg 'on_error' is specified, it should be a function; it
+        optional arg 'on_error' is specified, it should be a callable; it
         will be called with one argument, an OSError instance. It can
         report the error to continue with the walk, or raise the exception
         to abort the walk. Note that the filename is available as the
         filename attribute of the exception object.

         By default, Path.walk does not follow symbolic links to subdirectories
         on systems that support them. In order to get this functionality, set
-        the optional argument 'follow_links' to true.
+        the optional argument 'follow_symlinks' to true.

         Caution: if self is a relative Path, don't change the
         current working directory between resumptions of walk. walk never
         changes the current directory, and assumes that the client doesn't
         either.

+        Caution: walk assumes the directories have not been modified between
+        its resumptions. I.e. If a directory from dirnames has been replaced
+        with a symlink and follow_symlinks=False, walk will still try to
+        descend into it. To prevent such behavior, remove directories from
+        dirnames if they have been modified and you do not want walk to
+        descend into them anymore.
+
         Example:

         from pathlib import Path
-        for root, dirs, files in Path('python/Lib/email'):
-            print(root, "consumes", end="")
-            print(sum((root / file).stat().st_size for file in files), end="")
-            print("bytes in", len(files), "non-directory files")
-            if 'CVS' in dirs:
-                dirs.remove('CVS') # don't visit CVS directories
-        """
-        sys.audit("pathlib.Path.walk", self, on_error, follow_links)
-        return self._walk(True, on_error, follow_links)
-
-    def walk_bottom_up(self, on_error=None, follow_links=False):
+        for root, dirs, files in Path('Lib/concurrent').walk(on_error=print):
+            print(
+                root,
+                "consumes",
+                sum((root / file).stat().st_size for file in files),
+                "bytes in",
+                len(files),
+                "non-directory files"
+            )
+            # don't visit __pycache__ directories
+            if '__pycache__' in dirs:
+                dirs.remove('__pycache__')
+        """
+        sys.audit("pathlib.Path.walk", self, on_error, follow_symlinks)
+        return self._walk(True, on_error, follow_symlinks)
+
+    def walk_bottom_up(self, on_error=None, follow_symlinks=False):
         """Generate a bottom up directory tree from this directory

         The return type and arguments are identical to Path.walk.
-        However, the caller cannot modify the child_directory_names to prune
-        the search or to impose a specific order of visiting because the list
-        of directories to visit is calculated before we yield it.
+
+        Unline walk, the caller cannot modify the dirnames to prune the
+        search or to impose a specific order of visiting because the
+        list of directories to visit is calculated before yielding it.
         """
-        sys.audit("pathlib.Path.walk_bottom_up", self, on_error, follow_links)
-        return self._walk(False, on_error, follow_links)
+        sys.audit("pathlib.Path.walk_bottom_up", self, on_error, follow_symlinks)
+        return self._walk(False, on_error, follow_symlinks)

     def _walk(self, topdown, on_error, follow_links):
         dirs = []
         nondirs = []
         walk_dirs = []
+        walk_dirs_map = {}

         # We may not have read permission for self, in which case we can't
         # get a list of the files the directory contains. os.walk
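
The pruning and error-reporting contract described in the `walk` docstring mirrors that of `os.walk`. The sketch below is only an illustration under that assumption: it uses `os.walk` (whose parameter is spelled `onerror`) as a stand-in, because `Path.walk` is the method this PR adds and may not exist in a given interpreter. The helper name `total_sizes` and the temporary directory layout are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def total_sizes(top, skip="__pycache__"):
    """Map each visited directory to the total size of its files, pruning `skip`."""
    errors = []  # OSError instances reported by the walk end up here
    sizes = {}
    # os.walk spells the callback `onerror`; the PR's method spells it `on_error`.
    for dirpath, dirnames, filenames in os.walk(top, onerror=errors.append):
        root = Path(dirpath)
        sizes[root] = sum((root / name).stat().st_size for name in filenames)
        # Only in-place mutation prunes the walk; rebinding `dirnames` would not.
        if skip in dirnames:
            dirnames.remove(skip)
    return sizes, errors


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "pkg" / "__pycache__").mkdir(parents=True)
    (base / "pkg" / "mod.py").write_text("x = 1\n")
    (base / "pkg" / "__pycache__" / "mod.pyc").write_text("cached")
    sizes, errors = total_sizes(base)
    visited = sorted(p.relative_to(base).as_posix() for p in sizes)

print(visited)  # the pruned __pycache__ directory is never visited
```

Collecting errors in a list is one possible choice for the callback; raising inside it instead would abort the walk, as the docstring notes.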
@@ -1476,6 +1491,8 @@ def _walk(self, topdown, on_error, follow_links):

                 if is_dir:
                     dirs.append(entry.name)
+                    walk_dirs_map[entry.name] = entry
+
                 else:
                     nondirs.append(entry.name)
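
The caching added in this hunk can be pictured as a name-to-entry map built during the directory scan. The following is a rough sketch of that idea using `os.scandir` directly, not the PR's code: `scan_directories` and `entries_by_name` are hypothetical names, and the private `Path._scandir()` helper is deliberately not used.

```python
import os
import tempfile
from pathlib import Path


def scan_directories(path):
    """Return (dir_names, entries_by_name) for the subdirectories of `path`."""
    dir_names = []
    entries_by_name = {}  # plays the role of walk_dirs_map in the hunk above
    with os.scandir(path) as scandir_it:
        for entry in scandir_it:
            # By default is_dir() follows symlinks, so a link to a directory counts.
            if entry.is_dir():
                dir_names.append(entry.name)
                entries_by_name[entry.name] = entry
    return dir_names, entries_by_name


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "real_dir").mkdir()
    (Path(tmp) / "file.txt").write_text("data")
    dir_names, entries_by_name = scan_directories(tmp)
    # DirEntry.is_symlink() usually answers from data gathered during the scan.
    link_flags = {name: entries_by_name[name].is_symlink() for name in dir_names}

print(link_flags)
```

One consequence of a plain dictionary lookup keyed by name is that a name the caller inserts into the list after the scan has no cached entry, so indexing the map with it would raise `KeyError`.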

@@ -1497,16 +1514,16 @@ def _walk(self, topdown, on_error, follow_links):
                     if walk_into:
                         walk_dirs.append(entry)
         if topdown:
-            for raw_new_path in dirs:
-                # Path.is_symlink() is used instead of caching entry.is_symlink()
-                # result during the loop on Path._scandir() because the caller
-                # can replace the directory entry during the "yield" above.
-                new_path = self._make_child_relpath(raw_new_path)
-                if follow_links or not new_path.is_symlink():
+            yield self, dirs, nondirs
+
+            for dir_name in dirs:
+                new_path = self._make_child_relpath(dir_name)
+
+                if follow_links or not walk_dirs_map[dir_name].is_symlink():
                     yield from new_path._walk(topdown, on_error, follow_links)
         else:
-            for raw_new_path in walk_dirs:
-                new_path = self._make_child_relpath(raw_new_path.name)
+            for dir_entry in walk_dirs:
+                new_path = self._make_child_relpath(dir_entry.name)
                 yield from new_path._walk(topdown, on_error, follow_links)
             # Yield after recursion if going bottom up
             yield self, dirs, nondirs
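
The two branches above differ only in whether a directory is yielded before or after its children. A small sketch of the resulting order, again using `os.walk(topdown=...)` as a stand-in on the assumption that it behaves the same way; `visit_order` is a hypothetical helper written for this illustration.

```python
import os
import tempfile
from pathlib import Path


def visit_order(top, topdown):
    """Return directory paths relative to `top` in the order they are yielded."""
    top = Path(top)
    return [
        Path(dirpath).relative_to(top).as_posix()
        for dirpath, _dirnames, _filenames in os.walk(top, topdown=topdown)
    ]


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "a" / "b").mkdir(parents=True)
    top_down = visit_order(tmp, topdown=True)
    bottom_up = visit_order(tmp, topdown=False)

print(top_down)   # ['.', 'a', 'a/b']  (parents first)
print(bottom_up)  # ['a/b', 'a', '.']  (children first)
```

In the bottom-up order a directory's list of children has already been consumed by the time the caller sees it, which is why editing that list cannot prune the walk.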