10000 MAINT: Simplify block implementation by j-towns · Pull Request #9667 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

MAINT: Simplify block implementation #9667

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 23 commits into from
Nov 12, 2017
Merged
Changes from 1 commit
Commits
Show all changes
23 commits
Select commit Hold shift + click to select a range
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
Update block docstrings
  • Loading branch information
j-towns committed Oct 12, 2017
commit bd6729d0d1a6004360f44524a477524ccdc96f17
18 changes: 14 additions & 4 deletions numpy/core/shape_base.py
Original file line number Diff line number Diff line change
Expand Up @@ -378,9 +378,12 @@ def _block_check_depths_match(arrays, parent_index=[]):
The parameter `parent_index` is the full index of `arrays` within the
nested lists passed to _block_check_depths_match at the top of the
recursion.
The return value is the full index of an element (specifically the
first element) from the bottom of the nesting in `arrays`. An empty
list at the bottom of the nesting is represented by a `None` index.
The return value is a pair. The first item returned is the full index
of an element (specifically the first element) from the bottom of the
nesting in `arrays`. An empty list at the bottom of the nesting is
represented by a `None` index.
The second item is the maximum of the ndims of the arrays nested in
`arrays`.
"""
def format_index(index):
idx_str = ''.join('[{}]'.format(i) for i in index if i is not None)
Expand Down Expand Up @@ -423,6 +426,13 @@ def format_index(index):


def _block(arrays, max_depth, result_ndim):
"""
Internal implementation of block. `arrays` is the argument passed to
block. `max_depth` is the depth of nested lists within `arrays` and
`result_ndim` is the greatest of the dimensions of the arrays in
`arrays` and the depth of the lists in `arrays` (see block docstring
for details).
"""
def atleast_nd(a, ndim):
# Ensures `a` has at least `ndim` dimensions by prepending
# ones to `a.shape` as necessary
Expand All @@ -436,7 +446,7 @@ def block_recursion(arrays, depth=0):
return _nx.concatenate(arrs, axis=-(max_depth-depth))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This now reflects the wording of the docstring quite closely:

Blocks in the innermost lists are concatenated (see `concatenate`) along
the last dimension (-1), then these are concatenated along the
second-last dimension (-2), and so on until the outermost list is reached.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't -(max_depth-depth) just be (depth-max_depth)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also slight personal preference to having spaces around binary operators for readability.

Copy link
Contributor Author
@j-towns j-towns Oct 31, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't -(max_depth-depth) just be (depth-max_depth)?

It would, yeah. The reason I wrote it that way round was because I wanted to match the docstring as closely as possible (my original motivation for this pr was to make it a bit clearer to people reading the code how block actually worked). The docstring says:

Blocks in the innermost lists are concatenated (see concatenate) along the last dimension (-1), then these are concatenated along the second-last dimension (-2), and so on until the outermost list is reached.

To me the correspondence between the docstring and the code is ever-so-slightly clearer with the expression the way round that it currently is, and I think the effect on performance is probably negligable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also slight personal preference to having spaces around binary operators for readability.

I often instinctively shorten things when they're being fed to a keyword argument, because you don't normally put spaces around the = sign. Don't

axis=-(max_depth - depth)

and

axis=depth - max_depth

look a bit weird?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The style point is minor. Though there are many tools (e.g. flake8) that consider not having spaces around binary operators an error regardless of context. Normally this is argued from the standpoint of readability (including in PEP8 when it was introduced). Hence why I mentioned it. Adding parentheses around the keyword argument's value help guide the eye when following these style tools' recommendations. Not strongly attached to this styling nit though.

else:
# We've 'bottomed out' - arrays is either a scalar or an array
# depth == max_depth
# type(arrays) is not list
return atleast_nd(arrays, result_ndim)

return block_recursion(arrays)
Expand Down
0