10000 ENH: np.dot: better "matrices not aligned" message by larsmans · Pull Request #4985 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

ENH: np.dot: better "matrices not aligned" message #4985

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

larsmans
Copy link
Contributor

This scratches a long-standing itch of mine, which is that np.dot's "matrices not aligned" message never explains which of the two arguments I forgot to transpose somewhere deep inside an algorithm. This patch makes it report the mismatching pair of dimensions.

It might be even prettier to report the full shape of both inputs, but I think this is a big enough improvement for now.

@juliantaylor
Copy link
Contributor

thats very good change, it would be even better if it also reports the axis of the mismatch not just the sizes of the mismatch

@larsmans
Copy link
Contributor Author

Oh, this is probably going to conflict with #4969...

@juliantaylor
Copy link
Contributor

I'd give this PR precedence, this is simple and useful we should probably also add it to 1.9

not_aligned(Py_ssize_t m, Py_ssize_t n)
{
PyErr_Format(PyExc_ValueError, "matrices are not aligned: %zd != %zd",
m, n);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe as message:

 matrices are not aligned: shape[%d] (%zd) != shape[%d] (%zd)

it should mention shape, because its not obvious what aligned even means

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, I'll see if I can hack this up.

@charris
Copy link
Member
charris commented Aug 23, 2014

Definitely a conflict with #4969, as _dotblas is no more after that PR ;) But an easy fix.

@larsmans
Copy link
Contributor Author

Ok, put in the improvements suggested by @juliantaylor.

@juliantaylor
Copy link
Contributor

multiarraymodule.c:976 has another instance of the message but with objects instead of matrix

@larsmans
Copy link
Contributor Author

Hm. That makes me want to factor the message out there as well. Should I put the function in multiarray/common.c?

@charris
Copy link
Member
charris commented Aug 23, 2014

@larsmans That would be OK with me. #4969 also adds some stuff there...

@juliantaylor
Copy link
Contributor

hm I guess one could put it there
the output shape error new_array_for_sum could also be improved for the out= argument. the wrong type error message is ok as thats easier for the user to figure out.

can you please also run this command on the branch to allow easy merge to 1.9 (assuming origin is remote name main numpy repo):

git rebase --onto $(git merge-base origin/maintenance/1.9.x HEAD^) HEAD^

@larsmans
Copy link
Contributor Author

@charris I saw you moved some of my code there, that's why I asked :)

@juliantaylor And then force-push?

@juliantaylor
Copy link
Contributor

yes

@larsmans
Copy link
Contributor Author

It looks like _dotblas.so isn't linking to the library that contains common.o, I'm getting a linker error.

@juliantaylor
Copy link
Contributor

urg, jsut duplicate the function, not the end of the world. we can clean it up in the reorg PR which does buildsystem changes anyway

@juliantaylor
Copy link
Contributor

putting it in common.h would be an option

@charris
Copy link
Member
charris commented Aug 23, 2014

Actually, that's where I put my functions...

@larsmans
Copy link
Contributor Author

Ok, it's in common.h and I rebased onto the 1.9.x branch.

@njsmith
Copy link
Member
njsmith commented Aug 23, 2014

I prefer an orange bikeshed, and also for this error message to just print the full shape of both arguments, because often this makes it obvious what happened (e.g. if I try to dot arrays with shapes (10, 3) and (10, 7), then obviously I forgot to transpose the first, but this is not so clear if the message just says 3 != 10).

@juliantaylor
Copy link
Contributor

it does now say:

shape[1] (3) != shape[0] (10)

additionally printing the full shapes could be useful but its also quite easy to just print them from the input arrays.

@njsmith
Copy link
Member
njsmith commented Aug 23, 2014

It's often not so easy for the user to print the original shapes - this
might require modifying and rerunning a slow script, or figuring out how to
run pdb on a webserver, etc.

It's not the most important issue ever, but I do think it would be better
for our error message to just print the full shapes.
On 23 Aug 2014 18:48, "Julian Taylor" notifications@github.com wrote:

it does now say:

shape1 != shape0

additionallz printing the full shapes could be useful but its also quite
easy to just print them from the input arrays.


Reply to this email directly or view it on GitHub
#4985 (comment).

@larsmans
Copy link
Contributor Author

One orange bikeshed for mr. @njsmith:

>>> x = np.zeros((10, 2))
>>> np.dot(x, x)
Traceback (most recent call last):
  File "<ipython-input-8-7170710dec0d>", line 1, in <module>
    np.dot(x, x)
ValueError: matrices of shapes (10, 2) and (10, 2) not aligned: 2 != 10

>>> x = np.zeros((10, 2, 4, 3, 7))
>>> np.dot(x, x)
Traceback (most recent call last):
  File "<ipython-input-10-7170710dec0d>", line 1, in <module>
    np.dot(x, x)
ValueError: matrices of shapes (10, 2, 4, 3, 7) and (10, 2, 4, 3, 7) not aligned: 7 != 3

}
else {
PyErr_Format(PyExc_ValueError,
"matrices of shapes %s and %s not aligned: %zd != %zd",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall I change this to "arrays" instead of "matrices"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or just drop matrix of

@juliantaylor
Copy link
Contributor

I'd still keep the shape index as in the original, that way its clear which axis is the problem in case there a duplicates.

@larsmans
Copy link
Contributor Author

d3c2a74fd43ce8bb9c9c06176473554216fd74d0 reuses convert_shape_to_string (at the expense of readability) and improves the error message:

>>> x = np.zeros((10, 2))
>>> np.dot(x, x)
Traceback (most recent call last):
  File "<ipython-input-5-7170710dec0d>", line 1, in <module>
    np.dot(x, x)
ValueError: shapes (10,2) and (10,2) not aligned: 2 (dim 1) != 10 (dim 0)

>>> x = np.zeros((10, 2, 4, 3, 7))
>>> np.dot(x, x)
Traceback (most recent call last):
  File "<ipython-input-6-7170710dec0d>", line 1, in <module>
    np.dot(x, x)
ValueError: shapes (10,2,4,3,7) and (10,2,4,3,7) not aligned: 7 (dim 4) != 3 (dim 3)

Please review, I'm calling it quits: it's midnight over here.

@juliantaylor
Copy link
Contributor

how to turn a 20 line diff into a 200 line diff by adding redundant information ...

would possibly be simpler by decoding the pystring to ascii, but this is fine too.
it adds a bunch of warnings though:

numpy/core/src/multiarray/common.h:232:9: warning: implicit declaration of function ‘PyUString_FromFormat’ [-Wimplicit-function-declaration]
numpy/core/src/multiarray/common.h:232:9: warning: return makes pointer from integer without a cast [enabled by default]
numpy/core/src/multiarray/common.h:235:13: warning: assignment makes pointer from integer without a cast [enabled by default]
numpy/core/src/multiarray/common.h:243:13: warning: implicit declaration of function ‘PyUString_FromString’ [-Wimplicit-function-declaration]
numpy/core/src/multiarray/common.h:243:17: warning: assignment makes pointer from integer without a cast [enabled by default]
numpy/core/src/multiarray/common.h:246:17: warning: assignment makes pointer from integer without a cast [enabled by default]
numpy/core/src/multiarray/common.h:253:9: warning: implicit declaration of function ‘PyUString_ConcatAndDel’ [-Wimplicit-function-declaration]
numpy/core/src/multiarray/common.h:260:13: warning: assignment makes pointer from integer without a cast [enabled by default]
numpy/core/src/multiarray/common.h:263:13: warning: assignment makes pointer from integer without a cast [enabled by default]
numpy/core/src/multiarray/common.h:282:12: warning: assignment makes pointer from integer without a cast [enabled by default]
numpy/core/src/multiarray/common.h:305:5: warning: implicit declaration of function ‘PyUString_Format’ [-Wimplicit-function-declaration]
numpy/core/src/multiarray/common.h:305:12: warning: assignment makes pointer from integer without a cast [enabled by default]

@juliantaylor
Copy link
Contributor

fwiw I would prefer to revert back to the simpler version which I want in 1.9
the more fancy versino can be done for 1.10 after the reorg branch is merged. that also saves the header dance as dotblas goes away.

@larsmans
Copy link
Contributor Author

@juliantaylor I tried the ASCII conversion at first but it was actually more complicated than this. I suggest that I:

Do you guys all agree with a modified 818398df1e484dd9f9bb9a564466c1ca82bdcdc6 (so with the format_shapes function) that additionally points out the offending dimensions' indices? The rebased version would then not include that function, of course.

@juliantaylor
Copy link
Contributor

I'm for either everything as is (with warnings fixed) or the really simple one we had in the beginning that only updated the error message with no fancy shape printing.

I had to move convert_shape_to_string to common.h so that _dotblas.c
can access it.
@larsmans
Copy link
Contributor Author

Ok, I squashed the last two commits to get rid of the intermediate version and included npy_3kcompat.h to get rid of the warnings.

juliantaylor added a commit that referenced this pull request Aug 24, 2014
ENH: np.dot: better "matrices not aligned" message
@juliantaylor
Copy link
Contributor

thanks, pushed to 1.9, I saw you already opened a PR onto charris branch, so we don't need to merge this into master.

@larsmans larsmans deleted the dot-errmsg branch August 24, 2014 17:43
@larsmans
Copy link
Contributor Author

Thanks!

juliantaylor added a commit to juliantaylor/numpy that referenced this pull request Sep 4, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants
0