XLA's Dot should follow broadcast semantics from np.matmul, not np.dot #5523


Closed
shoyer opened this issue Nov 10, 2016 · 3 comments
Labels: type:feature (Feature requests)

shoyer (Contributor) commented Nov 10, 2016

I notice that the XLA Dot operation copies "outer-product style" broadcast semantics from numpy.dot:

Input                                      Output                  Semantics
array [p x q x r] dot array [s x r x t]    array [p x q x s x t]   array dot product (read below)

In brief, I think this is a mistake. It would be better to follow the "matmul style" broadcasting semantics of Python's @ operator and NumPy's matmul.

matmul's broadcasting is much more general and, in my opinion, also easier to understand. For example, it can do batch matrix multiplication, but it can still do outer-product-style broadcasting if you insert dummy dimensions of length 1 (though the axes end up in a different order), e.g.:
batch matmul: [p x q x r] matmul [p x r x t] -> [p x q x t]
outer product matmul: [p x 1 x q x r] matmul [1 x s x r x t] -> [p x s x q x t]
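
To make the shape behavior concrete, here is a minimal NumPy sketch of both styles; the concrete sizes are arbitrary placeholders:

```python
import numpy as np

p, q, r, s, t = 2, 3, 4, 5, 6

# np.dot uses "outer product" broadcasting: the result pairs every
# batch of the first argument with every batch of the second.
a = np.ones((p, q, r))
b = np.ones((s, r, t))
assert np.dot(a, b).shape == (p, q, s, t)

# np.matmul broadcasts leading dimensions like elementwise ops,
# giving batch matrix multiplication.
a = np.ones((p, q, r))
b = np.ones((p, r, t))
assert np.matmul(a, b).shape == (p, q, t)

# Outer-product-style behavior is still available with matmul by
# inserting dummy dimensions of length 1; note the axes come out in
# a different order than with np.dot.
a = np.ones((p, 1, q, r))
b = np.ones((1, s, r, t))
assert np.matmul(a, b).shape == (p, s, q, t)
```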

If we could go back in time as NumPy developers, we would assuredly change dot to work this way (we cannot now, because of backwards-compatibility concerns). So it would be nice to change this for XLA before the behavior is locked in.

sherrym (Contributor) commented Nov 10, 2016

@cwhipkey, @andydavis1, and @prb12, could you please comment on this? Thanks.

eliben (Contributor) commented Nov 10, 2016

@shoyer, thanks for the suggestion; we'll think about it.

eliben (Contributor) commented Nov 21, 2016

Based on this input and other considerations, we've decided to restrict the semantics of XLA's Dot operation to 1D and 2D arrays in the initial release. We may consider expanding it to higher dimensions in the future, and at that point we'll be considering different possible semantics. For the time being, however, this issue can be closed.

Thanks for the feedback!
