A new PEP for infix matrix multiplication #4351
> rejected by BDFL fiat, this would be using a sledgehammer to smash
> a fly. There is a strong consensus in the scientific python
> community that matrix multiplication really is the only missing
> infix operator that matters enough to bother about. (In
I would really like to see && and || for the element-wise logical operations; faking them with the bitwise operators and parentheses is ugly.
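The ugliness being described can be seen in a short sketch (my own illustration, not from the thread): because `and`/`or` cannot be overloaded, NumPy users fall back on the bitwise operators, which need defensive parentheses since `&` and `|` bind more tightly than comparisons.

```python
# Sketch: element-wise logical AND on arrays today requires either
# bitwise operators plus parentheses, or the function-call spelling.
import numpy as np

a = np.array([1, -2, 3])
b = np.array([1, 2, -3])

ok = (a > 0) & (b > 0)               # parentheses required
# Without them, `a > 0 & b > 0` parses as `a > (0 & b) > 0`.
same = np.logical_and(a > 0, b > 0)  # function-call alternative

assert list(ok) == [True, False, False]
assert list(same) == list(ok)
```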
The overloadable boolean ops PEP was recently rejected: https://mail.python.org/pipermail/python-dev/2012-March/117510.html
I think it's best to not divert attention to that topic, but just focus on the matmul operation.
👍 Well written, and I'd be for it (for what that's worth); plus it'd be a big incentive to move to Python 3 and reduce the 2/3 fragmentation (even if the libraries themselves couldn't use those operators because of compatibility issues).
Hey @njsmith, we had some discussions ages ago at SciPy'07 or '08, and I did my best to summarize them here. That stuff is in an old bazaar repo, but you can get the reST sources here. I never really followed this up beyond summarizing the conversations in that document. Just mentioning it here in case any of that ancient history is useful; feel free to ignore it in light of the last 5-6 years' worth of experience :)

BTW, there's a trivial makefile too that I was using to make rendered versions of the rst sources; it would be handy to have such a view somewhere, as it does make it much nicer to read.
> The main motivation for this PEP is the addition of a binary operator
> ``@`` for matrix multiplication. No-one cares terribly much about the
> matrix power operator ``@@`` -- it's useful and well-defined, but not
I'd make the discussion of `@@` much less prominent. It can be mentioned somewhere toward the end, but I think it should not be proposed in the PEP.
I agree. Including `@@` in this PEP is a distraction which makes it marginally more likely to be rejected. In my experience, matrix powers with arbitrary bases are rare (usually the matrix exponential is preferred).
I moved the motivation of `@@` to the end, but my feeling is that it can't hurt and might help. Remember, we're not trying to ask the BDFL for some favor -- we're helping him see the facts about how much more elegant and nice Python will become after making these changes. Suggesting that we have `@`, `*`, and `**` but not `@@` is actually asymmetric and inelegant.

And anyway, if he likes one half of a PEP but not the other, then he'll just tell us to change it. It's still clear from the motivation that we'd be totally happy with just `@`.
+1 from me. Nice work. Thanks for putting this together.

This leaves out things like cross product and matrix division. What about using something like @ as a prefix for an operator to define it as a matrix operation? So @, @/, @__. You could then use @@ for the "reverse" version of that operator, so @@ would be cross product and @@/ would be back division.

@toddrjen: the suggestion of using a prefix for multiple operators is similar to PEP-225, and it has not made progress. Also, I think cross product and matrix division are lower priority than matrix multiplication.

Probably, but I would say matrix exponent is below all of those in importance. I think the argument for including @@ is applicable to other mathematical operators as well, and I would say they are far more useful. So I would either cut out @@, or better justify including it but not any of the other operators.
> more often than ``//`` or other integer operations.
But isn't it weird to add an operator with no stdlib uses?
There is precedent for adding syntax to Python not used (to my knowledge) in the stdlib, in the form of `x[...]`, which is equivalent to `x[Ellipsis]`.
Good point, added a mention of that.
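For readers unfamiliar with that precedent, here is a tiny illustration (my own; the `Probe` class is invented) of how the `...` literal reaches `__getitem__` just like any other key:

```python
# The literal `...` is the singleton Ellipsis; indexing with it simply
# passes that singleton to __getitem__.
class Probe:
    def __getitem__(self, key):
        return key

p = Probe()
assert p[...] is Ellipsis        # x[...] passes the Ellipsis singleton
assert p[...] is p[Ellipsis]     # identical to spelling it out
```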
It will be helpful to include a survey of what other languages use. For example, there is a precedent for using ...

@fperez You can view a ...

Here's the link to the rendered file: https://github.com/njsmith/numpy/blob/matmul-pep/doc/neps/return-of-revenge-of-matmul-pep.rst
> When the going gets tough, the tough get empirical. To get a rough
> estimate of how useful the ``@`` operator will be, this table shows
> the rate at which different Python operators are used in the stdlib,
> and also in two high-profile numerical projects -- the sklearn machine
It's called scikit-learn, not sklearn (that's just the package name).
Thanks, fixed.
In addition to the APL, Kx, and R syntax given by @abalkin above, I skimmed some from Rosetta Code:
I'd say APL's use of non-ASCII characters was historically unpopular. IDL's hash clashes with Python's comment. J uses an ASCII reworking of APL, I think, so it's got a much more elaborate idea of combining multiple infix operators together; powerful, but perhaps not really Pythonic. The dot of Mathematica and Maxima is nice but clashes with Python's object attribute/method/module accessor. Julia, Octave, Pari/GP, and TI-83 BASIC's star is nice but would be backward-incompatible with NumPy's elementwise multiplication; Ruby's star relies on the subclass idea eloquently rebutted by the original poster. R's and K's multicharacter things are pretty ugly.
> * The mATrices mnemonic is cute.
> * The swirly shape is reminiscent of the simultaneous sweeps over rows
>   and columns that define matrix multiplication.
Another mnemonic motivation for the at-sign: it slightly resembles the centred dot (LaTeX \cdot) often used in printed mathematics to represent an algebraic product.
Good point, added.
I would like to see a more complete examination of np.Matrix before ... There are perhaps warts with np.Matrix; can these be identified and ...?

Colin W.

On 15-Mar-2014 3:33 PM, Robert Bradshaw wrote:
@cjw43 There have been several attempts over the years. The PEP acknowledges that history without attempting to recapitulate it, proposal for proposal. The PEP is not the beginning of the examination of the problems with a two-type solution, but the culmination of many years of experience with them (yes, plural). The past threads on the subject can be found in the mailing list archive. By all means, please make concrete proposals if you have them. In the absence of concrete proposals, I think we have a reasonable case not to hold out hope that a good one will show up after all of these years.
Every user of a library that hijacks these operators needs to be aware of both the hijacked and original meanings. Anybody who is used to a language where the power operator is written '^' gets painfully acquainted with what '^' actually does in Python.

Any new operator increases the total complexity of the language, takes valuable developer time to implement, uses up some memory in the interpreter, etc. There are costs. They are small, but they affect everyone in the Python ecosystem. Given that the benefits will only be felt by a small subset of the Python community, they had better be significant.
On Mar 19, 2014 5:13 AM, "Ronan Lamy" notifications@github.com wrote:
What specific confusions are likely with this proposal? Yes, in a general sense that could be an issue with new operators, but ...
First, I don't think the scientific community is a small subset; I think it ...

Second, you need to demonstrate that these changes will be anything other ...

Third, in terms of developer time, you can't just look at the time needed ...
@rlamy I think it's worth noting that no one on python-ideas, except perhaps yourself, has objected to the PEP on these grounds or asked us to do more soul searching about whether we really need the operator. These are people most of whom will never use matrix multiplication. It's nice to be concerned for them, but they can speak for themselves, and they have spoken in favor of the PEP.

@toddrjen Implementing ...

@pchanial They are suggesting left precedence and asking for compelling arguments against it on the mailing list.

I've just added text for "same-left" associativity/precedence.
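As an illustration of what "same-left" grouping means in practice (my own sketch; the `Expr` class is invented, and this needs a Python with the `@` operator, i.e. 3.5+): with left associativity, `a @ b @ c` must evaluate as `(a @ b) @ c`, just like `*`.

```python
# A toy class that records the grouping the parser produces for `@`.
class Expr:
    def __init__(self, name):
        self.name = name
    def __matmul__(self, other):
        return Expr("({} @ {})".format(self.name, other.name))

a, b, c = Expr("a"), Expr("b"), Expr("c")
assert (a @ b @ c).name == "((a @ b) @ c)"   # left-associative
```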
> converting back and forth all the time, is incredibly cumbersome and
> impossible to get right at any scale. Functions that defensively try
> to handle both types as input and DTRT, find themselves floundering
> into a swamp of ``isinstance`` and ``if`` statements.
Actually, handling both types as input is not so hard if you do it like scikit-learn, where matrices may come in but never come out. The trouble starts when you also produce np.matrix when you get it as input.
The point is still that libraries have to be aware of this double-class convention; otherwise only the broken sub-class solution works (or doesn't).
Dealing with `matrix` in this fashion is not really an additional overhead to allowing lists or other sequences as inputs: you just always use `np.asarray()` on the inputs. If everyone did this, though, `matrix` would be useless, since it would disappear whenever you passed one to a function. scikit-learn accepts this; it doesn't care about supporting `matrix` users except to transparently consume their data. It also doesn't have many functions that one would really use in expressions, so it's not much loss for `matrix` users. But there are a lot of other functions that do need to output `matrix` objects to be useful to `matrix` users, and this is difficult, so few people do it right. That's why the status quo sucks. You can choose between having an operator for matrix multiplication by using the `matrix` type, or having the full range of the community's code available to you. You can't have both.

And just to be clear, there is nothing about this situation that depends on the implementation details of `numpy.matrix`. Subclass or not, all of these issues remain as long as you have two types that you want to interoperate.
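A minimal sketch of the `np.asarray()` convention described above (my own illustration; `colsum` is an invented stand-in for a library function): coerce on the way in, and note that any matrix-ness disappears on the way out.

```python
import numpy as np

def colsum(A):
    """Hypothetical library function: sum each column."""
    A = np.asarray(A)        # lists, ndarrays, and matrices all accepted
    return A.sum(axis=0)

data = [[1, 2], [3, 4]]
assert list(colsum(data)) == [4, 6]            # plain nested lists work
assert list(colsum(np.array(data))) == [4, 6]  # ndarray works
out = colsum(np.matrix(data))                  # a matrix comes in...
assert type(out) is np.ndarray                 # ...but an ndarray comes out
```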
@rkern I agree with you. What I was really trying to point out is that DTRT is ambiguous in this context, because to a new user there may seem to be two "obvious" ways of "doing it right".
Point.
Ah, I see. That sentence was meant as an alternative to the previous one, i.e., the first point is that keeping track of which functions expect what is impossible, so the obvious alternative is to write functions in such a way that you don't need to keep track, but that is also impossible...
On Fri, Apr 11, 2014 at 1:27 PM, Lars Buitinck notifications@github.com wrote:

> @rkern I agree with you. What I was really trying to point out is that DTRT is ambiguous in this context, because to a new user there may seem to be two "obvious" ways of "doing it right".

Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
On 11-Apr-2014 7:39 AM, Stefan van der Walt wrote:

> The point is still that libraries have to be aware of this double-class convention, otherwise only the broken sub-class solution works (or doesn't).

Stefan,

Thanks for this clarification.

You say "Code that expects an ``ndarray`` and gets a ``matrix``, or vice-versa, may crash or return incorrect results."

I would look at this as an error in the existing code. Would it not be simpler to correct this than construct code for a new operator? Has this been explored? It's been a long time since I looked at the code.

It would help if some example of the need for "converting back and forth" could be given.

Colin W.
On Fri, Apr 11, 2014 at 2:43 PM, Colin J. Williams notifications@github.com wrote:

> I would look at this as an error in the existing code. Would it not be simpler to correct this than construct code for a new operator? Has this been explored?
No, it wouldn't be simpler. Fixing every piece of code to handle both types
as inputs and outputs means that every function must have special checks,
which creates a combinatorial explosion in code complexity and is a serious
obstacle to maintainability, testing, etc.
This is why if you look at actual code there are few (possibly zero)
libraries that actually handle mixed ndarray and matrix objects correctly
(even numpy itself doesn't in many cases) -- it's just not doable. Everyone
prefers to use ndarray objects with the cumbersome dot() function call
syntax for matrix multiplication. It's annoying, but it's still better than
checking that every single code path can handle arrays and matrices (and
combinations thereof etc). (In fact, there are some places where
scikit-learn is forced to handle matrix objects because scipy.sparse only
provides a matrix-style API, and even in these cases they don't actually
use '*' for matrix multiplication -- which is the point that started this
whole thread. Instead they have a special safe_sparse_dot() function which
provides a dot()-style API that works for both arrays and matrices.)
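For concreteness, a rough sketch of the idea behind `safe_sparse_dot` (my own reconstruction, not scikit-learn's actual code; the duck-typed sparse check stands in for `scipy.sparse.issparse`): sparse matrices spell matrix multiplication as `*`, while plain ndarrays need `np.dot()`, so a helper hides the difference.

```python
import numpy as np

def safe_dot(a, b):
    # scipy.sparse matrices expose toarray(); treat those as sparse.
    if hasattr(a, "toarray") or hasattr(b, "toarray"):
        return a * b        # sparse types overload * as matrix multiply
    return np.dot(a, b)     # dense path

x = np.array([[1, 2], [3, 4]])
y = np.array([[5], [6]])
assert safe_dot(x, y).tolist() == [[17], [39]]
```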
@cjw43 Here's a library's code snippet for squaring the elements of an input array:

```python
def square_it(A):
    return A * A
```

That code is now broken, even though the library author has never even heard of a Matrix class. The new, correct code becomes:

```python
def square_it(A):
    A = np.asarray(A)
    return A * A
```

That's not a problem as long as all consumers out there are aware of this possibility, but you are forcing a convention on programmers you don't control, and one which is auxiliary to their intention. Contrast that to the following library code:

```python
def square_it(A):
    return A * A
```

We know what this does: it squares the elements of the array.

```python
def square_it(A):
    return A @ A
```

We know what this does: it performs a dot product between the two matrices.
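For anyone wondering how `@` reaches user code: it dispatches through the `__matmul__` hook, analogous to `__mul__`. A deliberately tiny, invented `Mat2` class shows both behaviors side by side (real code would of course use numpy):

```python
# A minimal 2x2 matrix type, purely to demonstrate the @ protocol.
class Mat2:
    def __init__(self, a, b, c, d):
        self.m = (a, b, c, d)       # row-major: [[a, b], [c, d]]
    def __mul__(self, other):       # elementwise, like ndarray
        return Mat2(*(x * y for x, y in zip(self.m, other.m)))
    def __matmul__(self, other):    # matrix product, via the @ hook
        (a, b, c, d), (e, f, g, h) = self.m, other.m
        return Mat2(a*e + b*g, a*f + b*h, c*e + d*g, c*f + d*h)

A = Mat2(1, 2, 3, 4)
assert (A * A).m == (1, 4, 9, 16)    # squares the elements
assert (A @ A).m == (7, 10, 15, 22)  # matrix product
```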
This is probably a day late and a dollar short (as my Mom used to say), but has any serious consideration ever been given to defining dot (".") as an inner product operator? That's the way that APL does it, but every time it's brought up, it is dismissed out of hand. Yes, it would give an additional, different meaning to ".", but "%" is used for very different purposes without too much confusion. Using the string "+.*" in Python code is currently a syntax error, so we don't have to worry about breaking pre-existing code. I envision A+.*B as syntactic sugar for an inner_product(operator.add, operator.mul) call, where inner_product is defined (for simple sequence objects) in the obvious way.

The compiler would be allowed to instantiate +.* only once per sequence type, no matter how many times it appeared in the code; however, non-standard sequence types (such as numpy's ndarrays) would be able to provide alternative implementations which would only be used by those types. (I envision a dict attached to the PySequenceMethods struct that would cache, for example, inner_product(operator.add, operator.mul) under the key "+.*".)
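Since the snippets that defined `inner_product` appear to have been lost in transcription, here is one plausible reconstruction (my own, under the assumptions stated in the comment): apply the second operator pairwise, then fold with the first.

```python
import operator
from functools import reduce

def inner_product(f, g):
    """Hypothetical meta-operator: f.g applied to two sequences."""
    def apply(a, b):
        return reduce(f, map(g, a, b))   # fold f over pairwise g
    return apply

# A +.* B would desugar to inner_product(operator.add, operator.mul)(A, B):
dot = inner_product(operator.add, operator.mul)
assert dot([1, 2, 3], [4, 5, 6]) == 32   # ordinary dot product

# The APL-style <.and example mentioned later in the thread:
smaller = inner_product(lambda x, y: x and y, operator.lt)
assert smaller([1, 2], [3, 4]) is True   # all pairwise less-than
```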
@samwyse, I am not sure how this could be an option.
Hi Sam, I think the challenge you'd face if you tried proposing this to ... -n

On Mon, Apr 21, 2014 at 5:39 PM, Sam Denton notifications@github.com wrote:

Nathaniel J. Smith
@seberg: I'm proposing "." as a meta-operator, which would not be used by itself. That particular character is currently used to access attributes, and also to separate the integral and fractional parts of floating point numbers. This would be a third use. |
@njsmith: Way back when I learned APL, I recall lots of ways that f.g could be used besides matrix multiplication. For example, '<.&' would tell you if every number in a list was smaller than the corresponding number in another list. That said, my use of '&' in my example (as a logical and) has different semantics from Python's definition ('&' is bitwise and). Really, you'd want to use '<.and', which I suspect would really only be parsable because of 'and' being a keyword; and yes, Python already allows 'all(x < y for x, y in zip(a, b))', which means the same thing (albeit not as concisely). Iverson provides several examples, but they tend to use APL's 'max' and 'min' operators, which in Python aren't even keywords, and so would look to the compiler a lot like attribute access; I suspect that a proposal to implement GCC's minimum and maximum operators ('<?' and '>?') would be met with cat-calls.
I think a relevant data point here is that back in the day, Guido was ...

On Mon, Apr 21, 2014 at 8:17 PM, Sam Denton notifications@github.com wrote:

Nathaniel J. Smith
@cjw43 There is no "destruction of the Class" to be discussed in the PEP. If you are talking about the possible deprecation of the `matrix` class, ...

Please, please, please use the Github interface for commenting, or else trim your email replies to just your content.

But as the PEP has already been accepted, continuing to press this particular point does not appear to be a good use of anyone's time. The decisionmakers in the core Python dev team consider the amount of discussion in the PEP as it stands perfectly adequate.
@njsmith Ready to merge?
Oh right, this PR. Sure, I guess that makes sense to save the history, though I suppose the ...
History saved! Thanks for that.

Super late to the party, but I'd like to cross-reference JuliaLang/LinearAlgebra.jl#42 here, where it looks like very similar discussions have taken place in the context of how to define vector transposition.
Re the use of a potential @@ operator: in a lot of mathematical contexts, we actually consider exp(A) for some matrix A, where exp(A) = I + A + A@@2/2! + A@@3/3! + A@@4/4! + ..., continuing like the Taylor series. Other useful contexts for the @@ operator exist, but it is definitely specifically for math people and may be noise to most non-scientific programmers.
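The series above can be written out today with `np.linalg.matrix_power` standing in for the proposed `@@` (my own illustration; truncating the Taylor series like this is fine for small, well-behaved matrices, but it is not how production matrix-exponential routines work):

```python
# exp(A) = I + A + A^2/2! + A^3/3! + ..., truncated after `terms` terms.
import math
import numpy as np

def expm_taylor(A, terms=20):
    return sum(np.linalg.matrix_power(A, k) / math.factorial(k)
               for k in range(terms))

A = np.diag([1.0, 2.0])              # diagonal, so exp(A) is easy to check
E = expm_taylor(A)
assert np.allclose(E, np.diag([math.e, math.e ** 2]))
```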
On 30 January 2016 at 17:12, owendavey notifications@github.com wrote:
In computer graphics, one multiplies matrices a lot. Considering that this ...

@dimpase - I was not referring to matrix multiplication. I was referring to matrix exponentiation - hence the double @ operator, which above was considered not important enough to add alongside the single @ matrix multiplication operator in this PEP.

The discussion here has ended. The PEP has been accepted and implemented. If you would like to propose or discuss the merits of further changes, please find an appropriate mailing list. Thanks.
This is now PEP 465, and a properly rendered version can be found here.
A poorly rendered version of the latest draft can be found here.
Let's use this PR as a central place to discuss this draft PEP for adding an infix matrix multiplication operator to numpy. The PR format is nice in that it allows line by line comments, updating, etc.
Hopefully this goes without saying, but just in case: We especially welcome comments about how the proposal will work for non-numpy projects (or could work, with fixes); this proposal started in the numpy community but the idea here is to build a consensus about how to make Python better for all projects that care about linear algebra.
Some possible points of discussion: