Quick and dirty matmul #5878

charris · 2015-05-15T05:12:16Z

Adds __matmul__ and __rmatmul__. This is a work in progress and currently lacks __imatmul__ and tests. The methods seem to work, but python3.5 fails to call them for the @ operator. I suspect this may be related to ndarray being a class defined in C, as a subclass with the operators added does seem to work.

The methods use einsum for the stacked case and types not available in cblas, consequently they do not work for object arrays. However, they are faster for integer arrays of non-trivial size than the current dot.

njsmith · 2015-05-15T05:19:28Z

Yeah, for a class defined in C you have to fill in the new nb_matrix_multiply and nb_inplace_matrix_multiply methods that were added to the end of the PyNumberMethods struct:
https://docs.python.org/3.5/c-api/typeobj.html#c.PyNumberMethods

charris · 2015-05-15T05:25:44Z

@njsmith Thought so, but I couldn't find it in a quick grep through the Python source. Now I need to worry about left and right and NotImplented. Ugh.

charris · 2015-05-15T05:26:29Z

And it means it need to be tested with Python3.5.

njsmith · 2015-05-15T05:53:05Z

numpy/core/src/multiarray/multiarraymodule.c

+    }
+
+    errval = PyUFunc_CheckOverride(cached_npy_matmul, "__call__",
+                                   args, kwds, &override, 2);


I feel like PyArray_Matmul should also do the override check, i.e. it should have identical semantics to the matmul function?

That raises the question if it should be in the C-API at this point. I'm tempted to leave it out and just go with the matmul method.

I was also feeling a little hesitant about adding it to the C api, so +1 on
waiting until we're more sure what we're doing here.
On May 14, 2015 11:32 PM, "Charles Harris" notifications@github.com wrote:

In numpy/core/src/multiarray/multiarraymodule.c
#5878 (comment):

PyObject *v, *a, *o = NULL;

char* kwlist[] = {"a", "b", "out", NULL };

if (cached_npy_matmul == NULL) {

PyObject *module, *dict, *matmul;

module = PyImport_ImportModule("numpy.core.multiarray");

dict = PyModule_GetDict(module);

matmul = PyDict_GetItemString(dict, "matmul");

cached_npy_matmul = (PyUFuncObject*)matmul;

Py_INCREF(cached_npy_matmul);

Py_DECREF(module);

}

errval = PyUFunc_CheckOverride(cached_npy_matmul, "call",

args, kwds, &override, 2);

That raises the question if it should be in the C-API at this point. I'm
tempted to leave it out and just go with the matmul method.

—
Reply to this email directly or view it on GitHub
https://github.com/numpy/numpy/pull/5878/files#r30387829.

charris · 2015-05-15T06:32:39Z

Works in Python3.5 now.

>>> import numpy as np
>>> x = np.ones(3)
>>> m = np.eye(3)
>>> x @ x
3.0
>>> x @ [1,2,3]
6.0
>>> [1,2,3] @ x
6.0
>>> m @ x
array([ 1.,  1.,  1.])
>>> x @ m
array([ 1.,  1.,  1.])
>>> x @ 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Scalar operands are not allowed, use '*' instead

njsmith · 2015-05-15T06:37:59Z

Woohoo!

Should probably be a ValueError I guess?
On May 14, 2015 11:32 PM, "Charles Harris" notifications@github.com wrote:

Works in Python3.5 now.

import numpy as np
x = np.ones(3)
m = np.eye(3)
x @ x
3.0
x @ [1,2,3]
6.0
[1,2,3] @ x
6.0
m @ x
array([ 1., 1., 1.])
x @ m
array([ 1., 1., 1.])
x @ 1
Traceback (most recent call last):
File "", line 1, in
TypeError: Scalar operands are not allowed, use '*' instead

—
Reply to this email directly or view it on GitHub
#5878 (comment).

charris · 2015-05-15T06:45:01Z

ValueError used.

charris · 2015-05-15T07:03:28Z

Seems to work with stacks as it should.

>>> mm = np.stack([m]*2, axis=0)
>>> mm @ mm
array([[[ 1.,  0.,  0.],
        [ 0.,  1.,  0.],
        [ 0.,  0.,  1.]],

       [[ 1.,  0.,  0.],
        [ 0.,  1.,  0.],
        [ 0.,  0.,  1.]]])

pv · 2015-05-15T13:51:24Z

Can you remind why this cannot be implemented as a gufunc, which would get __numpy_ufunc__ and the outer iterators for free? I guess this has to do with 1d vector issues?

charris · 2015-05-15T14:01:18Z

@pv It can be, I have an old branch that makes a start on that, which is why this is quick and dirty ;)

njsmith · 2015-05-18T07:01:48Z

I don't object to shipping something quick-and-dirty in 1.10, but I'll note
in case anyone is interested that from a quick look it doesn't seem like
it'd be that hard to enhance gufuncs to support "i?j,jk?->i?k?" syntax I
suggested in some mailing list message somewhere (where "?" means "this
axis is optional -- if not present in the input, treat it as 1, and then
remove it from the output". The gufunc code actually has some checks we
just added since 1.9 that have no purpose except to prevent it from
treating missing axes as if they had size 1, so making this work would
mostly be a matter of bookkeeping -- adding the ? syntax to the signature
parser, making a note somewhere about which dimensions were marked as
optional, etc.

On Fri, May 15, 2015 at 7:01 AM, Charles Harris notifications@github.com
wrote:

@pv https://github.com/pv It can be, I have an old branch that makes a
start on that, which is why this is quick and dirty ;)

—
Reply to this email directly or view it on GitHub
#5878 (comment).

Nathaniel J. Smith -- http://vorpus.org

jaimefrio · 2015-05-18T07:20:50Z

To keep track of the optional dimensions we need to add more items to the PyUfuncObject struct. And to do that without breaking ABI compatibility we need to hide away its contents, I guess with something similar to the PyArrayObject_fields one. At some point in time @juliantaylor mentioned he was working on the hiding, and I put that work in the backburner until then.

njsmith · 2015-05-18T07:53:36Z

While hiding it away is doubtless a good idea, we're already adding new items to PyArray_Descr in 1.10 (see #5343), so I don't know that it matters much if we add new items to another struct as well. Plus even in the worst case this only breaks forward compatibility, not backward, and Ralf says that we outright enforce lack-of-forward-compatibility regardless.

charris · 2015-05-18T07:59:42Z

I actually wrote four gufuncs for this last year, together with some changes to the ufunc generator to make them easier to declare. Ideally BLAS needs to always be available, so we really need the blas_lite fallback position and for that I would want to move the code from the linalg package to somewhere common. I also don't like the mutual dependence of the multiarray and ufunc modules, but can live with it.

This PR provides tests and a chance for people to fool with the operator and provide feedback, I don't think Python 3.5 will be standard for another year or so. One problem I've run into is that the operator overrides aren't working as they should, although the matmul function does just fine. I was looking into that when a big chunk of my cpu memory crapped out, so until I put in new memory and get hold the work sitting on disk I'm a bit crippled, probably until Wednesday or so if I'm lucky.

njsmith · 2015-05-18T08:07:24Z

The problem is that it's pretty fragile to try patching this together out
of multiple gufuncs for the different shapes, because there is stuff we are
supposed to do on the way into the gufunc (e.g., numpy_ufunc
delegation) before we can safely cast the inputs to become arrays, and
thus before we know what shape the array is and which gufunc to dispatch
to. There is some extremely awkward code in np.linalg.solve to try and work
around this already. Teaching the gufunc code how to handle this directly
would simplify everything a lot.
.
Concur on the mutual dependence between umath and multiarray being nasty.
In the long run I still think we should merge them into one library, but in
the mean time... Is there anything in multiarray that actually needs blas,
though, besides dot? We could just move the blas dependency entirely to
umath, no? Instead of having dot be handled through the dtype ArrFuncs, we
could just use regular ufunc loop dispatch? And then we'd still need a way
for multiarray to get a handle to the dot gufunc, but that doesn't require
any new kluges, it can use the same hack that multiarray uses to get a
handle to the add ufunc for +.
.
Not sure what you mean about operator overrides, but note that how they
ought to work is under heavy discussion right now in #5844 so depending
on what the issue is you might not want to worry about it until that's
resolved to worry about it...

On Mon, May 18, 2015 at 12:59 AM, Charles Harris notifications@github.com
wrote:

I actually wrote four gufuncs for this last year, together with some
changes to the ufunc generator to make them easier to declare. Ideally BLAS
needs to always be available, so we really need the blas_lite fallback
position and for that I would want to move the code from the linalg package
to somewhere common. I also don't like the mutual dependence of the
multiarray and ufunc modules, but can live with it.

This PR provides tests and a chance for people to fool with the operator
and provide feedback, I don't think Python 3.5 will be standard for another
year or so. One problem I've run into is that the operator overrides aren't
working as they should, although the matmul function does just fine. I was
looking into that when a big chunk of my cpu memory crapped out, so until I
put in new memory and get hold the work sitting on disk I'm a bit crippled,
probably until Wednesday or so if I'm lucky.

—
Reply to this email directly or view it on GitHub
#5878 (comment).

Nathaniel J. Smith -- http://vorpus.org

charris · 2015-05-31T18:43:34Z

I think this is ready to go pending tests. The override mechanism is the same as currently implemented for other functions, we can fix it up when the issue is decided and a transition path laid out.

charris · 2015-05-31T23:20:37Z

Take that back, there are still some problems with the operator.

charris · 2015-06-01T04:54:16Z

Should all be fixed now.

charris · 2015-06-02T21:38:00Z

I'm going to merge this in a day or two if noone beets me to it and there are no strenuous objections.

The function was not static, which led to multiple definition errors.

This is the functional counterpart of the '@' operator that will be available in Python 3.5 with the addition of an out keyword. It operates like the dot function except that - scalar multiplication is not allowed. - multiplication of arrays with more than 2 dimensions broadcasts. The last means that when arrays have more than 2 dimensions they are treated as stacks of matrices and those stacks are broadcast against each other unlike the current behavior of dot that does an outer product. Like dot, matmul is aware of `__numpy_ufunc__` and can be overridden. The current version of the function uses einsum when cblas cannot be used, hence object arrays are not yet supported.

Start labels at beginning of line. Break a long line Whitespace.

Document the matmul function and add '@' to the operator section of the reference manual.

Quick and dirty matmul

charris added 01 - Enhancement component: numpy._core labels May 15, 2015

charris added this to the 1.10.0 release milestone May 15, 2015

charris force-pushed the quick-and-dirty-matmul branch from 0c9e8fd to 8b52f96 Compare May 15, 2015 05:13

charris mentioned this pull request May 15, 2015

ENH: Implement matmul function #4464

Closed

njsmith reviewed May 15, 2015
View reviewed changes

charris force-pushed the quick-and-dirty-matmul branch from 8b52f96 to 8cda72b Compare May 15, 2015 06:28

charris force-pushed the quick-and-dirty-matmul branch from 8cda72b to f347bae Compare May 15, 2015 06:44

charris force-pushed the quick-and-dirty-matmul branch from f347bae to f981ef2 Compare May 15, 2015 07:04

charris force-pushed the quick-and-dirty-matmul branch from f981ef2 to 70134f8 Compare May 15, 2015 17:37

charris force-pushed the quick-and-dirty-matmul branch from 70134f8 to 9f6cf44 Compare May 22, 2015 03:50

charris force-pushed the quick-and-dirty-matmul branch 3 times, most recently from 9264b19 to 020a156 Compare May 31, 2015 18:35

charris force-pushed the quick-and-dirty-matmul branch from 020a156 to aafe932 Compare June 1, 2015 04:43

charris force-pushed the quick-and-dirty-matmul branch from aafe932 to 8e8b3b3 Compare June 2, 2015 17:35

sinhrks mentioned this pull request Jun 3, 2015

API: Support Matrix Multiplication Ops (@) pandas-dev/pandas#10259

Closed

charris added 7 commits June 4, 2015 18:00

BUG: Fix npy_cache_pyfunc to properly implement inline.

8c18581

The function was not static, which led to multiple definition errors.

ENH: Add support for the '@' operator added in Python 3.5.

1b6193b

TST: Add tests for multiarray.matmul and operator '@'.

0cb4cbd

STY: Minor C style fixes for multiarraymodule.c.

c79651d

Start labels at beginning of line. Break a long line Whitespace.

DOC: Document '@' and matmul.

91e8cd2

Document the matmul function and add '@' to the operator section of the reference manual.

DOC: Mention support for the '@' operator in the 1.10 release notes.

24fcc25

charris force-pushed the quick-and-dirty-matmul branch from 8e8b3b3 to 24fcc25 Compare June 5, 2015 00:02

charris added a commit that referenced this pull request Jun 5, 2015

Merge pull request #5878 from charris/quick-and-dirty-matmul

30d755d

Quick and dirty matmul

charris merged commit 30d755d into numpy:master Jun 5, 2015

charris deleted the quick-and-dirty-matmul branch June 5, 2015 00:18

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Quick and dirty matmul #5878

Quick and dirty matmul #5878

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Quick and dirty matmul #5878

Quick and dirty matmul #5878

Uh oh!

Conversation

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!