8000 Quick and dirty matmul by charris · Pull Request #5878 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

Quick and dirty matmul #5878

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Jun 5, 2015
Merged

Conversation

charris
Copy link
Member
@charris charris commented May 15, 2015

Adds __matmul__ and __rmatmul__. This is a work in progress and currently lacks __imatmul__ and tests. The methods seem to work, but python3.5 fails to call them for the @ operator. I suspect this may be related to ndarray being a class defined in C, as a subclass with the operators added does seem to work.

The methods use einsum for the stacked case and types not available in cblas, consequently they do not work for object arrays. However, they are faster for integer arrays of non-trivial size than the current dot.

@charris charris added this to the 1.10.0 release milestone May 15, 2015
@charris charris force-pushed the quick-and-dirty-matmul branch from 0c9e8fd to 8b52f96 Compare May 15, 2015 05:13
@njsmith
Copy link
Member
njsmith commented May 15, 2015

Yeah, for a class defined in C you have to fill in the new nb_matrix_multiply and nb_inplace_matrix_multiply methods that were added to the end of the PyNumberMethods struct:
https://docs.python.org/3.5/c-api/typeobj.html#c.PyNumberMethods

@charris
Copy link
Member Author
charris commented May 15, 2015

@njsmith Thought so, but I couldn't find it in a quick grep through the Python source. Now I need to worry about left and right and NotImplented. Ugh.

@charris
Copy link
Member Author
charris commented May 15, 2015

And it means it need to be tested with Python3.5.

}

errval = PyUFunc_CheckOverride(cached_npy_matmul, "__call__",
args, kwds, &override, 2);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like PyArray_Matmul should also do the override check, i.e. it should have identical semantics to the matmul function?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That raises the question if it should be in the C-API at this point. I'm tempted to leave it out and just go with the matmul method.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was also feeling a little hesitant about adding it to the C api, so +1 on
waiting until we're more sure what we're doing here.
On May 14, 2015 11:32 PM, "Charles Harris" notifications@github.com wrote:

In numpy/core/src/multiarray/multiarraymodule.c
#5878 (comment):

  • PyObject *v, *a, *o = NULL;
  • char* kwlist[] = {"a", "b", "out", NULL };
  • if (cached_npy_matmul == NULL) {
  •    PyObject *module, *dict, *matmul;
    
  •    module = PyImport_ImportModule("numpy.core.multiarray");
    
  •    dict = PyModule_GetDict(module);
    
  •    matmul = PyDict_GetItemString(dict, "matmul");
    
  •    cached_npy_matmul = (PyUFuncObject*)matmul;
    
  •    Py_INCREF(cached_npy_matmul);
    
  •    Py_DECREF(module);
    
  • }
  • errval = PyUFunc_CheckOverride(cached_npy_matmul, "call",
  •                               args, kwds, &override, 2);
    

That raises the question if it should be in the C-API at this point. I'm
tempted to leave it out and just go with the matmul method.


Reply to this email directly or view it on GitHub
https://github.com/numpy/numpy/pull/5878/files#r30387829.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

@charris charris force-pushed the quick-and-dirty-matmul branch from 8b52f96 to 8cda72b Compare May 15, 2015 06:28
@charris
Copy link
Member Author
charris commented May 15, 2015

Works in Python3.5 now.

>>> import numpy as np
>>> x = np.ones(3)
>>> m = np.eye(3)
>>> x @ x
3.0
>>> x @ [1,2,3]
6.0
>>> [1,2,3] @ x
6.0
>>> m @ x
array([ 1.,  1.,  1.])
>>> x @ m
array([ 1.,  1.,  1.])
>>> x @ 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Scalar operands are not allowed, use '*' instead

@njsmith
Copy link
8000
Member
njsmith commented May 15, 2015

Woohoo!

Should probably be a ValueError I guess?
On May 14, 2015 11:32 PM, "Charles Harris" notifications@github.com wrote:

Works in Python3.5 now.

import numpy as np
x = np.ones(3)
m = np.eye(3)
x @ x
3.0
x @ [1,2,3]
6.0
[1,2,3] @ x
6.0
m @ x
array([ 1., 1., 1.])
x @ m
array([ 1., 1., 1.])
x @ 1
Traceback (most recent call last):
File "", line 1, in
TypeError: Scalar operands are not allowed, use '*' instead


Reply to this email directly or view it on GitHub
#5878 (comment).

@charris charris force-pushed the quick-and-dirty-matmul branch from 8cda72b to f347bae Compare May 15, 2015 06:44
@charris
Copy link
Member Author
charris commented May 15, 2015

ValueError used.

@charris
Copy link
Member Author
charris commented May 15, 2015

Seems to work with stacks as it should.

>>> mm = np.stack([m]*2, axis=0)
>>> mm @ mm
array([[[ 1.,  0.,  0.],
        [ 0.,  1.,  0.],
        [ 0.,  0.,  1.]],

       [[ 1.,  0.,  0.],
        [ 0.,  1.,  0.],
        [ 0.,  0.,  1.]]])

@charris charris force-pushed the quick-and-dirty-matmul branch from f347bae to f981ef2 Compare May 15, 2015 07:04
@pv
Copy link
Member
pv commented May 15, 2015

Can you remind why this cannot be implemented as a gufunc, which would get __numpy_ufunc__ and the outer iterators for free? I guess this has to do with 1d vector issues?

@charris
Copy link
Member Author
charris commented May 15, 2015

@pv It can be, I have an old branch that makes a start on that, which is why this is quick and dirty ;)

@charris charris force-pushed the quick-and-dirty-matmul branch from f981ef2 to 70134f8 Compare May 15, 2015 17:37
@njsmith
Copy link
Member
njsmith commented May 18, 2015

I don't object to shipping something quick-and-dirty in 1.10, but I'll note
in case anyone is interested that from a quick look it doesn't seem like
it'd be that hard to enhance gufuncs to support "i?j,jk?->i?k?" syntax I
suggested in some mailing list message somewhere (where "?" means "this
axis is optional -- if not present in the input, treat it as 1, and then
remove it from the output". The gufunc code actually has some checks we
just added since 1.9 that have no purpose except to prevent it from
treating missing axes as if they had size 1, so making this work would
mostly be a matter of bookkeeping -- adding the ? syntax to the signature
parser, making a note somewhere about which dimensions were marked as
optional, etc.

On Fri, May 15, 2015 at 7:01 AM, Charles Harris notifications@github.com
wrote:

@pv https://github.com/pv It can be, I have an old branch that makes a
start on that, which is why this is quick and dirty ;)


Reply to this email directly or view it on GitHub
#5878 (comment).

Nathaniel J. Smith -- http://vorpus.org

@jaimefrio
Copy link
Member

To kee 8000 p track of the optional dimensions we need to add more items to the PyUfuncObject struct. And to do that without breaking ABI compatibility we need to hide away its contents, I guess with something similar to the PyArrayObject_fields one. At some point in time @juliantaylor mentioned he was working on the hiding, and I put that work in the backburner until then.

@njsmith
Copy link
Member
njsmith commented May 18, 2015

While hiding it away is doubtless a good idea, we're already adding new items to PyArray_Descr in 1.10 (see #5343), so I don't know that it matters much if we add new items to another struct as well. Plus even in the worst case this only breaks forward compatibility, not backward, and Ralf says that we outright enforce lack-of-forward-compatibility regardless.

@charris
Copy link
Member Author
charris commented May 18, 2015

I actually wrote four gufuncs for this last year, together with some changes to the ufunc generator to make them easier to declare. Ideally BLAS needs to always be available, so we really need the blas_lite fallback position and for that I would want to move the code from the linalg package to somewhere common. I also don't like the mutual dependence of the multiarray and ufunc modules, but can live with it.

This PR provides tests and a chance for people to fool with the operator and provide feedback, I don't think Python 3.5 will be standard for another year or so. One problem I've run into is that the operator overrides aren't working as they should, although the matmul function does just fine. I was looking into that when a big chunk of my cpu memory crapped out, so until I put in new memory and get hold the work sitting on disk I'm a bit crippled, probably until Wednesday or so if I'm lucky.

@njsmith
Copy link
Member
njsmith commented May 18, 2015

The problem is that it's pretty fragile to try patching this together out
of multiple gufuncs for the different shapes, because there is stuff we are
supposed to do on the way into the gufunc (e.g., numpy_ufunc
delegation) before we can safely cast the inputs to become arrays, and
thus before we know what shape the array is and which gufunc to dispatch
to. There is some extremely awkward code in np.linalg.solve to try and work
around this already. Teaching the gufunc code how to handle this directly
would simplify everything a lot.
.
Concur on the mutual dependence between umath and multiarray being nasty.
In the long run I still think we should merge them into one library, but in
the mean time... Is there anything in multiarray that actually needs blas,
though, besides dot? We could just move the blas dependency entirely to
umath, no? Instead of having dot be handled through the dtype ArrFuncs, we
could just use regular ufunc loop dispatch? And then we'd still need a way
for multiarray to get a handle to the dot gufunc, but that doesn't require
any new kluges, it can use the same hack that multiarray uses to get a
handle to the add ufunc for +.
.
Not sure what you mean about operator overrides, but note that how they
ought to work is under heavy discussion right now in #5844 so depending
on what the issue is you might not want to worry about it until that's
resolved to worry about it...

On Mon, May 18, 2015 at 12:59 AM, Charles Harris notifications@github.com
wrote:

I actually wrote four gufuncs for this last year, together with some
changes to the ufunc generator to make them easier to declare. Ideally BLAS
needs to always be available, so we really need the blas_lite fallback
position and for that I would want to move the code from the linalg package
to somewhere common. I also don't like the mutual dependence of the
multiarray and ufunc modules, but can live with it.

This PR provides tests and a chance for people to fool with the operator
and provide feedback, I don't think Python 3.5 will be standard for another
year or so. One problem I've run into is that the operator overrides aren't
working as they should, although the matmul function does just fine. I was
looking into that when a big chunk of my cpu memory crapped out, so until I
put in new memory and get hold the work sitting on disk I'm a bit crippled,
probably until Wednesday or so if I'm lucky.


Reply to this email directly or view it on GitHub
#5878 (comment).

Nathaniel J. Smith -- http://vorpus.org

@charris charris force-pushed the quick-and-dirty-matmul branch from 70134f8 to 9f6cf44 Compare May 22, 2015 03:50
@charris charris force-pushed the quick-and-dirty-matmul branch 3 times, most recently from 9264b19 to 020a156 Compare May 31, 2015 18:35
@charris
Copy link
Member Author
charris commented May 31, 2015

I think this is ready to go pending tests. The override mechanism is the same as currently implemented for other functions, we can fix it up when the issue is decided and a transition path laid out.

@charris
Copy link
Member Author
charris commented May 31, 2015

Take that back, there are still some problems with the operator.

@charris charris force-pushed the quick-and-dirty-matmul branch from 020a156 to aafe932 Compare June 1, 2015 04:43
@charris
Copy link
Member Author
charris commented Jun 1, 2015

Should all be fixed now.

@charris charris force-pushed the quick-and-dirty-matmul branch from aafe932 to 8e8b3b3 Compare June 2, 2015 17:35
@charris
Copy link
Member Author
charris commented Jun 2, 2015

I'm going to merge this in a day or two if noone beets me to it and there are no strenuous objections.

charris added 7 commits June 4, 2015 18:00
The function was not static, which led to multiple definition
errors.
This is the functional counterpart of the '@' operator that will be
available in Python 3.5 with the addition of an out keyword. It
operates like the dot function except that

- scalar multiplication is not allowed.
- multiplication of arrays with more than 2 dimensions broadcasts.

The last means that when arrays have more than 2 dimensions they are
treated as stacks of matrices and those stacks are broadcast against
each other unlike the current behavior of dot that does an outer
product. Like dot, matmul is aware of `__numpy_ufunc__` and can be
overridden.

The current version of the function uses einsum when cblas cannot be
used, hence object arrays are not yet supported.
Start labels at beginning of line.
Break a long line
Whitespace.
Document the matmul function and add '@' to the operator
section of the reference manual.
@charris charris force-pushed the quick-and-dirty-matmul branch from 8e8b3b3 to 24fcc25 Compare June 5, 2015 00:02
charris added a commit that referenced this pull request Jun 5, 2015
@charris charris merged commit 30d755d into numpy:master Jun 5, 2015
@charris charris deleted the quick-and-dirty-matmul branch June 5, 2015 00:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants
0