-
-
Notifications
You must be signed in to change notification settings - Fork 10.9k
DEP: Deprecate boolean math operations #4105
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Best way to find out how big the impact is is to run a few other projects
|
Does this affect |
@rkern if the sum dtype is boolean
|
Sum and prod are fine, since they get the output type set to an integer dtype, which means the EDIT: Your second example would be deprecated. |
The current behavior of + and * for Boolean arrays are correct and should not be changed. These are the standard meet and join operations for Boolean arrays, including Boolean matrices. Please research the literature on Boolean matrices before considering such a change. |
According to wikipedia, "Frequently operations on binary matrices are defined in terms of modular arithmetic mod 2." The same article says "If the Boolean domain is viewed as a semiring, where addition corresponds to logical OR and multiplication to logical AND, the matrix representation of the composition of two relations is equal to the matrix product of the matrix representations of these relation." Note that ordinary numbers can also be viewed as a semiring with for example + being max and * being +. This does not mean that it would make sense to redefine numpy's + and * this way. I think most people expect things that can be added with + and multiplied with * to form a ring, not a semiring. If you want semiring operations you can use | and &. If you want to implement matrix algebra over a semiring, you can use logical_or.reduce and logical_and.reduce or even maximum.reduce and minimum.reduce. |
@balkin I believe your comment misses the point. A correct behavior currently exists in NumPy. It is a behavior that has a widely recognized usage. (There is a whole literature.) Removing a correct behavior and suggesting users can always find a workaround does not make sense. Very simple example: compare what computation of the (non-reflexive) transitive closure of a graph (using its Boolean matrix representation) will look like before and after the proposed change. (Hint: dot current works correctly for boolean arrays; this will break.) |
@alan-isaac - there is a discussion on this topic at the ML. I suggest that we move our dispute there. |
@balkin Rather than engaging in a dispute, I prefer to think of myself as making a request. The current behavior of + and * for boolean arrays is correct, and as a result |
Just to note, it is now only |
This should be run past the list once more so that everyone is clear on the proposal and can sign off on it. |
The tests fail because the |
@seberg If you fix the tests I'll put this in. |
Ah, now I remember why I didn't fix it earlier... The |
+1 for just teaching allclose how to compare booleans. (Possibly allclose On Mon, Feb 10, 2014 at 8:28 AM, seberg notifications@github.com wrote:
Nathaniel J. Smith |
allclose should not delegate to array_equal for integer types, as the This perhaps implies that arguments of allclose should be cast to floats/complex |
I fixed that, but there is a question remaining for me actually... Right now I did this with a deprecation warning since I thought we will remove the functionality. But removing it will probably cause it to be upcast (not sure to what exactly right now) unless we raise an error explicitly. So it should be a FutureWarning or do we plan on raising an error explicitly. |
Okay, right, my off the cuff suggestion wasn't so good. Let me try making In np.allclose, maybe we should just always cast all non-inexact types to On Mon, Feb 10, 2014 at 10:20 AM, seberg notifications@github.com wrote:
Nathaniel J. Smith |
Could do the cast to an "at least" float type. That way that minimum int stuff gets fixed... For this it doesn't really matter. |
Sounds like a good approach. |
OK, I did that by casting |
Boolean - is not well defined, especially the unary and binary operator are not compatible. In general boolean minus seems to have no real application and does not do what might be expected. All "allclose" type functions (numpy, tests, masked) have to now check for boolean to avoid the deprecation warning. In the future one could think about removing it again and just allowing the upcast.
# make sure y is an inexact type to avoid abs(MIN_INT); will cause | ||
# casting of x later. | ||
dtype = result_type(y, 1.) | ||
y = array(y, dtype=dtype, copy=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to import array
in line 796.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heh, I seem to keep forgotting to run the tests... should those imports really be there and not on the module level?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then again, they are for all, so it would be a larger change to really do that right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a lot of imports like that in this file. I'd be inclined to clean them up sometime, but it isn't needed for this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, put the imports there, if you guys prefer another way to do this (I mean the result_type thing), I can change it. Added a simple extra test, because I was a bit worried I overlooked things with result_type and array-likes.
@seberg Test failure due to missing |
A3E2 | # casting of x later. | |
dtype = result_type(y, 1.) | ||
y = array(y, dtype=dtype, copy=False) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You seem to have acquired an insatiable taste for blank lines ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, like blank lines :)... but two blank lines is just insane (and that one after y is nonsense too...). fixed.
Casting y to an inexact type fixes problems such as abs(MIN_INT) < 0, and generally makes sense since the allclose logic is inherently for float types.
This should fix allclose for minimum integers also, right? |
Yes, tests are in place for that. (probably not the magic closes in the commit message) |
OK, merging. Thanks Sebastian. I'll hunt down the other tickets dealing with minimum ints. |
DEP: Deprecate boolean math operations
Thanks. |
One thing we might want to do in the future is have more explicit error messages for non-numeric + boolean types. Currently they fail with |
introduced in numpygh-4105/ab04e1ae0e8eca717bc7e42f3b0a60c9ff764289
The bitwise or logical operations should be used instead,
after a while.
To be honest, I am a bit unsure here. While it is nice, the impact out there might be rather big (this currently does not include the fixes necessary in ma and some tests, so failing).