API: Switch to NEP 50 behavior by default #23912
Conversation
    # gh-10322 means that the type comes from arr - this may change
    assert_equal(x_loc.dtype, float_small)
    assert_equal(x_loc.dtype, float_large)
@eric-wieser these histogram changes look right to you? What was previously rounded to the small bin, now ends up with the large dtype because it is explicit, I guess.
FWIW, I think this is logical as long as one doesn't special-case histogram in treating its arguments asymmetrically.
Really need some opinions on this. I hacked ...
This is an API decision that outlines the difficulty with NEP 50: where to draw the line and disregard the "weakly typed" float there. For percentile it may yet be a bit simpler, but still:
Ping @mhvk, but really everyone should have an opinion on this; this is probably the most important thing to figure out for NEP 50, and I think a lot of people want NEP 50.
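For context, a minimal illustration of the distinction NEP 50 draws (this snippet is an added example, not from the thread): Python scalars are "weak" and adapt to the array dtype, while explicit NumPy scalars participate in promotion as full-strength operands — which is why an explicit float64 range value now pulls histogram bin edges up to float64.

```python
import numpy as np

arr32 = np.zeros(3, dtype=np.float32)

# A Python float is "weakly" typed: it adapts to the array's dtype.
print((arr32 + 1.0).dtype)              # float32

# An explicit NumPy scalar is not: under NEP 50 it promotes like any
# other strongly-typed operand, so it can widen the result dtype.
print((arr32 + np.float64(1.0)).dtype)
```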
Looked through the whole PR. I think the main strike against NEP 50 consequences is for comparisons; really, `np.uint*(anything) > -1` should just work. Is the idea right now to decide on that separately?
Looking through the errors: it should work to do `np.can_cast(np.iinfo(whatever).min, whatever.dtype)` -- maybe those attributes need to be of `whatever.dtype`?
On the percentile failures - I think I'm for option (1) here, ensuring that we keep the weak promotion that is being tested for. It will be really confusing otherwise!
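(Added illustration: the `np.iinfo` attributes in question are currently plain Python integers — which is exactly what makes them "weak" under NEP 50 — as a quick check shows:)

```python
import numpy as np

lo = np.iinfo(np.uint8).min
hi = np.iinfo(np.uint8).max

# Both are plain Python ints, not NumPy scalars of the queried dtype.
print(type(lo), type(hi))
```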
        NPY_ITER_READONLY | NPY_ITER_ALIGNED,
        NPY_ITER_READONLY | NPY_ITER_ALIGNED
    };
    PyArray_Descr *common_dt = PyArray_ResultType(2, &op_in[0] + 2,
Realize this was here before, but while you are editing this: is `&op_in[0] + 2` not just the same as `&op_in[2]`?
numpy/core/tests/test_multiarray.py
Outdated
@@ -4119,7 +4119,8 @@ def test_extension_incref_elide_stack(self):
    def test_temporary_with_cast(self):
        # check that we don't elide into a temporary which would need casting
        d = np.ones(200000, dtype=np.int64)
Might as well change to `np.ones(2, dtype=np.int64)`, or is the number so large on purpose here?
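(Added note: the size is presumably deliberate — NumPy's temporary elision only applies to sufficiently large buffers, with a cutoff around 256 KiB on supported platforms, so a tiny array would not exercise the elision path at all. A sketch of what the test relies on:)

```python
import numpy as np

# ~1.5 MiB of int64, comfortably above the elision threshold, so
# `d + d + d` may reuse a temporary buffer internally; the numerical
# result must be identical whether or not elision happens.
d = np.ones(200000, dtype=np.int64)
r = d + d + d
print(r[0], r.dtype)
```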
numpy/core/tests/test_multiarray.py
Outdated
    @pytest.mark.parametrize("inplace", [False, True])
    def test_int_range_error(self, inplace):
        # E.g. clipping uint with negative integers fails to promote
        # (changed with NEP 50 and may be adaptable)
Yikes. Here, in principle it would be nicer for the code to be smarter and realize that `-1` and `256` are just not possible, so those limits should be ignored. But I see the need to avoid the draw of value-based promotion...
There are possible solutions, but right now they need a fair bit of complexity (maybe some adaptation too, but not much). I think we need a ufunc implementation for `(any_integer, pyinteger)` and `(pyinteger, any_integer)`.
And either we allow access to the scalar values (if an abstract scalar) in one of the steps (promotion or loop selection), or (which is possible now) we put it into the inner-loop itself. I.e. you can effectively make it a loop that uses the Python object for the scalar (and then deal with the rest in the loop itself).
That ain't pretty, although fortunately it would only be necessary for comparisons. (Maybe there are other ways to tweak it to be a bit more reasonable, not sure.)
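(Added illustration: a rough Python-level sketch of what such an inner loop would do — compare against the exact Python integer, so the value's magnitude never forces a dtype promotion. The function name is illustrative only, not NumPy API.)

```python
import numpy as np

def greater_than_pyint(arr, value):
    # Illustrative only: an "object-based" comparison loop that uses the
    # Python integer directly. The result dtype is always bool, so no
    # integer promotion (and hence no overflow) can ever occur.
    out = np.empty(arr.shape, dtype=bool)
    for idx, x in np.ndenumerate(arr):
        out[idx] = int(x) > value
    return out

greater_than_pyint(np.array([0, 255], dtype=np.uint8), -1)  # all True
```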
numpy/core/tests/test_multiarray.py
Outdated
    assert_(not -1 > np.array(1, dtype=dt1), "type %s failed" % (dt1,))
    assert_(-1 != np.array(1, dtype=dt1), "type %s failed" % (dt1,))
    # NEP 50 broke comparison of unsigned with -1 for the time being
    # (this may be fixed in the future of course).
Another painful result, indeed.
numpy/core/tests/test_ufunc.py
Outdated
    # type resolution to kick in, which then fails):
    with assert_raises(TypeError):
        _rational_tests.test_add(a, np.uint16(2))
    # This path used to go into legacy promotion, but doesn't now:
Maybe "This path with a scalar used ..."
numpy/core/tests/test_umath.py
Outdated
    # the test should check equivalence to Python (no overflow).
    # If not, the test can be simplified a bit more eventually.
    with pytest.raises(OverflowError, match="Python integer -1"):
        np_comp(arr, -1)
Salvaging this would indeed be good.
numpy/core/tests/test_umath.py
Outdated
    # but promotion rules mean that we try to use a normal integer
    # in the following case:
    with pytest.raises(OverflowError):
        np.equal(2**200, 2**200)
Ouch, these also are not great. One wonders about just delegating to standard operators/`math` here...
Hmmmmmm, some thoughts:
- For comparisons, yes, I think we can just allow this by saying that `PyInt, PyInt -> bool` (i.e. a comparison loop) can just use `object, object -> bool`. For float or complex it doesn't really matter anyway (well, we currently allow complex `<`, but whatever).
- Beyond that, I think things may get tricky:
  - Some functions behave differently for objects or may not have an object loop at all, so I am not sure you can make this choice universally.
  - I might start wondering if we should do this more generally, which then diverges a bit into the whole problem of returning scalars vs. arrays.

Let's focus on the first part (comparisons). I wonder if it would be very useful if we don't also do the general `int64(2) < 2**200` implementation/hack, but I suppose it may remove some pressure at least from test suites that use NumPy asserts all over.
I think it makes sense to remove pain points in order of simplicity! Certainly, deferring `PyInt, PyInt` to `O, O` for comparisons seems logical. Let's move further discussion of comparisons out to the main thread.
numpy/lib/histograms.py
Outdated
        return np.subtract(a, b, casting='unsafe', dtype=dt)
    # signed to unsigned. The input may be negative python integers so
    # ensure we pass in arrays with the initial dtype (related to NEP 50).
    return np.subtract(np.asarray(a, dtype=dt), np.asarray(b, dtype=dt),
Isn't it easier to just do `np.asanyarray(a - b, dtype=unsigned_dt)`? I don't think we're getting any speed-up doing it another way.
It doesn't work, because if `a` or `b` is a negative Python integer, the conversion will raise an error (rather than being unsafe and "well defined").
EDIT: For clarification, the other one could be an `int8` also, and then the mix doesn't work. (I think at least.)
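(Added illustration of the failure mode being avoided; the dtypes are chosen for illustration only. Under NEP 50, a negative Python integer cannot be an operand of a small-dtype operation, so the code converts both sides to arrays of a known dtype first.)

```python
import numpy as np

a, b = np.int8(100), -200   # one operand is a negative Python integer

# Direct subtraction fails under NEP 50: -200 is out of range for int8.
try:
    a - b
except OverflowError as exc:
    print(exc)

# Converting both sides to arrays of an explicit dtype is well defined:
res = np.subtract(np.asarray(a, dtype=np.int64), np.asarray(b, dtype=np.int64))
print(res)  # 300
```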
FWIW, I think this is logical as long as one doesn't special-case histogram in treating its arguments asymmetrically.
I made a PR to your branch at seberg#46 - this fixes ... Aside: it also fixes a bug for ...
Of course, you are right in https://github.com/seberg/numpy/pull/46/files#r1261584859 that this problem may be more general -- I guess that is an argument for not worrying so much and instead remove those ...
Looking again at NEP 50, the main rationale is that values should not determine the result type (correctly!). That also means that for comparisons there is a bit of leeway, since the result type is always bool. And I do think we need to avoid that ...
One option is to make the behaviour of the comparison operator different from the ufunc (as it is already), and for the operator try to ensure that an error is raised really only when necessary - i.e., let the operator check what is the problem if there is an ...
Right, comparisons are special and it is indeed very reasonable to allow ... Putting a special case into the operator is a good idea! The worse hack, but it should be a straightforward one, at least.
I would like some input. I opened a "tracking issue" for this, but how to approach/continue here? I think we need to push forward soon and cut corners, because there are too many moving parts for me to keep track of, and I could probably reduce some of them if we disable ... Right now, this works; the two main problems are (as mentioned many times):
Of course there is more; many functions in principle should use weak-promotion rules but are not doing so yet due to early ... The only test failure (that isn't explicitly ...)
@seberg - Ideally, we have a PR that fixes the comparisons before merging this (I don't worry about ...)
Well, I have given this a shot after some thoughts... It is still tricky. So aside from changing ...
Additionally, there is a promoter to say that Python integer comparisons can just use ... Why did we land on that?
In short, I don't love ... The main alternative I still see is actually defining an ... Admittedly, one argument might be that comparisons with ...
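(Added illustration: with the special comparison loops in place — the behavior NumPy ultimately shipped — out-of-range Python integers compare correctly against any integer array, and the result dtype is simply bool.)

```python
import numpy as np

arr = np.array([0, 128, 255], dtype=np.uint8)

# -1 is below the uint8 range, yet the comparison is well defined:
print(arr > -1)          # [ True  True  True]
print(arr == -1)         # [False False False]
print(np.uint8(0) > -1)  # True
```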
@seberg - the progress sounds good, but at least for me this PR is now far too large to meaningfully review. Would it at all be possible to split it up in pieces? E.g., at least in principle, it would seem that fixing the comparisons does not require enabling NEP 50 behaviour by default. Same for, say, the ...
In order to allow more incremental progress, is there already a CI run that has ...?
@mhvk I first planned to just put the new stuff into its own PR, but thought I would push it here because I wanted to test the full thing. I squashed the change, but only to split it out easier into gh-24915.
This contains two fixes:
1. a somewhat hackish solution to keep allowing very large integers for unary ufuncs. This is partially wrong, but keeps `isnan` and other functions working (reducing fallout in the tests quite a bit)
2. generally adapt changes that now fail for the same reason; this is mainly due to comparisons like `-1 < uint8` now failing in general. That is fixable in principle, but it isn't super easy.
Neither is perfect; both are the pragmatic solutions at this time, until they prove to be too problematic for some reason. (Getting rid of going to object for large integers is tougher and needs deeper work. It certainly is possible, but too difficult at this stage: we need to remove some old branches first. We could, and maybe should, get rid of the `uint` step though; it is even more of a trap I think!)
Maybe not the best way, but it's functional and easy to grok, and I doubt it matters speed-wise.
Re-adding invalid-warning ignoring, even though it should not matter (it does seem to for macOS arm64 tests; maybe some M2/clang version issue...).
This doesn't seem like an ideal solution, but is "minimal"?
I think with the comparisons done, this is almost all OK. The one ufunc that still is a bit nasty is `clip` - I think it makes sense to use `resolve_descriptors_raw` for it too.
Otherwise, all small comments.
    /* Get the result from the iterator object array */
    ret = (PyObject *)NpyIter_GetOperandArray(iter)[0];
    if (common_dt == NULL || op_dt[1] == NULL) {
Might as well move these failures up to single tests right after definition
        goto fail;
    }
    iter = NpyIter_MultiNew(4, op_in, flags,
            NPY_KEEPORDER, NPY_UNSAFE_CASTING,
Keep indentation consistent?
numpy/_core/tests/test_dtype.py
Outdated
    # or object depending on the input value. So test those paths!
    expected_dtype = np.result_type(np.array(val).dtype, np.array(0).dtype)
    # If we only pass scalars (mainly python ones!), NEP 50 means
    # that we get either the default integer
delete "either"
    # Similar to last check in `test_basic`
    x = (np.random.random(1000) * 255).astype("uint8")
    with pytest.raises(OverflowError):
        x.clip(-1, 10, out=x if inplace else None)
Hmm, should we change `clip` as well to use `resolve_descriptors_raw`?
You will have to do a 3-way dispatch (you have to check two arguments), which seems like a lot of trouble for something very niche. It isn't even clear to me that the test is testing anything we expect users to do.
(I may be wrong, but I expect more floating point uses, and this type of clip call is only relevant when you don't know the input integer dtype and also do not want to cast anyway.)
`clip` goes through python code in `_methods.py` that already special-cases `min` and `max` values:
Lines 92 to 101 in d885b0b

    def _clip(a, min=None, max=None, out=None, **kwargs):
        if min is None and max is None:
            raise ValueError("One of max or min must be given")
        if min is None:
            return um.minimum(a, max, out=out, **kwargs)
        elif max is None:
            return um.maximum(a, min, out=out, **kwargs)
        else:
            return um.clip(a, min, max, out=out, **kwargs)
I made a PR with a possible fix -- seberg#48 -- though at some level that just brings up the next question: if `clip`, why not also `np.maximum` and `np.minimum` (one-sided only, i.e., `np.minimum(int_array, py_int)` should error if `py_int` is less than the minimum possible given the dtype, but just pass through `int_array` if it is larger than the maximum possible).
In that respect, maybe editing the python function is the right call, since that already special-cases `None` as well.
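(Added sketch of the bound-dropping idea discussed here — ignore Python-int bounds that could never constrain the dtype anyway, then fall back to the normal `clip`. The helper name and exact rules are hypothetical, not what seberg#48 implements.)

```python
import numpy as np

def _clip_int_bounds(a, lo, hi):
    """Hypothetical: drop Python-int bounds that cannot bite for a's dtype."""
    info = np.iinfo(a.dtype)
    if isinstance(lo, int) and lo <= info.min:
        lo = None   # lower bound can never constrain; ignore it
    if isinstance(hi, int) and hi >= info.max:
        hi = None   # upper bound can never constrain; ignore it
    if lo is None and hi is None:
        return a.copy()
    return np.clip(a, lo, hi)

x = np.array([0, 200], dtype=np.uint8)
_clip_int_bounds(x, -1, 10)   # -1 is dropped; equivalent to clip(None, 10)
```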
Thanks, I was tempted to make it a follow-up if there is an actual complaint. Do you really think it is important?
Probably only a theoretical caveat, but all of these functions (for clip, I think it is a bad choice) return the result type of mixing everything. It is slightly surprising to get something that can't hold all values. Although, I have to admit, it isn't really problematic in practice.
After I wrote my change, I must admit I became unsure, especially after having to adjust the tests for numpy ints to not do in-place -- as you say, this makes very little sense! I'm definitely fine with just raising an issue that the `clip` dtype resolution needs more work and keeping things as you have them here.
numpy/_core/tests/test_umath.py
Outdated
@@ -422,13 +422,14 @@ def test_unsigned_signed_direct_comparison(

        assert py_comp(arr, -1) == expected
        assert np_comp(arr, -1) == expected
One empty line...
numpy/_core/tests/test_umath.py
Outdated
        scalar = arr[0]
        assert isinstance(scalar, np.integer)
        # The Python operator here is mainly interesting:
        assert py_comp(scalar, -1) == expected
        assert np_comp(scalar, -1) == expected
Two empty lines between classes is good.
strange, probably some conflict resolution
numpy/lib/_function_base_impl.py
Outdated
    q = np.asanyarray(q)
    # Use dtype of array if possible (e.g., if q is a python int or float).
    if isinstance(q, (int, float)) and a.dtype.kind == "f":
        q = np.asanyarray(q, dtype=np.result_type(a, q))
I'm confused about this: can it not just be `dtype=a.dtype`?
numpy/lib/_nanfunctions_impl.py
Outdated
    q = np.asanyarray(q)
    # Use dtype of array if possible (e.g., if q is a python int or float).
    if isinstance(q, (int, float)) and a.dtype.kind == "f":
        q = np.asanyarray(q, dtype=np.result_type(a, q))
As above, just `dtype=a.dtype`? I know I looked at this before, so probably there was a reason for this... If so, let's add it to the comment...
Nah, I think it was added for good sport probably. It might do byte-swaps, etc. but that doesn't matter much.
There may have been a reason: slightly saner user dtype support. But it seems so niche (and unlikely to affect anything), that I am happy to ignore it for now.
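(Added illustration of the distinction being discussed: for a Python float `q`, `np.result_type(a, q)` reduces to `a.dtype` anyway, while a bare `asanyarray` would give the float64 default.)

```python
import numpy as np

a = np.zeros(5, dtype=np.float32)
q = 0.5  # a Python float

# Bare conversion gives the default float64...
print(np.asanyarray(q).dtype)   # float64
# ...while folding in the array keeps its float32:
print(np.result_type(a, q))     # float32
```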
u8 | AR
u8 ^ AR
u8 & AR
i << AR
Bit of an illogical place for this. Move up, just after the `i8` set?
@seberg - This looked ready to go in (just one small comment, which you can ignore).
To see how things were, I thought I'd test it with astropy. It seems mostly fine -- there are just a few cases where we need to adjust to ensure casting works fine, mostly for float32. Quite a few of those were corner cases already, so no problem. And some work-arounds could be removed.
But I also ran into what I think is a similar problem to what we had with integers for comparisons: `large_py_float > float32` raises an `OverflowError` when it should just be true. I think we should deal with this... (note that I could adjust the piece of code that fails, but it seems wrong in the same way the integers were wrong).
Hmmmm, interesting; also just some notes for now, because I have to sort my thoughts to make up my mind:
The last point means that the warning actually warns about relevant behavior (although the typical user is unlikely to understand it, of course). Fixing 2. to get rid of the warning seems like a dubious choice to me unless we also fix 1., which means we introduce explicit ... For integers it was a clear-cut thing where it just makes sense to allow as user API and the difficulty is technical. But here it is more fuzzy than that!
What does everyone think?
@ngoldbaum can you have a (last?) look over. I would ping everyone else too, but not sure who wants to look at it.
Sure, I'll take a look on Monday.
The actual comparison that failed (warned) is ...
where I think I am now convinced that, in the end, that would require guessing what the user wants, which makes this unsolvable (and hard to explain). The other solution to my specific example would be to make all the attributes of ...
Anyway, for this PR we clearly do not need to solve this. What seems most important is to document this example in perhaps a general casting page...
…tins) But move the error check to be higher up and thus more directly relevant. If `PyArray_DescrFromType` could have failed, the code would have been incorrect anyway.
It would be great for @ngoldbaum to have a last look, but in the meantime let me explicitly approve, since I think this is ready to go in...
I saw a few places that could be clarified a bit and I have a couple questions.
doc/source/user/basics.creation.rst
Outdated
   are handled in C/C++ functions. If you are not careful with ``dtype``
   assignments, you can get unwanted overflow, as such
   are handled in C/C++ functions.
   When values do not fit and you are using a ``dtype`` NumPy may raise an
When values do not fit and you are using a ``dtype`` NumPy may raise an
When values do not fit and you are using a ``dtype``, NumPy may raise an
doc/source/user/basics.creation.rst
Outdated
>>> a = np.array([127, 128, 129], dtype=np.int8)
>>> a
>>> np.array([127, np.int64(128), np.int64(129)], dtype=np.int8)
I feel like this is enough of a corner case that it'll be confusing more often than helpful to include this in the "basics of array creation" docs. I'm not saying it needs to be added for this PR, but it would probably be more useful to have a new subsection in `reference/arrays.ndarray.rst` that goes into how rounding and overflow work in detail with the NEP 50 semantics. For now I would just delete this second example you've added.
OK, removed it. I agree, I don't like it much; I just didn't really want to completely remove the note about overflows being possible.
False

>>> np.can_cast('<i8', '>u4', 'unsafe')
True
Should we leave some examples behind that demonstrate the new semantics for numpy scalars?
It just raises a `TypeError` right now, so not sure it is very useful?
Ah, that wasn't clear, never mind.
@@ -3252,8 +3252,9 @@ array_set_datetimeparse_function(PyObject *NPY_UNUSED(self),
 NPY_NO_EXPORT PyObject *
 PyArray_Where(PyObject *condition, PyObject *x, PyObject *y)
 {
    PyArrayObject *arr, *ax, *ay;
    PyArrayObject *arr, *ax, *ay = NULL;
Might as well initialize all of these to `NULL`; leaving it like this is a little confusing.
Added, although I can't say I prefer it when it isn't needed for cleanup (the "uninitialized" warnings work pretty well anyway).
    if (common_dt == NULL) {
        goto fail;
    }
    PyArray_Descr *op_dt[4] = {common_dt, PyArray_DescrFromType(NPY_BOOL),
Couldn't hurt to add a `/* creating a builtin dtype like this cannot fail */` above this line.
OK, we really should probably just have a name for it...
    ay = (PyArrayObject *)PyArray_FROM_O(y);
    if (ax == NULL || ay == NULL) {
    if (ay == NULL) {
        goto fail;
Sorry for writing this with awkward control flow in the first place... 🙃
 * isnan). This is not ideal, but pragmatic...
 * We should eventually have special loops for isnan and once
 * we do, we may just deprecate all remaining ones (e.g.
 * `negative(2**100)` not working as it is an object.)
It would be nice if this comment mentioned it was a TODO from the NEP 50 work, for context
I have it in the tracking issue...
I'm thinking of a reader who doesn't have context about NEP 50 and wouldn't be aware of a tracking issue.
Ah, sorry, somehow was thinking about creating an issue.
Merging with a quick fixup for the linter. Thanks @seberg! If you're looking at this PR because of CI breakage, take a look at NEP 50: this PR turns on NEP 50 semantics by default. The result of some binary operations may end up in lower precision than they used to, and some operations that used to succeed will now raise errors. If you are finding it difficult to update your code to adapt to these new semantics, or have any other questions, please feel free to open an issue.
* 3 test failures show up with latest NumPy: https://github.com/scipy/scipy/actions/runs/6632037087/job/18016857190?pr=19419 * almost certainly because of NEP50 adoption/changes: numpy/numpy#23912 * it looks to me like the only special handling we need here is for floats, so `can_cast` may have been overkill anyway, although if you follow the control flow/docstrings, it gets a bit confusing with some things claiming to use `np.intp` and others Python `int`... [skip cirrus] [skip circle]
This is a WIP for discussion and visibility for switching to NEP 50 behavior by default.
As of writing, it fixes up mainly clear tests and one small hack to allow `isnan(2**100)` to pass. That isn't perfectly clean maybe, but seems pragmatic. (The individual commits make sense on their own.)
There are some remaining tests that need fixing:
- Should `percentile(float32_arr, 30.)` return a float32 or float64? Currently it switches to float64, but we could add `np.result_type` to fix that.
- `np.result_type` isn't great; it would be nice to have something slightly cleaner. I am not exactly sure what, though. (A function to prep multiple arrays, something that @mdhaber would like anyway. A way to mark the array as representing a Python scalar? But we want that to be used with caution; it would be weird to be used beyond temporaries.)
- `can_cast(123, np.uint8)` isn't well defined right now. I can live with that for the moment.
- `np.where()` doesn't do the weak-promotion; `np.concatenate()` does, so it should not be super hard to fix.
- `np.add(1., 2., dtype=np.int64)` should fail because of unsafe casting, but casting safety + scalars is still a bit of a mess.

I am not sure that all of these are easy, but will chip away. OTOH some of these also look like things that might be OK as "known issues".
The main thing I would like is to figure out `percentile`: we won't get this right everywhere, but it would be good to know what our story is for such a function.