-
-
Notifications
You must be signed in to change notification settings - Fork 10.9k
numpy.gradient() doesn't compute boundary values correctly in version 1.9.0 #5184
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
It looks like changes were introduced in 332d628 that were intended to make the gradient calculation at boundaries more accurate, but the documentation on the website only goes up to 1.8.1. You can see the updated documentation with |
This is also the source of matplotlib/matplotlib#3598 |
The change makes the gradient second order at the boundaries. In the |
Because the new method breaks reproducibility with old code, we should probably make the improvement optional. |
See the recent thread http://thread.gmane.org/gmane.comp.python.numeric.general/58971/focus=58975. Does someone want to make a PR for this? We need to decide on a keyword. |
The real issue for matplotlib is the behaviour on masked arrays: Ie the following taken from a matplotlib test and rounded to make easier to read: import numpy as np
y, x = np.mgrid[-1.2:1.2:8j, -1.2:1.2:8j]
z = 10 * np.cos(x**2 + y**2)
zmask = np.ma.masked_greater(z, 9.9)
z1 = zmask.copy()
print('z before\n')
print(np.round(z1))
dx, dy = np.gradient(z1)
print('\nz after\n')
print(np.round(z1))
print('\ndx\n')
print(np.round(dx))
print('\ndy\n')
print(np.round(dy)) With numpy 1.8.2 this results in: z before
[[-10.0 -6.0 -1.0 1.0 1.0 -1.0 -6.0 -10.0]
[-6.0 1.0 5.0 7.0 7.0 5.0 1.0 -6.0]
[-1.0 5.0 9.0 10.0 10.0 9.0 5.0 -1.0]
[1.0 7.0 10.0 -- -- 10.0 7.0 1.0]
[1.0 7.0 10.0 -- -- 10.0 7.0 1.0]
[-1.0 5.0 9.0 10.0 10.0 9.0 5.0 -1.0]
[-6.0 1.0 5.0 7.0 7.0 5.0 1.0 -6.0]
[-10.0 -6.0 -1.0 1.0 1.0 -1.0 -6.0 -10.0]]
z after
[[-10.0 -6.0 -1.0 1.0 1.0 -1.0 -6.0 -10.0]
[-6.0 1.0 5.0 7.0 7.0 5.0 1.0 -6.0]
[-1.0 5.0 9.0 10.0 10.0 9.0 5.0 -1.0]
[1.0 7.0 10.0 -- -- 10.0 7.0 1.0]
[1.0 7.0 10.0 -- -- 10.0 7.0 1.0]
[-1.0 5.0 9.0 10.0 10.0 9.0 5.0 -1.0]
[-6.0 1.0 5.0 7.0 7.0 5.0 1.0 -6.0]
[-10.0 -6.0 -1.0 1.0 1.0 -1.0 -6.0 -10.0]]
dx
[[4.0 7.0 7.0 6.0 6.0 7.0 7.0 4.0]
[4.0 6.0 5.0 4.0 4.0 5.0 6.0 4.0]
[3.0 3.0 2.0 -- -- 2.0 3.0 3.0]
[1.0 1.0 0.0 -- -- 0.0 1.0 1.0]
[-1.0 -1.0 -0.0 -- -- -0.0 -1.0 -1.0]
[-3.0 -3.0 -2.0 -- -- -2.0 -3.0 -3.0]
[-4.0 -6.0 -5.0 -4.0 -4.0 -5.0 -6.0 -4.0]
[-4.0 -7.0 -7.0 -6.0 -6.0 -7.0 -7.0 -4.0]]
dy
[[4.0 4.0 3.0 1.0 -1.0 -3.0 -4.0 -4.0]
[7.0 6.0 3.0 1.0 -1.0 -3.0 -6.0 -7.0]
[7.0 5.0 -- -- -- -- -5.0 -7.0]
[6.0 4.0 -- -- -- -- -4.0 -6.0]
[6.0 4.0 -- -- -- -- -4.0 -6.0]
[7.0 5.0 -- -- -- -- -5.0 -7.0]
[7.0 6.0 3.0 1.0 -1.0 -3.0 -6.0 -7.0]
[4.0 4.0 3.0 1.0 -1.0 -3.0 -4.0 -4.0]] Where as with numpy 1.9 this results in. z before
[[-10.0 -6.0 -1.0 1.0 1.0 -1.0 -6.0 -10.0]
[-6.0 1.0 5.0 7.0 7.0 5.0 1.0 -6.0]
[-1.0 5.0 9.0 10.0 10.0 9.0 5.0 -1.0]
[1.0 7.0 10.0 -- -- 10.0 7.0 1.0]
[1.0 7.0 10.0 -- -- 10.0 7.0 1.0]
[-1.0 5.0 9.0 10.0 10.0 9.0 5.0 -1.0]
[-6.0 1.0 5.0 7.0 7.0 5.0 1.0 -6.0]
[-10.0 -6.0 -1.0 1.0 1.0 -1.0 -6.0 -10.0]]
z after
[[-- -6.0 -- -- -- -- -6.0 --]
[-6.0 1.0 5.0 7.0 7.0 5.0 1.0 -6.0]
[-- 5.0 -- -- -- -- 5.0 --]
[-- 7.0 -- -- -- -- 7.0 --]
[-- 7.0 -- -- -- -- 7.0 --]
[-- 5.0 -- -- -- -- 5.0 --]
[-6.0 1.0 5.0 7.0 7.0 5.0 1.0 -6.0]
[-- -6.0 -- -- -- -- -6.0 --]]
dx
[[4.0 8.0 9.0 -- -- 9.0 8.0 4.0]
[4.0 6.0 5.0 4.0 4.0 5.0 6.0 4.0]
[3.0 3.0 2.0 -- -- 2.0 3.0 3.0]
[1.0 1.0 0.0 -- -- 0.0 1.0 1.0]
[-1.0 -1.0 -0.0 -- -- -0.0 -1.0 -1.0]
[-3.0 -3.0 -2.0 -- -- -2.0 -3.0 -3.0]
[-4.0 -6.0 -5.0 -4.0 -4.0 -5.0 -6.0 -4.0]
[-4.0 -8.0 -9.0 -- -- -9.0 -8.0 -4.0]]
dy
[[-- 4.0 -- -- -- -- -4.0 --]
[8.0 6.0 3.0 1.0 -1.0 -3.0 -6.0 -8.0]
[-- 5.0 -- -- -- -- -5.0 --]
[-- 4.0 -- -- -- -- -4.0 --]
[-- 4.0 -- -- -
8000
- -- -4.0 --]
[-- 5.0 -- -- -- -- -5.0 --]
[8.0 6.0 3.0 1.0 -1.0 -3.0 -6.0 -8.0]
[-- 4.0 -- -- -- -- -4.0 --]] I.e many more masked values in the gradients and a modification of the original array. Perhaps we should never have used gradient on masked arrays? |
I created a pull request (#5186) for this issue, which seems like a simple fix. I used |
I'm thinking |
|
Raising an error for e.g. |
Since the second argument is Would it be appropriate to use
I just want to make sure I use an accepted convention for numpy. |
gradient does not match any other conventions, so we're free to do whatever **kwargs is a reasonable way to implement a py2-compatible kwarg-only kwarg_only = set(["edge_order"]) AFAICT that's equivalent to a py3 gradient(x, varargs, *, On Tue, Oct 14, 2014 at 9:36 PM, David M Fobes notifications@github.com
Nathaniel J. Smith |
Argghhh! What @njsmith said. I don't think we would accept gradient as is these days, but at this point it is an heirloom. |
I think I saw something like this as an idiom, as well. But I don't have any opinion on what is best.
|
Oh, that's pretty good :-)
|
Wouldn't be better to show all of the extra keys, not just the first one? |
@tacaswell, yeah it kind of would. But python never does it and inventing a different format just for that seems not worth it. |
…word BUG: Fixes #5184 gradient calculation behavior at boundaries
@jenshnielsen: Could you check whether the recently merged change solves your masked array problems? And if -- as I suspect -- you still see buggy effects with @ everyone who ran into problems from this: do you have an opinion on what the best long-term default here is? |
Previously, operations which created a new masked array from an old masked array -- e.g., np.empty_like -- would tend to result in the new and old arrays sharing the same .mask attribute. This leads to horrible brokenness in which writes to one array affect the other. In particular this was responsible for part of the brokenness that @jenshnielsen reported in numpygh-5184 in which np.gradient on masked arrays would modify the original array's mask. This fixes the worst part of the issues addressed in numpygh-3404, though there's still an argument that we ought to deprecate the mask-copying behaviour entirely so that empty_like returns an array with an empty mask. That can wait until we find someone who cares though. I also applied a small speedup to np.gradient (avoiding one copy); previously this inefficiency was masking (heh) some of the problems with masked arrays, so removing it is both an optimization and makes it easier to test that things are working now.
@jenshnielsen: I looked into this a bit more and just submitted #5203 which should fix the issue with the input array getting modified in place. (Also can I just take this opportunity to whinge that everything about ndarray subclassing is so incoherent that the best one can hope for is to rearrange the kluges, and renew my plea to stop using |
Previously, operations which created a new masked array from an old masked array -- e.g., np.empty_like -- would tend to result in the new and old arrays sharing the same .mask attribute. This leads to horrible brokenness in which writes to one array affect the other. In particular this was responsible for part of the brokenness that @jenshnielsen reported in numpygh-5184 in which np.gradient on masked arrays would modify the original array's mask. This fixes the worst part of the issues addressed in numpygh-3404, though there's still an argument that we ought to deprecate the mask-copying behaviour entirely so that empty_like returns an array with an empty mask. That can wait until we find someone who cares though. I also applied a small speedup to np.gradient (avoiding one copy); previously this inefficiency was masking (heh) some of the problems with masked arrays, so removing it is both an optimization and makes it easier to test that things are working now.
* Previous expected behavior was that the gradient is computed using central differences in the interior and first differences at the boundaries. * gradient was updated in v1.9.0 so that second-order accurate calculations are done at the boundaries, but this breaks expected behavior with old code, so `edge_order` keyword (Default: 1) is added to specify whether first or second order calculations at the boundaries should be used. * Since the second argument is *varargs, in order to maintain compatibility with old code and compatibility with python 2.6 & 2.7, **kwargs is used, and all kwargs that are not `edge_order` raise an error, listing the offending kwargs. * Tests and documentation updated to reflect this change. * Added `.. versionadded:: 1.9.1` to the new optional kwarg `edge_order` documentation, and specified supported `edge_order` kwarg values. * Clarified documentation for `varargs`. * Made indentation in docstring consistent with other docstring styles. * Examples corrected
It seems that numpy.gradient() computes the boundary values incorrectly in version 1.9.0 (but correctly in e.g. version 1.8.1)
To reproduce:
Alternative 1:
although the resulting array should be [(2-1)/1, (4-1)/2, (4-2)/1] = [1, 1.5, 2]
Alternative 2:
Just run the first example in the documentation for numpy.gradient():
The result is
although the documentation says that it should be
the latter also seems right, while the former seems wrong.
(see also http://stackoverflow.com/questions/26361007/numpy-gradient-seems-to-produce-erroneous-boundary-values-using-first-differe)
The text was updated successfully, but these errors were encountered: