torch.cdist returns inconsistent result #37734
Both ways should have the same precision properties. It is very strange that they return different results: 0.0004 is quite a large difference.

Attachment: bug.pt.gz

cc @jlin27 @mruberry
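The actual reproducer is in the attached bug.pt.gz and is not shown inline. Assuming the "two ways" are the direct and the mm-based Euclidean paths, as the discussion below suggests, a minimal sketch of the comparison (with hypothetical random data sized to match the 39-element reductions mentioned later) might look like this:

```python
import torch

torch.manual_seed(0)
x1 = torch.rand(40, 39)  # hypothetical inputs in the 0-1 range
x2 = torch.rand(40, 39)

# Direct path: accumulates the squared differences pair by pair.
d_direct = torch.cdist(x1, x2, p=2,
                       compute_mode='donot_use_mm_for_euclid_dist')

# mm path: expands ||a - b||^2 = ||a||^2 - 2<a, b> + ||b||^2 via matmul.
d_mm = torch.cdist(x1, x2, p=2,
                   compute_mode='use_mm_for_euclid_dist')

print((d_direct - d_mm).abs().max())  # can be surprisingly large in float32
```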
The first way also produces:
I wonder if the cause is a different block reduction configuration being used; if so, the problem is really nasty: the precision is quite high and only 39 elements are being reduced.
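A toy illustration (hypothetical data) of that suspected mechanism: in float32, merely regrouping a 39-element reduction can change the rounding.

```python
import torch

torch.manual_seed(0)
v = torch.rand(39)

seq = torch.zeros(())
for x in v:          # strictly sequential left-to-right accumulation
    seq = seq + x
blocked = v[:20].sum() + v[20:].sum()  # a different block split

print(seq - blocked)  # often a nonzero multiple of float32 eps
```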
The issue is in pytorch/aten/src/ATen/native/Distance.cpp, lines 89 to 96 (at commit 090ea77):
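Paraphrasing the mm-based Euclidean path that those lines select (a Python sketch, not the actual C++ source):

```python
import torch

def euclidean_dist_via_mm(x1, x2):
    # Sketch of the expansion ||a - b||^2 = ||a||^2 - 2<a, b> + ||b||^2.
    x1_norm = x1.pow(2).sum(-1, keepdim=True)
    x2_norm = x2.pow(2).sum(-1, keepdim=True)
    res = x1_norm - 2 * x1 @ x2.transpose(-2, -1) + x2_norm.transpose(-2, -1)
    # Cancellation can make res slightly negative, hence the clamp.
    return res.clamp_min(0).sqrt()
```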
Oh, I see. It makes sense that there's a difference, but isn't the difference too large? All of these inputs are in the 0-1 range, which is quite a dense region.
Note that running this same code on my Mac gives:
Which is not great, but not as bad :D But I think it is still worth adding a note in the docs stating that the mm version might be less precise.
It's also 0.0003 on the GPU, so MKL must be doing something funny.
We have also seen that the number of threads used on CPU can have a large impact on the precision of these functions in worst-case scenarios.
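Pinning the thread count is one way to probe this (a sketch; whether the difference is actually nonzero depends on the build and the shapes):

```python
import torch

torch.manual_seed(0)
x1, x2 = torch.rand(2000, 64), torch.rand(2000, 64)

torch.set_num_threads(1)
d1 = torch.cdist(x1, x2, compute_mode='donot_use_mm_for_euclid_dist')

torch.set_num_threads(8)
d8 = torch.cdist(x1, x2, compute_mode='donot_use_mm_for_euclid_dist')

print((d1 - d8).abs().max())  # may be nonzero if the parallel split differs
```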
In reality, the underlying precision can be even worse: the result is probably exactly zero because it's negative and then clamped.
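To illustrate with a toy example: for two nearly identical rows, the true squared distance is far below float32 rounding noise, so the expanded form often comes out slightly negative, and the clamp then reports exactly zero, hiding the size of the underlying error.

```python
import torch

torch.manual_seed(0)
a = torch.rand(39)
b = a + 1e-7                      # two nearly identical rows

n_a, n_b = a.pow(2).sum(), b.pow(2).sum()
raw = n_a - 2 * a.dot(b) + n_b    # analytically ~3.9e-13, far below float32
                                  # rounding noise at ||a||^2 of about 13
print(raw)                        # often slightly negative
print(raw.clamp_min(0).sqrt())    # what the mm path reports: exactly 0.0
```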
It's only 39 reduced values per mm cell here. I hope only one thread per cell is used :) Could it also be that a different accumulator width is used?
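One way to see which path drifts (a sketch with hypothetical sizes) is to compare both float32 paths against a float64 reference:

```python
import torch

torch.manual_seed(0)
x1, x2 = torch.rand(40, 39), torch.rand(40, 39)

# float64 direct computation as the reference
ref = torch.cdist(x1.double(), x2.double(),
                  compute_mode='donot_use_mm_for_euclid_dist')

for mode in ('donot_use_mm_for_euclid_dist', 'use_mm_for_euclid_dist'):
    d = torch.cdist(x1, x2, compute_mode=mode)
    print(mode, (d.double() - ref).abs().max().item())
```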
About error analysis: I guess it'd be something similar to https://www.johndcook.com/blog/2008/09/28/theoretical-explanation-for-numerical-results/. That post shows that the matrix-multiply method is much worse than the difference method (for standard deviation computation).
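The analogous cancellation is easy to reproduce for variance (a toy example in the spirit of that post): the one-pass "textbook" formula E[x^2] - E[x]^2 loses almost everything when the mean dominates the spread, while the two-pass difference formula does not.

```python
import torch

torch.manual_seed(0)
x = 1e4 + torch.rand(10_000)  # float32: large mean, small spread

var_textbook = x.pow(2).mean() - x.mean().pow(2)  # one pass, cancels badly
var_twopass = (x - x.mean()).pow(2).mean()        # difference method

print(var_textbook.item(), var_twopass.item())  # textbook can even go negative
```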
About this, @ngimel: do we have documented anywhere the single- vs. double-precision accumulators that are used for CPU/GPU operators?
No (in part because we are not too consistent about it, and in part because for calls offloaded to libraries, such as matrix multiplication, it would depend on the underlying library).
A similar issue in scikit-learn, also with fingers pointed at MKL: scikit-learn/scikit-learn#11711
I guess the actionable "feature requests" are:
Oh wow. I didn't know cdist supported choosing the computation algorithm: #42479