ValueError in distance matrix with agglomerative clustering #10076
Comments
You could get NaN cosine values if a vector has no non-zero elements. Is this possible in your case?
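To illustrate the point about all-zero vectors, here is a minimal NumPy sketch of the textbook cosine distance, 1 - (x · y) / (‖x‖ ‖y‖). The helper `naive_cosine_distances` and the toy matrix `X` are made up for this example and are not the code path scikit-learn or SciPy actually uses; the sketch only shows why a zero norm leads to a 0/0 division and hence a non-finite entry.

```python
import numpy as np


def naive_cosine_distances(X):
    """Textbook cosine distance with no guard against zero norms (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1)
    # Any pair involving an all-zero row divides 0 by 0, which yields NaN.
    with np.errstate(divide="ignore", invalid="ignore"):
        similarity = (X @ X.T) / np.outer(norms, norms)
    return 1.0 - similarity


# Toy feature matrix (assumption): row 0 stands for a document with no
# in-vocabulary terms, so its feature vector is all zeros.
X = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 2.0, 0.0],
    [0.0, 1.0, 3.0],
])

D = naive_cosine_distances(X)
has_non_finite = not np.isfinite(D).all()
print(D)
print("contains non-finite values:", has_non_finite)
```

Every distance that involves the zero row comes out as NaN, while the distance between the two non-zero rows stays finite.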
Hi @jnothman,
So is it fine to close this?
Yes, although I wonder whether the distance between two zero-valued vectors should simply be zero instead of non-finite. Do you think it makes sense to track this down, or is this expected behaviour?
Actually this is a duplicate of #7689, so see there...
For me, the problem was that the gram_matrix contained identical observations, which meant that the condensed distance matrix contained only zeros.
I've discovered that all 1s will cause the same error. I searched for these
Description
A ValueError is thrown when applying AgglomerativeClustering to textual data because the distance matrix contains infinite values.
Steps/Code to Reproduce
Expected Results
No error is thrown, and the distance matrix does not contain infinite values.
Actual Results
Versions
Comment
I used the same code on a subset of the Reuters-21578 text data set and no error was thrown. I was not able to track down what might have caused the infinite values in the distance matrix.
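One way to narrow down the cause, assuming it is the all-zero feature vectors suggested in the comments, is to look for empty rows before clustering. The sketch below is only an illustration: `find_zero_rows` is a hypothetical helper, the matrix is toy data, and it assumes a dense array (a sparse TF-IDF matrix would need converting or a sparse-aware check).

```python
import numpy as np


def find_zero_rows(X):
    """Return the indices of rows with no non-zero entries (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    return np.flatnonzero(~X.any(axis=1))


# Toy feature matrix (assumption): row 1 has no non-zero features.
X = np.array([
    [1.0, 0.0],
    [0.0, 0.0],
    [0.0, 2.0],
])

zero_rows = find_zero_rows(X)
# Drop the empty rows so every remaining vector has a non-zero norm.
X_clean = np.delete(X, zero_rows, axis=0)
print("all-zero rows:", zero_rows.tolist())
print("remaining shape:", X_clean.shape)
```

If this reports any rows, those documents are likely candidates for the non-finite distances; whether to drop them or handle them separately depends on the data.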