fowlkes_mallows_score returns nan in binary classification #8101
Comments
The implementation looks quite strange, and I'm not persuaded (despite reviewing cosmetic things in the original PR, where there were two other reviews) that the testing suite is sufficient, or even that the reference examples are correct. It's clear from the score > 1 that something's broken. However, your claim about the correct values ignores the fact that the true positives here are defined as pairs of points that are co-clustered in both y_true and y_pred. You're also missing that, unlike the classification "confusion matrix" interpretation, clustering metrics need to be invariant to a permutation of the labels. So you should get fmi(y_true, y_pred) == fmi(y_true, 1 - y_pred) == fmi(1 - y_true, y_pred) == fmi(1 - y_true, 1 - y_pred) for y_true and y_pred in (0, 1). TP should be calculated as the number of pairs of samples that are co-clustered in both y_true and y_pred.
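That invariance is straightforward to check with sklearn's fowlkes_mallows_score; a minimal sketch, where the small binary label arrays are made up purely for illustration and are not taken from the report:

```python
import numpy as np
from sklearn.metrics import fowlkes_mallows_score

# Made-up binary labelings, purely for illustration.
y_true = np.array([0, 0, 1, 1, 0, 1])
y_pred = np.array([0, 1, 1, 1, 0, 0])

# A clustering metric must not change when cluster labels are renamed,
# so flipping 0 <-> 1 in either argument should leave the score untouched.
base = fowlkes_mallows_score(y_true, y_pred)
assert np.isclose(base, fowlkes_mallows_score(y_true, 1 - y_pred))
assert np.isclose(base, fowlkes_mallows_score(1 - y_true, y_pred))
assert np.isclose(base, fowlkes_mallows_score(1 - y_true, 1 - y_pred))
```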
Thanks for your response. You're right, the code example above does not account for label permutations; I was only using this algorithm for classification performance evaluation.
FMI isn't for classification, precisely because it does not identify cluster labels with particular classes.
Though I think, ignoring the pairwise factor, the formula is more generally the geometric mean of precision and recall, which is, I think, used on occasion for classification (rarely; the harmonic mean, i.e. the F-score, and the arithmetic mean are more common).
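For classification use, the three means mentioned here can be compared directly from precision and recall; a small sketch with made-up labels:

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score, f1_score

# Made-up binary classification results, just to compare the three means.
y_true = [1, 0, 1, 1, 0, 1, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1, 1, 1]

p = precision_score(y_true, y_pred)
r = recall_score(y_true, y_pred)

geometric = np.sqrt(p * r)           # the FMI-like mean discussed above
harmonic = f1_score(y_true, y_pred)  # the usual F-score
arithmetic = (p + r) / 2
print(geometric, harmonic, arithmetic)
```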
I suppose that Calinski-Harabasz should also be reviewed for correctness.
You're right: what I want in order to rate the classification is the geometric mean of precision and recall, so FMI isn't quite right. Anyway, I found this bug along the way.
Thank you. I think it would be a good idea to have something like
Hi, I want to work on this one if it's still open.
It appears to be open. I await your PR. |
tk, pk and qk follow the same equation as the reference given in the documentation. The above code gave me an error on
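For reference, a sketch of what tk, pk and qk presumably denote in terms of the contingency matrix, following the documented pair-counting formula; this is a reading of the reference, not necessarily the library's actual code:

```python
import numpy as np
from sklearn.metrics.cluster import contingency_matrix

def fmi_from_contingency(labels_true, labels_pred):
    """Sketch of FMI via pair counts (a reading of the documented formula)."""
    c = contingency_matrix(labels_true, labels_pred).astype(np.int64)  # wide dtype to avoid overflow
    n = c.sum()
    tk = np.sum(c ** 2) - n              # 2 * (pairs co-clustered in both labelings)
    pk = np.sum(c.sum(axis=0) ** 2) - n  # 2 * (pairs co-clustered in labels_pred)
    qk = np.sum(c.sum(axis=1) ** 2) - n  # 2 * (pairs co-clustered in labels_true)
    # The factors of 2 cancel in the ratio; cast to float before multiplying
    # so the product under the square root cannot overflow either.
    return tk / np.sqrt(float(pk) * float(qk)) if tk != 0 else 0.0
```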
Hi @gan3sh500, thanks for your response. But could you please let me handle this bug?
I can't really analyse these small remarks. Things get much more concrete with a pull request and test cases. |
@jnothman, working on it. Will create a pull request soon.
@felix-last, I am not able to generate values that are not between 0 and 1, or nan.
@jnothman, the reference is calculating the same thing, but in a different way (see the start of page 554 for the proof). Please correct me if you find something inconsistent about these facts.
I'm happy if the implementation is correct, and I find that believable. It still needs more tests, IMO; it only seems to have one non-trivial test.
Perhaps @felix-last is passing arrays containing nan value(s) as input; I don't know exactly.
@devanshdalal, have you tried with two large input vectors like |
I tried using your exact program. I was unable to reproduce any of the errors you reported; everything seemed fine. We are using different configurations; can that be the issue? This is what I am getting.
@jnothman, I have just started here. Please direct me on where and what kind of tests you want to add, and I will add those test cases.
I wonder if there are overflow errors happening on Windows leading to the NaN...?
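For what it's worth, 32-bit integer overflow would explain both reported symptoms: a pair count that wraps to a negative number makes the square root return nan, while one that wraps to a wrong positive number can push the score outside [0, 1]. A minimal sketch of the mechanism only (not the library code), assuming NumPy's default integer is 32-bit on Windows and an arbitrarily chosen vector size:

```python
import numpy as np

n = 60000  # an arbitrary but plausible size for "large binary classification vectors"

# Squared pair counts at this scale exceed the 32-bit range and wrap around.
wrapped = np.int32(n) * np.int32(n)  # overflows; yields a negative number here
exact = np.int64(n) * np.int64(n)    # 3600000000, as expected

print(wrapped, exact)
print(np.sqrt(np.float64(wrapped)))  # sqrt of a negative count -> nan
```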
@devanshdalal: I'd just hoped there was more than one unit test using a hand-crafted example and checking the result was correct. Indeed, a test for symmetry would also be appropriate.
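Something like the following is presumably the kind of test being asked for; a sketch only, with the expected value for the hand-crafted example worked out by hand from the pair counts (2 pairs co-clustered in both labelings, 3 in the prediction, 6 in the ground truth):

```python
import numpy as np
from numpy.testing import assert_almost_equal
from sklearn.metrics import fowlkes_mallows_score

def test_fowlkes_mallows_score_hand_crafted():
    # Two true clusters of three points; the prediction splits them into three pairs.
    # TP = 2, TP + FP = 3, TP + FN = 6, so FMI = 2 / sqrt(3 * 6) ~= 0.4714.
    score = fowlkes_mallows_score([0, 0, 0, 1, 1, 1], [0, 0, 1, 1, 2, 2])
    assert_almost_equal(score, 2.0 / np.sqrt(3.0 * 6.0))

def test_fowlkes_mallows_score_properties():
    rng = np.random.RandomState(0)
    a = rng.randint(0, 2, size=1000)
    b = rng.randint(0, 2, size=1000)
    # Symmetric in its arguments, and invariant to relabeling the clusters.
    assert_almost_equal(fowlkes_mallows_score(a, b), fowlkes_mallows_score(b, a))
    assert_almost_equal(fowlkes_mallows_score(a, b), fowlkes_mallows_score(a, 1 - b))
```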
@jnothman, could it be a bug in the NumPy functions we are using on Windows? I will try on my Windows machine and see if the errors persist. I will add the tests in the next pull request for this issue.
Description
fowlkes_mallows_score doesn't work properly for large binary classification vectors. It returns values that are not between 0 and 1, or returns nan. In general, the equation shown in the documentation doesn't yield the same results as the function.
Steps/Code to Reproduce
Edited by @jnothman: this reference implementation is incorrect. See comment below.
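The original snippet is not reproduced above, so the following is only a sketch of the kind of call being described, assuming large random binary label vectors of an arbitrarily chosen size:

```python
import numpy as np
from sklearn.metrics import fowlkes_mallows_score

# Large random binary label vectors; the size here is an arbitrary choice.
rng = np.random.RandomState(42)
y_true = rng.randint(0, 2, size=100000)
y_pred = rng.randint(0, 2, size=100000)

# On the affected setup this reportedly yields nan or a value outside [0, 1];
# a correct implementation returns a score in [0, 1].
print(fowlkes_mallows_score(y_true, y_pred))
```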
Expected Results
Should be 0.487888392921
Is 0.487888392921
Should be 0.548853049023
Is 0.548853049023
Actual Results
Should be 0.487888392921
Is 15.3260054113
Should be 0.548853049023
Is 0.501109879279
Versions
Windows-10-10.0.10586-SP0
Python 3.5.2 |Anaconda custom (64-bit)| (default, Jul 5 2016, 11:41:13) [MSC v.1900 64 bit (AMD64)]
NumPy 1.11.2
SciPy 0.18.1
Scikit-Learn 0.18.1