wrong input into entropy function #14497
Comments
working on it |
Are you sure? Our entropy implementation normalises the distribution.
|
I think you might be right. I just saw that entropy uses np.unique, which I believe takes care of the issue. |
Please consider this example: from collections import Counter. As you can see, vec_a is a cluster vector and does not have the same length as freq_table. The issue here is not about normalizing the input (which would keep the same length), but about passing a cluster vector as input instead of a frequency table. |
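The distinction raised in this comment can be sketched with the standard library alone. This is a minimal illustration, not scikit-learn's or SciPy's implementation: the helper names `entropy_from_counts` and `entropy_from_labels` and the values in `vec_a` are hypothetical. It shows that normalising a label vector as if it were a distribution gives a different number than counting the labels first.

```python
from collections import Counter
from math import log


def entropy_from_counts(counts):
    """Shannon entropy (in nats) of a frequency table; counts are normalised first."""
    counts = list(counts)
    total = sum(counts)
    return -sum((c / total) * log(c / total) for c in counts if c > 0)


def entropy_from_labels(labels):
    """Entropy of a label vector: count each distinct label, then use the counts."""
    return entropy_from_counts(Counter(labels).values())


# Hypothetical cluster vector: 8 samples assigned to 3 clusters.
vec_a = [0, 0, 1, 1, 2, 2, 2, 2]
# Frequency table of the same labels: one count per distinct label.
freq_table = list(Counter(vec_a).values())

# Counting the labels first gives the entropy of the labeling.
print(entropy_from_labels(vec_a))
print(entropy_from_counts(freq_table))
# Treating the raw label values as weights gives a different, meaningless value.
print(entropy_from_counts(vec_a))
```

A function that counts labels internally, as `entropy_from_labels` does here, can safely take the label vector directly; a function that only normalises its input cannot.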
We don't use stats.entropy. Our entropy does the right thing for this input.
|
Thanks. Last question: can you point me to the source of the entropy function used here? I cannot find it.
On Mon, Jul 29, 2019 at 15:31 Joel Nothman ***@***.***> wrote:
We don't use stats.entropy. our entropy does the right thing for this
input.
--
Sincerely
NGUYEN Minh Phuc (Mr)
|
It's defined in the same file.
|
Thanks, problem solved. The root cause is that I was not aware that the averaging method used in my sklearn version is ‘max’, instead of the ‘arithmetic’ used in 0.22. |
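Why the averaging method changes the result can be sketched as follows. This is a simplified illustration, assuming the score divides a mutual-information value by a normaliser computed from the two label entropies and ignoring any chance adjustment; the function name `normalizer` and all numeric values are hypothetical, not taken from scikit-learn.

```python
def normalizer(h_true, h_pred, average_method="arithmetic"):
    """Combine two entropies into one normalising constant (illustrative only)."""
    if average_method == "max":
        return max(h_true, h_pred)
    if average_method == "arithmetic":
        return (h_true + h_pred) / 2
    raise ValueError(f"unsupported average_method: {average_method!r}")


# Hypothetical entropies of the true and predicted labelings, and a
# hypothetical mutual-information value between them.
h_true, h_pred = 1.0, 2.0
mi = 0.5

# The same inputs give different scores under the two averaging methods.
score_max = mi / normalizer(h_true, h_pred, "max")
score_arithmetic = mi / normalizer(h_true, h_pred, "arithmetic")
print(score_max, score_arithmetic)
```

Because the ‘max’ normaliser is never smaller than the arithmetic mean, the ‘max’ variant of such a score is never larger than the ‘arithmetic’ one for the same inputs.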
How would you want to update the example? |
Closing this one as it is quite old, the main issue has been solved and there was no answer to the last question. |
scikit-learn/sklearn/metrics/cluster/supervised.py
Line 751 in 74ae6a0
entropy must be given a frequency table of labels_true/labels_pred as input, not those arrays directly.
Example solution