compute_class_weight() class param behaviour · Issue #4327 · scikit-learn/scikit-learn

Closed
trevorstephens opened this issue Mar 3, 2015 · 2 comments

@trevorstephens
Contributor

Not sure if it's relevant to the motivation behind the implementation as discussed in #4324, but when y contains two classes and both are listed in the classes param, the weights come out summing to the number of classes:

compute_class_weight('auto', [0, 1], iris.target[0:100])
array([ 1.,  1.])

While a three-class y array with only two of the classes present in the classes param does something different altogether:

compute_class_weight('auto', [0, 1], iris.target[0:120])
array([ 0.66666667,  0.66666667])
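
The arithmetic behind the two outputs above can be sketched in plain Python. This is a reconstruction, not scikit-learn's code: it assumes the 'auto' heuristic divides each class's reciprocal frequency by the mean reciprocal frequency over every class found in y, which is consistent with both results. The helper name auto_weights_sketch is hypothetical, and the label lists stand in for iris.target[0:100] and iris.target[0:120] (50 samples each of classes 0 and 1, plus 20 of class 2).

```python
from collections import Counter


def auto_weights_sketch(classes, y):
    """Hypothetical stand-in for the 'auto' heuristic (an assumption, see text)."""
    counts = Counter(y)
    # Reciprocal frequency of every label that occurs in y.
    recip_freq = {label: 1.0 / n for label, n in counts.items()}
    # The mean is taken over ALL labels in y, not only those in `classes`.
    mean_recip = sum(recip_freq.values()) / len(recip_freq)
    return [recip_freq[c] / mean_recip for c in classes]


# Stand-ins for iris.target[0:100] and iris.target[0:120].
y_100 = [0] * 50 + [1] * 50
y_120 = [0] * 50 + [1] * 50 + [2] * 20

w_two_class = auto_weights_sketch([0, 1], y_100)
w_three_class = auto_weights_sketch([0, 1], y_120)
print(w_two_class, w_three_class)
```

Under that assumption the third class still enters the normalising mean even though it is left out of classes, which is why the two returned weights drop to about 0.667 each.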

I had sidestepped this in compute_sample_weight in #4190 by determining the present classes from y itself. I'm happy to open a PR to remove the param, and was going to, but while the function is somewhat private, it is exposed in partial_fit in BaseSGDClassifier:

"In order to use 'auto' weights, use compute_class_weight('auto', classes, y)."

So does this need a deprecation warning? Some more discussion?
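
For illustration, the workaround of taking the classes from y itself might look like the sketch below. It is not the code from #4190; the helper name and the normalisation (reciprocal frequency divided by the mean reciprocal frequency) are assumptions chosen to match the outputs quoted above. With the classes derived from y, the weights sum to the number of classes regardless of how many classes y holds.

```python
from collections import Counter


def weights_from_y_sketch(y):
    """Hypothetical helper: derive the classes from y, then weight them."""
    counts = Counter(y)
    classes = sorted(counts)  # classes actually present in y
    recip_freq = [1.0 / counts[c] for c in classes]
    mean_recip = sum(recip_freq) / len(recip_freq)
    weights = [r / mean_recip for r in recip_freq]
    return classes, weights


# Stand-in for iris.target[0:120]: 50, 50 and 20 samples of classes 0, 1, 2.
y_120 = [0] * 50 + [1] * 50 + [2] * 20
classes, weights = weights_from_y_sketch(y_120)
print(classes, weights, sum(weights))
```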

@trevorstephens
Contributor Author

Oh yeah. The idea being that the classes param could be removed from compute_class_weight.

amueller added the Bug label Mar 3, 2015
@amueller
Member
amueller commented Mar 5, 2015

The reason we need classes is that if the user actually specifies class_weight as a dict, the keys will be whatever they used as classes. And I guess we want to give an error if the user specified a class that does not appear.
But it could be that it only does not appear in the current y.
I'm not sure this actually happens, but I think it is the motivation.
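
A minimal sketch of that motivation, in plain Python rather than scikit-learn's actual code: the dict keys are checked against the full classes list, not against the labels in the current y, so a class that is merely missing from one batch is still accepted while an unknown class raises an error. The helper name dict_weights_sketch, the default weight of 1.0 for unlisted classes, and the error message are all assumptions made for illustration.

```python
def dict_weights_sketch(class_weight, classes):
    """Hypothetical helper: map a user dict of weights onto `classes`."""
    weights = [1.0] * len(classes)  # assumed default for unlisted classes
    for label, weight in class_weight.items():
        if label not in classes:
            # The user named a class that is unknown altogether.
            raise ValueError(f"Class label {label!r} not present in classes.")
        weights[classes.index(label)] = weight
    return weights


all_classes = [0, 1, 2]
batch_y = [0, 0, 1]  # class 2 does not occur in this particular batch

# Class 2 is absent from batch_y but listed in all_classes, so this succeeds.
weights = dict_weights_sketch({0: 1.0, 2: 5.0}, all_classes)

# Class 3 is not a known class at all, so this is rejected.
try:
    dict_weights_sketch({3: 2.0}, all_classes)
    unknown_class_rejected = False
except ValueError:
    unknown_class_rejected = True

print(weights, unknown_class_rejected)
```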

trevorstephens added a commit to trevorstephens/scikit-learn that referenced this issue Nov 17, 2015
glouppe added a commit that referenced this issue Feb 11, 2016
[MRG] Fix for missing classes found in y - Fixes #4327
glemaitre pushed a commit to glemaitre/scikit-learn that referenced this issue Feb 13, 2016
mannby pushed a commit to mannby/scikit-learn that referenced this issue Apr 22, 2016
TomDLT pushed a commit to TomDLT/scikit-learn that referenced this issue Oct 3, 2016