Use sample_weight when validating LogisticRegressionCV #25906
Comments
I don't think this is a bug per se that we can/should fix in the current state of affairs. The result of |
As a user, for me it is also clear that it is a bug and I would appreciate it if it could be fixed sooner. However, I understand that this is part of a much deeper issue with lots of factors to consider. Knowing that there is work moving in that direction #24498 as part of SLEP006 is comforting enough. |
I have the feeling that both |
only pass |
We could special case Fixing only the |
That Stack Overflow discussion reminded me of this joke. |
So what happens if the user passes
What happens when they accept it but through
What if a scorer doesn't support sample weight in version x, but does in version x+1?
What if the sub-estimator doesn't support sample weights, but the scorer does?
We have thought about these in the context of SLEP6, and I don't think the proposal here, to always pass it around if we can (which is not well defined), is really a good solution. |
we could warn first and raise in a later release.
best effort we pass them and hope for the best :)
warn in old version and then work correctly in new version
raise an error: why would passing sample weight to fit make sense in this case?
What I described above would be the way we should aim for (I think). If we implement this via SLEP6, this should be the default behavior, without explicit requests. If the users configure explicit routing, they can override the default behavior. |
Considering the complications of fixing this w/o SLEP006, is the resolution to fix it in the context of the slep, or to fix it regardless? |
We had a drafting meeting on SLEP6 integration yesterday, and by using the new routing mechanism we will have a way forward to implement a good routing policy for this case with a global setting (which might be the default when SLEP6 is enabled; this is yet to be decided). See the working proposal in #26050. |
For |
Metadata Routing #24027 will add much needed support for taking into account sample_weight when cross-validating. However, the current implementation of LogisticRegressionCV doesn't seem to be taking advantage of this:
scikit-learn/sklearn/linear_model/_logistic.py
Line 778 in 63e364a
Therefore, the scores used for choosing the correct hyperparameter will still be misleading even when the tools for solving this become available.
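To make the concern concrete, here is a toy sketch in plain Python (not scikit-learn's actual code) of how ignoring sample_weight at validation time can change which candidate wins. The labels, weights, and the two prediction vectors are made up for illustration; `pred_a` and `pred_b` stand in for two hypothetical candidate models, e.g. two values of `C`.

```python
def accuracy(y_true, y_pred, sample_weight=None):
    """Fraction of correct predictions, optionally weighted per sample."""
    if sample_weight is None:
        sample_weight = [1.0] * len(y_true)
    correct = sum(w for t, p, w in zip(y_true, y_pred, sample_weight) if t == p)
    return correct / sum(sample_weight)


# Hypothetical validation fold (all values are illustrative assumptions).
y_val = [0, 0, 0, 0, 1]
weights = [1.0, 1.0, 1.0, 1.0, 10.0]  # the last sample matters most
pred_a = [0, 0, 0, 0, 0]  # candidate A: always predicts the majority class
pred_b = [0, 0, 1, 1, 1]  # candidate B: gets the heavily weighted sample right

# Score both candidates without and with the validation weights.
unweighted = {"a": accuracy(y_val, pred_a), "b": accuracy(y_val, pred_b)}
weighted = {
    "a": accuracy(y_val, pred_a, weights),
    "b": accuracy(y_val, pred_b, weights),
}

# Pick the best candidate under each scoring rule.
best_unweighted = max(unweighted, key=unweighted.get)
best_weighted = max(weighted, key=weighted.get)
print(best_unweighted, best_weighted)
```

In this made-up fold the unweighted score prefers candidate A while the weighted score prefers candidate B, which is the kind of mismatch described above when weights are used for fitting but not for scoring.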