10000 ENH - fit intercept in ``ProxNewton`` solver by Badr-MOUFAD · Pull Request #77 · scikit-learn-contrib/skglm · GitHub
[go: up one dir, main page]

Skip to content

ENH - fit intercept in ProxNewton solver #77

New issue < 8000 details-dialog class="Box Box--overlay d-flex flex-column anim-fade-in fast overflow-auto" aria-label="Sign up for GitHub">

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
Oct 16, 2022

Conversation

Badr-MOUFAD
Copy link
Collaborator
@Badr-MOUFAD Badr-MOUFAD commented Oct 3, 2022

This adds unpenalized intercept support to ProxNewton solver and closes #68.
It proceeds as follows

  • implement fit intercept
  • add unitest
  • perform benchmarks

View benchmark results
Link to benchmark repository

@Badr-MOUFAD
Copy link
Collaborator Author

here are the benchmark results and the link to repo to reproduce.

I have one concern: I cannot explain why we don't have the same initial objective value with Blitz in the case of alpha=alpha_max (reg=1)

any thoughts?

@mathurinm
Copy link
Collaborator

I'd test the output (intercept and coef_) outside of benchopt for these two solvers on the problematic data. Check that both get 0 (but I don't think they should, cf our discussion on alpha_max with intercept)

@Badr-MOUFAD
Copy link
Collaborator Author

I'd test the output (intercept and coef_) outside of benchopt for these two solvers on the problematic data. Check that both get 0 (but I don't think they should, cf our discussion on alpha_max with intercept)

The reason behind this is: Blitz performs an intercept update before entering the iter loop.
cf. https://github.com/tbjohns/BlitzL1/blob/8c1d2f6251307021ff4c8e2027df2d92b10d7f63/src/solver.cpp#L442-L443

So nothing surprising in such behavior.

@mathurinm mathurinm requested a review from QB3 October 5, 2022 08:34
past_grads = np.zeros(len(ws))
X_delta_w_ws = np.zeros(X.shape[0])
w_ws = w_epoch[ws]
ws_intercept = np.append(ws, -1) if fit_intercept else ws
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line of code if duplicated 4 times in the file.
It feels like it should be factorized, but I do not have a simple solution to do it right now.

@mathurinm mathurinm requested a review from PABannier October 9, 2022 14:45
Copy link
Collaborator
@PABannier PABannier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Really good work @Badr-MOUFAD thanks!

Co-authored-by: PAB <pierreantoine.bannier@gmail.com>
@@ -24,6 +24,9 @@ class ProxNewton(BaseSolver):
tol : float, default 1e-4
Tolerance for convergence.

fit_intercept : bool, default True
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My experience with sklearn is that it's quite confusing to fit an intercept by default. Should it be set to False here ?

Copy link
Collaborator Author
@Badr-MOUFAD Badr-MOUFAD Oct 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes agree! But what about #77 (comment)?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick (this can be discussed later IMO):
This means that it would be set to False by default in ProxNewton, and set to True by default in SparseLogisticRegression. This might lead to errors in the future

Copy link
Collaborator
@QB3 QB3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the great work @Badr-MOUFAD , LGTM!

@@ -24,6 +24,9 @@ class ProxNewton(BaseSolver):
tol : float, default 1e-4
Tolerance for convergence.

fit_intercept : bool, default True
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick (this can be discussed later IMO):
This means that it would be set to False by default in ProxNewton, and set to True by default in SparseLogisticRegression. This might lead to errors in the future

@mathurinm mathurinm merged commit db46d69 into scikit-learn-contrib:main Oct 16, 2022
@mathurinm
Copy link
Collaborator

Thanks @Badr-MOUFAD !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ENH fit intercept in prox newton solver
4 participants
0