FEAT - Implement SmoothQuantileRegression by floriankozikowski · Pull Request #312 · scikit-learn-contrib/skglm · GitHub

FEAT - Implement SmoothQuantileRegression #312


Merged: 19 commits merged into scikit-learn-contrib:main on Jun 25, 2025

Conversation

@floriankozikowski (Contributor) commented on May 23, 2025

Context of the PR

This PR implements a smooth quantile regression estimator using a Huberized loss with progressive smoothing. The goal is to provide a faster alternative to scikit-learn's QuantileRegressor while maintaining similar accuracy.
(closes #276)
(It also aims to simplify the earlier approach from PR #306.)

Contributions of the PR

Added QuantileHuber loss in skglm/experimental/quantile_huber.py

Added SmoothQuantileRegressor class in skglm/experimental/smooth_quantile_regressor.py:

  • Uses FISTA solver with L1 regularization
  • Implements progressive smoothing from delta_init to delta_final
  • Includes intercept updates using gradient steps

Added example in examples/plot_smooth_quantile.py
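For intuition on the Huberized loss, below is a minimal standalone sketch of one common smoothing of the pinball loss: quadratic for |r| <= delta, linear with slopes quantile and (1 - quantile) outside, so it approaches the pinball loss as delta goes to 0. The helper name quantile_huber_loss and the exact piecewise form are illustrative assumptions, not necessarily the formula used in skglm/experimental/quantile_huber.py.

import numpy as np

def quantile_huber_loss(residuals, quantile=0.5, delta=1.0):
    """Smoothed (Huberized) pinball loss, averaged over samples."""
    r = np.asarray(residuals, dtype=float)
    loss = np.where(
        r >= delta, quantile * (r - delta / 2),                    # linear, positive side
        np.where(
            r >= 0, quantile * r**2 / (2 * delta),                 # quadratic, positive side
            np.where(
                r > -delta, (1 - quantile) * r**2 / (2 * delta),   # quadratic, negative side
                (1 - quantile) * (-r - delta / 2),                 # linear, negative side
            ),
        ),
    )
    return loss.mean()

# As delta shrinks, the smoothed loss approaches the pinball loss.
rng = np.random.default_rng(0)
r = rng.standard_normal(100)
pinball = np.mean(r * (0.8 - (r < 0)))
print(quantile_huber_loss(r, quantile=0.8, delta=1e-6), pinball)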

Checks before merging PR

  • added documentation for any new feature
  • added unit tests
  • edited the what's new (if applicable)

@floriankozikowski changed the title from "first try at simple quantile huber" to "WIP - FEAT - Quantile Huber & Progressive Smoothing" on May 23, 2025
return np.mean(residuals * (quantile - (residuals < 0)))


def create_data(n_samples=1000, n_features=10, noise=0.1):
Collaborator:
avoid this: this is literally just wrapping make_regression
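For reference, the data could presumably be generated directly with sklearn, using parameters that mirror the wrapper's defaults shown above:

from sklearn.datasets import make_regression

# Directly generate the toy regression data, no wrapper needed.
X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=0)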

plt.tight_layout()
plt.show()


Collaborator:
no need to wrap in if __name__ == "__main__" for example plots

res += self._loss_scalar(residual)
return res / n_samples

def _loss_scalar(self, residual):
Collaborator:
loss_sample may be a clearer name

grad_j += -X[i, j] * self._grad_scalar(residual)
return grad_j / n_samples

def _grad_scalar(self, residual):
Collaborator:
having gradient_scalar and _grad_scalar is a massive risk of confusion in the future; _grad_per_sample?

return grad_j / n_samples

def _grad_scalar(self, residual):
"""Calculate gradient for a single residual."""
Collaborator:
a single sample


def fit(self, X, y):
"""Fit using progressive smoothing: delta_init --> delta_final."""
X, y = check_X_y(X, y)
Collaborator:
no need to check: GeneralizedLinearEstimator will do it


for i, delta in enumerate(deltas):
datafit = QuantileHuber(quantile=self.quantile, delta=delta)
penalty = L1(alpha=self.alpha)
Collaborator:
those can be taken out of the for loop

Collaborator:
(initialize datafit, penalty, solver and est outside of the loop; then in the loop only update the delta parameter of GLE.datafit)

solver=solver
)

if i > 0:
Collaborator:
this way you won't need this (if est is fixed outside the loop and uses warm_start=True)
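A rough sketch of the restructuring suggested here, assuming the attribute names (self.quantile, self.alpha, self.max_iter, self.tol, self.fit_intercept) and the deltas array used elsewhere in this PR; this is an illustration of the suggestion, not the merged code:

from skglm import GeneralizedLinearEstimator
from skglm.penalties import L1
from skglm.solvers import AndersonCD
from skglm.experimental.quantile_huber import QuantileHuber

# Build datafit, penalty, solver and estimator once, outside the loop.
datafit = QuantileHuber(quantile=self.quantile, delta=deltas[0])
penalty = L1(alpha=self.alpha)
solver = AndersonCD(max_iter=self.max_iter, tol=self.tol,
                    warm_start=True, fit_intercept=self.fit_intercept)
est = GeneralizedLinearEstimator(datafit=datafit, penalty=penalty, solver=solver)

for delta in deltas:
    # Only the smoothing parameter changes between stages; warm_start=True
    # lets each stage restart from the previous stage's coefficients.
    est.datafit.delta = delta
    est.fit(X, y)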

@mathurinm (Collaborator):
Ok, as discussed separately, you need to implement the maths computations to make the datafit work with the FISTA and AndersonCD solvers; then it should be easy to support the intercept, as these solvers rely on update_intercept_step (which is just a coordinate descent step on the intercept, whose Lipschitz constant equals that of a feature filled with 1s).
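To make this remark concrete, here is a hedged, standalone sketch of what such an intercept step amounts to for a mean-over-samples datafit. The helper name is made up, and the curvature bound max(quantile, 1 - quantile) / delta is an assumption derived from the smoothed loss sketched earlier, not skglm's exact code:

import numpy as np

def intercept_step(y, Xw, quantile, delta):
    """One coordinate-descent step on the intercept (illustrative only)."""
    residuals = y - Xw
    # Derivative of the smoothed pinball loss w.r.t. each residual.
    dloss = np.where(
        residuals >= delta, quantile,
        np.where(
            residuals >= 0, quantile * residuals / delta,
            np.where(residuals > -delta,
                     (1 - quantile) * residuals / delta,
                     -(1 - quantile)),
        ),
    )
    # The intercept enters the residual with a minus sign, hence the negation.
    grad = -np.mean(dloss)
    # Lipschitz constant of a hypothetical feature filled with ones:
    # the largest per-sample curvature of the smoothed loss.
    lipschitz = max(quantile, 1 - quantile) / delta
    return grad / lipschitz  # amount to subtract from the current intercept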

f"n_iter={est.n_iter_}"
)

self.est = est
Collaborator:
since this is an attribute that exists only after fitting, call it est_ with a trailing underscore, in the sklearn convention

@mathurinm (Collaborator) left a comment:

Thanks @floriankozikowski

As a sanity check I wanted to see if, with delta going to 0, we get the same solution as sklearn. However, the solver seems to get stuck in debug_quantile.py if I use a delta_final smaller than its current value of 0.01. Can you investigate? Setting the inner solver to verbose mode may help.
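One possible shape for that sanity check, sketched with the constructor parameters shown elsewhere in this PR; accessing the coefficients through the fitted inner estimator est_ is an assumption, and the same alpha is used for both estimators as in the test below:

import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import QuantileRegressor
from skglm.experimental.smooth_quantile_regressor import SmoothQuantileRegressor

X, y = make_regression(n_samples=200, n_features=10, noise=0.1, random_state=0)
quantile = 0.7

sk_est = QuantileRegressor(quantile=quantile, alpha=0.1, solver="highs").fit(X, y)
smooth_est = SmoothQuantileRegressor(
    quantile=quantile, alpha=0.1, delta_init=0.5, delta_final=1e-5,
).fit(X, y)

# With delta_final close to 0 both estimators solve (almost) the same problem,
# so the coefficients should nearly coincide.
print(np.max(np.abs(sk_est.coef_ - smooth_est.est_.coef_)))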


print(
f" Stage {i+1:2d}: delta={delta:.4f}, "
f"coverage={coverage:.3f}, pinball_loss={pinball_loss:.6f}, "
Collaborator:
no need to print coverage here, it's a statistical value, we're more interested in the value of delta and of the loss

Contributor Author:
Okay, I understand, as it's not an optimization metric and might make things too verbose, so I removed it. However, coverage is still really important to see if the quantile regression is well-calibrated. I think it might make sense to include it in the unit test or the example script to ensure the model is predicting the correct quantile level. What do you think?
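For reference, the coverage discussed here can presumably be computed as the empirical fraction of targets falling at or below the predictions, which should be close to the requested quantile level for a well-calibrated model:

import numpy as np

def coverage(y, y_pred):
    """Fraction of observations at or below the prediction.

    For a well-calibrated quantile-0.8 model this should be close to 0.8.
    """
    return np.mean(y <= y_pred)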

Collaborator:
I'm -1 on this; we focus more on the convergence part. If the model is not adapted to the data, it's not skglm's fault. What we want to check is only that we solve the optimization problem correctly (same for LinearRegression, we do not check whether the residuals are homoscedastic).

pinball_loss = np.mean(residuals * (self.quantile - (residuals < 0)))

print(
f" Stage {i+1:2d}: delta={delta:.4f}, "
Collaborator:
since delta goes to zero, use scientific notation: delta:.2e

otherwise, for delta less than 1e-4, this prints 0.0000


def predict(self, X):
"""Predict using the fitted model."""
if not hasattr(self, "est"):
Collaborator:
change to est_ here too

def predict(self, X):
"""Predict using the fitted model."""
if not hasattr(self, "est"):
raise ValueError("Call 'fit' before 'predict'.")
Collaborator:
copy the error message that is printed when you predict with an unfitted LinearRegression() in sklearn, and raise a NotFittedError
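One way to follow this suggestion is sketched below with sklearn's standard utilities; check_is_fitted raises NotFittedError with sklearn's usual wording, whose exact text may vary between versions, and delegating to self.est_.predict is an assumption about the rest of the class:

from sklearn.utils.validation import check_is_fitted

def predict(self, X):
    """Predict using the fitted model."""
    # Raises NotFittedError with sklearn's standard message, e.g.
    # "This SmoothQuantileRegressor instance is not fitted yet. Call 'fit'
    # with appropriate arguments before using this estimator."
    check_is_fitted(self, "est_")
    return self.est_.predict(X)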

sk_est = QuantileRegressor(quantile=quantile, alpha=0.1, solver='highs').fit(X, y)
smooth_est = SmoothQuantileRegressor(
quantile=quantile,
alpha=0.1,
Collaborator:
@floriankozikowski alpha is chosen by hand, just check that it's not too high (= it does not give you coefficients all equal to 0)
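A quick guard against an overly large alpha, building on the test snippet above; accessing the coefficients through the fitted inner estimator est_ is an assumption:

# Sanity check that alpha=0.1 does not shrink everything to zero.
assert np.any(smooth_est.est_.coef_ != 0), "alpha too high: all coefficients are 0"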

delta_init=0.5,
delta_final=0.00001,
n_deltas=15,
verbose=True,
Collaborator:
no verbose in tests usually

delta_final=0.00001,
n_deltas=15,
verbose=True,
fit_intercept=True,
Collaborator:
make this a parameter of the test

# Use AndersonCD solver
solver = AndersonCD(max_iter=self.max_iter, tol=self.tol,
warm_start=True, fit_intercept=self.fit_intercept,
verbose=3)
Collaborator:
this should be verbose=max(0, verbose - 1), so that passing verbose >= 2 makes the inner solver verbose

@mathurinm changed the title from "WIP - FEAT - Quantile Huber & Progressive Smoothing" to "FEAT - Implement SmoothQuantileRegression" on Jun 16, 2025
@mathurinm (Collaborator):
Only some minor changes needed, thanks @floriankozikowski!

@@ -3,3 +3,4 @@
Version 0.5 (in progress)
-------------------------
- Add support for fitting an intercept in :ref:`SqrtLasso <skglm.experimental.sqrt_lasso.SqrtLasso>` (PR: :gh:`298`)
- Add experimental QuantileHuber and SmoothQuantileRegressor for quantile regression, and an example script (PR: :gh:`312`).
Collaborator:
Add a link like in the line above

@@ -0,0 +1,78 @@
"""
Collaborator:
delete this file

@@ -0,0 +1,72 @@
"""
Collaborator:
look at some other plotting file for the format

Collaborator:
  • write some short text to explain what each of the 3 or 4 blocks of code does


__all__ = [
IterativeReweightedL1,
PDCD_WS,
Pinball,
SqrtQuadratic,
SqrtLasso,
QuantileHuber,
Collaborator:
use alphabetical order

@@ -2,11 +2,14 @@
from .sqrt_lasso import SqrtLasso, SqrtQuadratic
Collaborator:
  • edit api.rst

from numba import float64
from skglm.datafits.base import BaseDatafit
from sklearn.base import BaseEstimator, RegressorMixin
from skglm.solvers import AndersonCD
Collaborator:
put all skglm imports together in their own block

quantile : float, default=0.5
Desired quantile level between 0 and 1.
delta : float, default=1.0
Width of quadratic region.
Collaborator:
Suggested change:
- Width of quadratic region.
+ Smoothing parameter (0 means no smoothing).

@mathurinm (Collaborator):
@Badr-MOUFAD LGTM, merge if happy

@floriankozikowski (Contributor Author):
@mathurinm as discussed, I just pushed sparsity support. Please, one of you two, review shortly before merging!

@mathurinm merged commit c40d4dc into scikit-learn-contrib:main on Jun 25, 2025
4 checks passed
Successfully merging this pull request may close these issues:

PDCD_WS solver seems unstable for Pinball Loss.