FEAT - Implement SmoothQuantileRegression by floriankozikowski · Pull Request #312 · scikit-learn-contrib/skglm · GitHub

FEAT - Implement SmoothQuantileRegression #312


Merged: 19 commits merged into scikit-learn-contrib:main on Jun 25, 2025

Conversation

@floriankozikowski (Contributor) commented on May 23, 2025

Context of the PR

This PR implements a smooth quantile regression estimator using a Huberized loss with progressive smoothing. The goal is to provide a faster alternative to scikit-learn's QuantileRegressor while maintaining similar accuracy.
(closes #276)
(It also aims to simplify the earlier approach from PR #306.)

Contributions of the PR

Added QuantileHuber loss in skglm/experimental/quantile_huber.py

Added SmoothQuantileRegressor class in skglm/experimental/smooth_quantile_regressor.py:

  • Uses FISTA solver with L1 regularization
  • Implements progressive smoothing from delta_init to delta_final
  • Includes intercept updates using gradient steps

Added example in examples/plot_smooth_quantile.py
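For intuition on the Huberized loss, below is a minimal standalone sketch of one common smoothing of the pinball loss: quadratic for |r| <= delta, linear with slopes quantile and (1 - quantile) outside, so it approaches the pinball loss as delta goes to 0. The helper name quantile_huber_loss and the exact piecewise form are illustrative assumptions, not necessarily the formula used in skglm/experimental/quantile_huber.py.

import numpy as np

def quantile_huber_loss(residuals, quantile=0.5, delta=1.0):
    """Smoothed (Huberized) pinball loss, averaged over samples."""
    r = np.asarray(residuals, dtype=float)
    loss = np.where(
        r >= delta, quantile * (r - delta / 2),                    # linear, positive side
        np.where(
            r >= 0, quantile * r**2 / (2 * delta),                 # quadratic, positive side
            np.where(
                r > -delta, (1 - quantile) * r**2 / (2 * delta),   # quadratic, negative side
                (1 - quantile) * (-r - delta / 2),                 # linear, negative side
            ),
        ),
    )
    return loss.mean()

# As delta shrinks, the smoothed loss approaches the pinball loss.
rng = np.random.default_rng(0)
r = rng.standard_normal(100)
pinball = np.mean(r * (0.8 - (r < 0)))
print(quantile_huber_loss(r, quantile=0.8, delta=1e-6), pinball)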

Checks before merging PR

  • added documentation for any new feature
  • added unit tests
  • edited the what's new (if applicable)

@floriankozikowski changed the title from "first try at simple quantile huber" to "WIP - FEAT - Quantile Huber & Progressive Smoothing" on May 23, 2025
return np.mean(residuals * (quantile - (residuals < 0)))


def create_data(n_samples=1000, n_features=10, noise=0.1):
Collaborator:
avoid this: this is literally just wrapping make_regression
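For reference, the data could presumably be generated directly with sklearn, using parameters that mirror the wrapper's defaults shown above:

from sklearn.datasets import make_regression

# Directly generate the toy regression data, no wrapper needed.
X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=0)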

plt.tight_layout()
plt.show()


Collaborator:
no need to wrap in if __name__ == "__main__" for example plots

res += self._loss_scalar(residual)
return res / n_samples

def _loss_scalar(self, residual):
Collaborator:
loss_sample may be a clearer name

grad_j += -X[i, j] * self._grad_scalar(residual)
return grad_j / n_samples

def _grad_scalar(self, residual):
Collaborator:
having gradient_scalar and _grad_scalar is a massive risk of confusion in the future; _grad_per_sample?

return grad_j / n_samples

def _grad_scalar(self, residual):
"""Calculate gradient for a single residual."""
Collaborator:
a single sample


def fit(self, X, y):
"""Fit using progressive smoothing: delta_init --> delta_final."""
X, y = check_X_y(X, y)
Collaborator:
no need to check: GeneralizedLinearEstimator will do it


for i, delta in enumerate(deltas):
datafit = QuantileHuber(quantile=self.quantile, delta=delta)
penalty = L1(alpha=self.alpha)
Collaborator:
those can be taken out of the for loop

Collaborator:
(initialize datafit, penalty, solver and est outside of the loop; then in the loop only update the delta parameter of GLE.datafit)

solver=solver
)

if i > 0:
Collaborator:
this way you won't need this (if est is fixed outside the loop and uses warm_start=True)
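A rough sketch of the restructuring suggested here, assuming the attribute names (self.quantile, self.alpha, self.max_iter, self.tol, self.fit_intercept) and the deltas array used elsewhere in this PR; this is an illustration of the suggestion, not the merged code:

from skglm import GeneralizedLinearEstimator
from skglm.penalties import L1
from skglm.solvers import AndersonCD
from skglm.experimental.quantile_huber import QuantileHuber

# Build datafit, penalty, solver and estimator once, outside the loop.
datafit = QuantileHuber(quantile=self.quantile, delta=deltas[0])
penalty = L1(alpha=self.alpha)
solver = AndersonCD(max_iter=self.max_iter, tol=self.tol,
                    warm_start=True, fit_intercept=self.fit_intercept)
est = GeneralizedLinearEstimator(datafit=datafit, penalty=penalty, solver=solver)

for delta in deltas:
    # Only the smoothing parameter changes between stages; warm_start=True
    # lets each stage restart from the previous stage's coefficients.
    est.datafit.delta = delta
    est.fit(X, y)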

@mathurinm (Collaborator):
Ok, as discussed separately, you need to implement the maths computations to make the datafit work with the FISTA and AndersonCD solvers; then it should be easy to support the intercept, as these solvers rely on update_intercept_step (which is just a coordinate descent step on the intercept, whose Lipschitz constant equals that of a feature filled with 1s).
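To make this remark concrete, here is a hedged, standalone sketch of what such an intercept step amounts to for a mean-over-samples datafit. The helper name is made up, and the curvature bound max(quantile, 1 - quantile) / delta is an assumption derived from the smoothed loss sketched earlier, not skglm's exact code:

import numpy as np

def intercept_step(y, Xw, quantile, delta):
    """One coordinate-descent step on the intercept (illustrative only)."""
    residuals = y - Xw
    # Derivative of the smoothed pinball loss w.r.t. each residual.
    dloss = np.where(
        residuals >= delta, quantile,
        np.where(
            residuals >= 0, quantile * residuals / delta,
            np.where(residuals > -delta,
                     (1 - quantile) * residuals / delta,
                     -(1 - quantile)),
        ),
    )
    # The intercept enters the residual with a minus sign, hence the negation.
    grad = -np.mean(dloss)
    # Lipschitz constant of a hypothetical feature filled with ones:
    # the largest per-sample curvature of the smoothed loss.
    lipschitz = max(quantile, 1 - quantile) / delta
    return grad / lipschitz  # amount to subtract from the current intercept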

f"n_iter={est.n_iter_}"
)

self.est = est
Collaborator:
since this is an attribute that exists only after fitting, call it est_ with a trailing underscore, in the sklearn convention

@mathurinm (Collaborator) left a comment:

Thanks @floriankozikowski

As a sanity check I wanted to see if, with delta going to 0, we get the same solution as sklearn. However, the solver seems to get stuck in debug_quantile.py if I use a delta_final smaller than its current value of 0.01. Can you investigate? Setting the inner solver to verbose mode may help.
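One possible shape for that sanity check, sketched with the constructor parameters shown elsewhere in this PR; accessing the coefficients through the fitted inner estimator est_ is an assumption, and the same alpha is used for both estimators as in the test below:

import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import QuantileRegressor
from skglm.experimental.smooth_quantile_regressor import SmoothQuantileRegressor

X, y = make_regression(n_samples=200, n_features=10, noise=0.1, random_state=0)
quantile = 0.7

sk_est = QuantileRegressor(quantile=quantile, alpha=0.1, solver="highs").fit(X, y)
smooth_est = SmoothQuantileRegressor(
    quantile=quantile, alpha=0.1, delta_init=0.5, delta_final=1e-5,
).fit(X, y)

# With delta_final close to 0 both estimators solve (almost) the same problem,
# so the coefficients should nearly coincide.
print(np.max(np.abs(sk_est.coef_ - smooth_est.est_.coef_)))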


print(
f" Stage {i+1:2d}: delta={delta:.4f}, "
f"coverage={coverage:.3f}, pinball_loss={pinball_loss:.6f}, "
Collaborator:
no need to print coverage here, it's a statistical value, we're more interested in the value of delta and of the loss

Contributor Author:
Okay, I understand, as it's not an optimization metric and might make things too verbose, so I removed it. However, coverage is still really important to see if the quantile regression is well-calibrated. I think it might make sense to include it in the unit test or the example script to ensure the model is predicting the correct quantile level. What do you think?
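For reference, the coverage discussed here can presumably be computed as the empirical fraction of targets falling at or below the predictions, which should be close to the requested quantile level for a well-calibrated model:

import numpy as np

def coverage(y, y_pred):
    """Fraction of observations at or below the prediction.

    For a well-calibrated quantile-0.8 model this should be close to 0.8.
    """
    return np.mean(y <= y_pred)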

Collaborator:
I'm -1 on this; we focus more on the convergence part. If the model is not adapted to the data, it's not skglm's fault. What we want to check is only that we solve the optimization problem correctly (same for LinearRegression, we do not check whether the residuals are homoscedastic).

pinball_loss = np.mean(residuals * (self.quantile - (residuals < 0)))

print(
f" Stage {i+1:2d}: delta={delta:.4f}, "
Collaborator:
since delta goes to zero, use scientific notation: delta:.2e

otherwise, for delta less than 1e-4, this prints 0.0000


def predict(self, X):
"""Predict using the fitted model."""
if not hasattr(self, "est"):
Collaborator:
change to est_ here too

def predict(self, X):
"""Predict using the fitted model."""
if not hasattr(self, "est"):
raise ValueError("Call 'fit' before 'predict'.")
Collaborator:
copy the error message that is printed when you predict with an unfitted LinearRegression() in sklearn, and raise a NotFittedError
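One way to follow this suggestion is sketched below with sklearn's standard utilities; check_is_fitted raises NotFittedError with sklearn's usual wording, whose exact text may vary between versions, and delegating to self.est_.predict is an assumption about the rest of the class:

from sklearn.utils.validation import check_is_fitted

def predict(self, X):
    """Predict using the fitted model."""
    # Raises NotFittedError with sklearn's standard message, e.g.
    # "This SmoothQuantileRegressor instance is not fitted yet. Call 'fit'
    # with appropriate arguments before using this estimator."
    check_is_fitted(self, "est_")
    return self.est_.predict(X)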

sk_est = QuantileRegressor(quantile=quantile, alpha=0.1, solver='highs').fit(X, y)
smooth_est = SmoothQuantileRegressor(
quantile=quantile,
alpha=0.1,
Collaborator:
@floriankozikowski alpha is chosen by hand, just check that it's not too high (= it does not give you coefficients all equal to 0)
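A quick guard against an overly large alpha, building on the test snippet above; accessing the coefficients through the fitted inner estimator est_ is an assumption:

# Sanity check that alpha=0.1 does not shrink everything to zero.
assert np.any(smooth_est.est_.coef_ != 0), "alpha too high: all coefficients are 0"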

delta_init=0.5,
delta_final=0.00001,
n_deltas=15,
verbose=True,
Collaborator:
no verbose in tests usually

delta_final=0.00001,
n_deltas=15,
verbose=True,
fit_intercept=True,
Collaborator:
make this a parameter of the test

# Use AndersonCD solver
solver = AndersonCD(max_iter=self.max_iter, tol=self.tol,
warm_start=True, fit_intercept=self.fit_intercept,
verbose=3)
Collaborator:
this should be verbose=max(0, verbose - 1), so that passing verbose >= 2 makes the inner solver verbose

@mathurinm changed the title from "WIP - FEAT - Quantile Huber & Progressive Smoothing" to "FEAT - Implement SmoothQuantileRegression" on Jun 16, 2025
@mathurinm (Collaborator):
Only some minor changes needed, thanks @floriankozikowski!

@@ -3,3 +3,4 @@
Version 0.5 (in progress)
-------------------------
- Add support for fitting an intercept in :ref:`SqrtLasso <skglm.experimental.sqrt_lasso.SqrtLasso>` (PR: :gh:`298`)
- Add experimental QuantileHuber and SmoothQuantileRegressor for quantile regression, and an example script (PR: :gh:`312`).
Collaborator:
Add a link like in the line above

@@ -0,0 +1,78 @@
"""
Collaborator:
delete this file

@@ -0,0 +1,72 @@
"""
Collaborator:
look at some other plotting file for the format

Collaborator:
  • write some short text to explain what each of the 3 or 4 blocks of code does


__all__ = [
IterativeReweightedL1,
PDCD_WS,
Pinball,
SqrtQuadratic,
SqrtLasso,
QuantileHuber,
Collaborator:
use alphabetical order

@@ -2,11 +2,14 @@
from .sqrt_lasso import SqrtLasso, SqrtQuadratic
Collaborator:
  • edit api.rst

from numba import float64
from skglm.datafits.base import BaseDatafit
from sklearn.base import BaseEstimator, RegressorMixin
from skglm.solvers import AndersonCD
Collaborator:
put all skglm imports together in their own block

quantile : float, default=0.5
Desired quantile level between 0 and 1.
delta : float, default=1.0
Width of quadratic region.
Collaborator:
Suggested change:
- Width of quadratic region.
+ Smoothing parameter (0 means no smoothing).

@mathurinm (Collaborator):
@Badr-MOUFAD LGTM, merge if happy

@floriankozikowski (Contributor Author):
@mathurinm as discussed, I just pushed sparsity support. Please, one of you two, review shortly before merging!

@mathurinm merged commit c40d4dc into scikit-learn-contrib:main on Jun 25, 2025
4 checks passed
Successfully merging this pull request may close these issues:

PDCD_WS solver seems unstable for Pinball Loss.