ENH Added penalty='none' to LogisticRegression (#12860) · xhluca/scikit-learn@8adbddb · GitHub

Commit 8adbddb

NicolasHug authored and Xing committed
ENH Added penalty='none' to LogisticRegression (scikit-learn#12860)
1 parent e2e45a0 commit 8adbddb

4 files changed: +94 additions, -23 deletions

doc/modules/linear_model.rst

Lines changed: 8 additions & 6 deletions
@@ -731,7 +731,7 @@ or the log-linear classifier. In this model, the probabilities describing the po
 The implementation of logistic regression in scikit-learn can be accessed from
 class :class:`LogisticRegression`. This implementation can fit binary, One-vs-
 Rest, or multinomial logistic regression with optional L2, L1 or Elastic-Net
-regularization.
+regularization. Note that regularization is applied by default.

 As an optimization problem, binary class L2 penalized logistic regression
 minimizes the following cost function:

@@ -771,11 +771,11 @@ classifiers. For L1 penalization :func:`sklearn.svm.l1_min_c` allows to
 calculate the lower bound for C in order to get a non "null" (all feature
 weights to zero) model.

-The "lbfgs", "sag" and "newton-cg" solvers only support L2 penalization and
-are found to converge faster for some high dimensional data. Setting
-`multi_class` to "multinomial" with these solvers learns a true multinomial
-logistic regression model [5]_, which means that its probability estimates
-should be better calibrated than the default "one-vs-rest" setting.
+The "lbfgs", "sag" and "newton-cg" solvers only support L2 penalization or no
+regularization, and are found to converge faster for some high dimensional
+data. Setting `multi_class` to "multinomial" with these solvers learns a true
+multinomial logistic regression model [5]_, which means that its probability
+estimates should be better calibrated than the default "one-vs-rest" setting.

 The "sag" solver uses a Stochastic Average Gradient descent [6]_. It is faster
 than other solvers for large datasets, when both the number of samples and the

@@ -808,6 +808,8 @@ The following table summarizes the penalties supported by each solver:
 +------------------------------+-----------------+-------------+-----------------+-----------+------------+
 | Elastic-Net                  | no              | no          | no              | no        | yes        |
 +------------------------------+-----------------+-------------+-----------------+-----------+------------+
+| No penalty ('none')          | no              | yes         | yes             | yes       | yes        |
++------------------------------+-----------------+-------------+-----------------+-----------+------------+
 | **Behaviors**                |                                                                          |
 +------------------------------+-----------------+-------------+-----------------+-----------+------------+
 | Penalize the intercept (bad) | yes             | no          | no              | no        | no         |
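
The new table row is easy to check interactively. A minimal sketch, assuming scikit-learn 0.21 where this commit landed (newer releases spell the option penalty=None):

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    X, y = make_classification(n_samples=200, random_state=0)

    # Default: L2 regularization with C=1.0 is applied.
    clf_l2 = LogisticRegression(solver='lbfgs').fit(X, y)

    # penalty='none' disables regularization for lbfgs/newton-cg/sag/saga.
    clf_none = LogisticRegression(penalty='none', solver='lbfgs').fit(X, y)

    # Unregularized coefficients are typically larger in magnitude.
    print(abs(clf_l2.coef_).max(), abs(clf_none.coef_).max())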

doc/whats_new/v0.21.rst

Lines changed: 6 additions & 0 deletions
@@ -101,6 +101,12 @@ Support for Python 3.4 and below has been officially dropped.
   :class:`linear_model.LogisticRegressionCV` now support Elastic-Net penalty,
   with the 'saga' solver. :issue:`11646` by :user:`Nicolas Hug <NicolasHug>`.

+- |Enhancement| :class:`linear_model.LogisticRegression` now supports an
+  unregularized objective by setting ``penalty`` to ``'none'``. This is
+  equivalent to setting ``C=np.inf`` with l2 regularization. Not supported
+  by the liblinear solver. :issue:`12860` by :user:`Nicolas Hug
+  <NicolasHug>`.
+
 - |Fix| Fixed a bug in :class:`linear_model.LogisticRegression` and
   :class:`linear_model.LogisticRegressionCV` with 'saga' solver, where the
   weights would not be correctly updated in some cases.
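
The equivalence claimed in the changelog entry can be spot-checked directly. A sketch mirroring the test added below, assuming scikit-learn 0.21+:

    import numpy as np

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    X, y = make_classification(n_samples=1000, random_state=0)

    lr_none = LogisticRegression(penalty='none', solver='lbfgs',
                                 random_state=0).fit(X, y)
    lr_inf = LogisticRegression(penalty='l2', C=np.inf, solver='lbfgs',
                                random_state=0).fit(X, y)

    # Internally penalty='none' is rewritten to penalty='l2' with C=np.inf,
    # so the two fits should make identical predictions.
    assert (lr_none.predict(X) == lr_inf.predict(X)).all()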

sklearn/linear_model/logistic.py

Lines changed: 45 additions & 16 deletions
@@ -437,13 +437,13 @@ def _check_solver(solver, penalty, dual):
         raise ValueError("Logistic Regression supports only solvers in %s, got"
                          " %s." % (all_solvers, solver))

-    all_penalties = ['l1', 'l2', 'elasticnet']
+    all_penalties = ['l1', 'l2', 'elasticnet', 'none']
     if penalty not in all_penalties:
         raise ValueError("Logistic Regression supports only penalties in %s,"
                          " got %s." % (all_penalties, penalty))

-    if solver not in ['liblinear', 'saga'] and penalty != 'l2':
-        raise ValueError("Solver %s supports only l2 penalties, "
+    if solver not in ['liblinear', 'saga'] and penalty not in ('l2', 'none'):
+        raise ValueError("Solver %s supports only 'l2' or 'none' penalties, "
                          "got %s penalty." % (solver, penalty))
     if solver != 'liblinear' and dual:
         raise ValueError("Solver %s supports only "

@@ -452,6 +452,12 @@ def _check_solver(solver, penalty, dual):
     if penalty == 'elasticnet' and solver != 'saga':
         raise ValueError("Only 'saga' solver supports elasticnet penalty,"
                          " got solver={}.".format(solver))
+
+    if solver == 'liblinear' and penalty == 'none':
+        raise ValueError(
+            "penalty='none' is not supported for the liblinear solver"
+        )
+
     return solver
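
A quick demonstration of the new guard, as a sketch assuming this commit is installed:

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    X, y = make_classification(random_state=0)

    try:
        # liblinear has no code path for an unpenalized objective.
        LogisticRegression(penalty='none', solver='liblinear').fit(X, y)
    except ValueError as exc:
        print(exc)  # penalty='none' is not supported for the liblinear solver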

@@ -1205,24 +1211,27 @@ class LogisticRegression(BaseEstimator, LinearClassifierMixin,
     'sag', 'saga' and 'newton-cg' solvers.)

     This class implements regularized logistic regression using the
-    'liblinear' library, 'newton-cg', 'sag', 'saga' and 'lbfgs' solvers. It can
-    handle both dense and sparse input. Use C-ordered arrays or CSR matrices
-    containing 64-bit floats for optimal performance; any other input format
-    will be converted (and copied).
+    'liblinear' library, 'newton-cg', 'sag', 'saga' and 'lbfgs' solvers. **Note
+    that regularization is applied by default**. It can handle both dense
+    and sparse input. Use C-ordered arrays or CSR matrices containing 64-bit
+    floats for optimal performance; any other input format will be converted
+    (and copied).

     The 'newton-cg', 'sag', and 'lbfgs' solvers support only L2 regularization
-    with primal formulation. The 'liblinear' solver supports both L1 and L2
-    regularization, with a dual formulation only for the L2 penalty. The
-    Elastic-Net regularization is only supported by the 'saga' solver.
+    with primal formulation, or no regularization. The 'liblinear' solver
+    supports both L1 and L2 regularization, with a dual formulation only for
+    the L2 penalty. The Elastic-Net regularization is only supported by the
+    'saga' solver.

     Read more in the :ref:`User Guide <logistic_regression>`.

     Parameters
     ----------
-    penalty : str, 'l1', 'l2', or 'elasticnet', optional (default='l2')
+    penalty : str, 'l1', 'l2', 'elasticnet' or 'none', optional (default='l2')
         Used to specify the norm used in the penalization. The 'newton-cg',
         'sag' and 'lbfgs' solvers support only l2 penalties. 'elasticnet' is
-        only supported by the 'saga' solver.
+        only supported by the 'saga' solver. If 'none' (not supported by the
+        liblinear solver), no regularization is applied.

         .. versionadded:: 0.19
            l1 penalty with SAGA solver (allowing 'multinomial' + L1)
@@ -1289,8 +1298,10 @@ class LogisticRegression(BaseEstimator, LinearClassifierMixin,
         - For multiclass problems, only 'newton-cg', 'sag', 'saga' and 'lbfgs'
           handle multinomial loss; 'liblinear' is limited to one-versus-rest
           schemes.
-        - 'newton-cg', 'lbfgs' and 'sag' only handle L2 penalty, whereas
-          'liblinear' and 'saga' handle L1 penalty.
+        - 'newton-cg', 'lbfgs', 'sag' and 'saga' handle L2 or no penalty
+        - 'liblinear' and 'saga' also handle L1 penalty
+        - 'saga' also supports 'elasticnet' penalty
+        - 'liblinear' does not handle no penalty

         Note that 'sag' and 'saga' fast convergence is only guaranteed on
         features with approximately the same scale. You can
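
The support matrix spelled out in the docstring bullets above can also be probed empirically. A throwaway sketch, assuming this commit; convergence warnings on the small dataset are harmless:

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    X, y = make_classification(random_state=0)

    for solver in ('liblinear', 'lbfgs', 'newton-cg', 'sag', 'saga'):
        for penalty in ('l1', 'l2', 'elasticnet', 'none'):
            # elasticnet additionally requires an l1_ratio in [0, 1]
            kwargs = {'l1_ratio': 0.5} if penalty == 'elasticnet' else {}
            try:
                LogisticRegression(penalty=penalty, solver=solver,
                                   **kwargs).fit(X, y)
                supported = 'yes'
            except ValueError:
                supported = 'no'
            print('%-9s %-10s %s' % (solver, penalty, supported))
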
@@ -1491,6 +1502,18 @@ def fit(self, X, y, sample_weight=None):
             warnings.warn("l1_ratio parameter is only used when penalty is "
                           "'elasticnet'. Got "
                           "(penalty={})".format(self.penalty))
+        if self.penalty == 'none':
+            if self.C != 1.0:  # default values
+                warnings.warn(
+                    "Setting penalty='none' will ignore the C and l1_ratio "
+                    "parameters"
+                )
+                # Note that check for l1_ratio is done right above
+            C_ = np.inf
+            penalty = 'l2'
+        else:
+            C_ = self.C
+            penalty = self.penalty
         if not isinstance(self.max_iter, numbers.Number) or self.max_iter < 0:
             raise ValueError("Maximum number of iteration must be positive;"
                              " got (max_iter=%r)" % self.max_iter)
@@ -1570,13 +1593,13 @@ def fit(self, X, y, sample_weight=None):
             prefer = 'processes'
         fold_coefs_ = Parallel(n_jobs=self.n_jobs, verbose=self.verbose,
                                **_joblib_parallel_args(prefer=prefer))(
-            path_func(X, y, pos_class=class_, Cs=[self.C],
+            path_func(X, y, pos_class=class_, Cs=[C_],
                       l1_ratio=self.l1_ratio, fit_intercept=self.fit_intercept,
                       tol=self.tol, verbose=self.verbose, solver=solver,
                       multi_class=multi_class, max_iter=self.max_iter,
                       class_weight=self.class_weight, check_input=False,
                       random_state=self.random_state, coef=warm_start_coef_,
-                      penalty=self.penalty, max_squared_sum=max_squared_sum,
+                      penalty=penalty, max_squared_sum=max_squared_sum,
                       sample_weight=sample_weight)
             for class_, warm_start_coef_ in zip(classes_, warm_start_coef))

@@ -1968,6 +1991,12 @@ def fit(self, X, y, sample_weight=None):

         l1_ratios_ = [None]

+        if self.penalty == 'none':
+            raise ValueError(
+                "penalty='none' is not useful and not supported by "
+                "LogisticRegressionCV."
+            )
+
         X, y = check_X_y(X, y, accept_sparse='csr', dtype=np.float64,
                          order="C",
                          accept_large_sparse=solver != 'liblinear')
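
With no regularization strength C left to cross-validate, LogisticRegressionCV rejects the option outright. A sketch of the resulting behavior:

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegressionCV

    X, y = make_classification(random_state=0)

    try:
        LogisticRegressionCV(penalty='none').fit(X, y)
    except ValueError as exc:
        # penalty='none' is not useful and not supported by
        # LogisticRegressionCV.
        print(exc)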

sklearn/linear_model/tests/test_logistic.py

Lines changed: 35 additions & 1 deletion
@@ -234,7 +234,7 @@ def test_check_solver_option(LR):

     # all solvers except 'liblinear' and 'saga'
     for solver in ['newton-cg', 'lbfgs', 'sag']:
-        msg = ("Solver %s supports only l2 penalties, got l1 penalty." %
+        msg = ("Solver %s supports only 'l2' or 'none' penalties," %
                solver)
         lr = LR(solver=solver, penalty='l1', multi_class='ovr')
         assert_raise_message(ValueError, msg, lr.fit, X, y)

@@ -253,6 +253,11 @@ def test_check_solver_option(LR):
         lr = LR(solver=solver, penalty='elasticnet')
         assert_raise_message(ValueError, msg, lr.fit, X, y)

+    # liblinear does not support penalty='none'
+    msg = "penalty='none' is not supported for the liblinear solver"
+    lr = LR(penalty='none', solver='liblinear')
+    assert_raise_message(ValueError, msg, lr.fit, X, y)
+

 @pytest.mark.parametrize('model, params, warn_solver',
                          [(LogisticRegression, {}, True),
@@ -1754,3 +1759,32 @@ def test_logistic_regression_path_deprecation():
     assert_warns_message(DeprecationWarning,
                          "logistic_regression_path was deprecated",
                          logistic_regression_path, X, Y1)
+
+
+@pytest.mark.parametrize('solver', ('lbfgs', 'newton-cg', 'sag', 'saga'))
+def test_penalty_none(solver):
+    # - Make sure warning is raised if penalty='none' and C is set to a
+    #   non-default value.
+    # - Make sure setting penalty='none' is equivalent to setting C=np.inf
+    #   with l2 penalty.
+    X, y = make_classification(n_samples=1000, random_state=0)
+
+    msg = "Setting penalty='none' will ignore the C"
+    lr = LogisticRegression(penalty='none', solver=solver, C=4)
+    assert_warns_message(UserWarning, msg, lr.fit, X, y)
+
+    lr_none = LogisticRegression(penalty='none', solver=solver,
+                                 random_state=0)
+    lr_l2_C_inf = LogisticRegression(penalty='l2', C=np.inf, solver=solver,
+                                     random_state=0)
+    pred_none = lr_none.fit(X, y).predict(X)
+    pred_l2_C_inf = lr_l2_C_inf.fit(X, y).predict(X)
+    assert_array_equal(pred_none, pred_l2_C_inf)
+
+    lr = LogisticRegressionCV(penalty='none')
+    assert_raise_message(
+        ValueError,
+        "penalty='none' is not useful and not supported by "
+        "LogisticRegressionCV",
+        lr.fit, X, y
+    )
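
To run just the new test from a scikit-learn development checkout (assuming pytest is installed):

    pytest sklearn/linear_model/tests/test_logistic.py -k test_penalty_none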
