@@ -954,7 +954,7 @@ Solvers
-------

The solvers implemented in the class :class:`LogisticRegression`
- are "liblinear", "newton-cg", "lbfgs", "sag" and "saga":
+ are "lbfgs", "liblinear", "newton-cg", "newton-cholesky", "sag" and "saga":

The solver "liblinear" uses a coordinate descent (CD) algorithm, and relies
on the excellent C++ `LIBLINEAR library
@@ -968,7 +968,7 @@ classifiers. For :math:`\ell_1` regularization :func:`sklearn.svm.l1_min_c` allo
calculate the lower bound for C in order to get a non "null" (all feature
weights to zero) model.

- The "lbfgs", "sag" and "newton-cg" solvers only support :math:`\ell_2`
+ The "lbfgs", "newton-cg" and "sag" solvers only support :math:`\ell_2`
regularization or no regularization, and are found to converge faster for some
high-dimensional data. Setting `multi_class` to "multinomial" with these solvers
learns a true multinomial logistic regression model [5]_, which means that its
@@ -989,33 +989,41 @@ Broyden–Fletcher–Goldfarb–Shanno algorithm [8]_, which belongs to
quasi-Newton methods. The "lbfgs" solver is recommended for use for
small data-sets but for larger datasets its performance suffers. [9]_

+ The "newton-cholesky" solver is an exact Newton solver that calculates the hessian
+ matrix and solves the resulting linear system. It is a very good choice for
+ `n_samples` >> `n_features`, but has a few shortcomings: Only :math:`\ell_2`
+ regularization is supported. Furthermore, because the hessian matrix is explicitly
+ computed, the memory usage has a quadratic dependency on `n_features` as well as on
+ `n_classes`. As a consequence, only the one-vs-rest scheme is implemented for the
+ multiclass case.
+
The following table summarizes the penalties supported by each solver:

- +------------------------------+-----------------+-------------+-----------------+-----------+------------+
- |                              |                               **Solvers**                                |
- +------------------------------+-----------------+-------------+-----------------+-----------+------------+
- | **Penalties**                | **'liblinear'** | **'lbfgs'** | **'newton-cg'** | **'sag'** | **'saga'** |
- +------------------------------+-----------------+-------------+-----------------+-----------+------------+
- | Multinomial + L2 penalty     |       no        |     yes     |       yes       |    yes    |    yes     |
- +------------------------------+-----------------+-------------+-----------------+-----------+------------+
- | OVR + L2 penalty             |       yes       |     yes     |       yes       |    yes    |    yes     |
- +------------------------------+-----------------+-------------+-----------------+-----------+------------+
- | Multinomial + L1 penalty     |       no        |     no      |       no        |    no     |    yes     |
- +------------------------------+-----------------+-------------+-----------------+-----------+------------+
- | OVR + L1 penalty             |       yes       |     no      |       no        |    no     |    yes     |
- +------------------------------+-----------------+-------------+-----------------+-----------+------------+
- | Elastic-Net                  |       no        |     no      |       no        |    no     |    yes     |
- +------------------------------+-----------------+-------------+-----------------+-----------+------------+
- | No penalty ('none')          |       no        |     yes     |       yes       |    yes    |    yes     |
- +------------------------------+-----------------+-------------+-----------------+-----------+------------+
- | **Behaviors**                |                                                                          |
- +------------------------------+-----------------+-------------+-----------------+-----------+------------+
- | Penalize the intercept (bad) |       yes       |     no      |       no        |    no     |    no      |
- +------------------------------+-----------------+-------------+-----------------+-----------+------------+
- | Faster for large datasets    |       no        |     no      |       no        |    yes    |    yes     |
- +------------------------------+-----------------+-------------+-----------------+-----------+------------+
- | Robust to unscaled datasets  |       yes       |     yes     |       yes       |    no     |    no      |
- +------------------------------+-----------------+-------------+-----------------+-----------+------------+
+ +------------------------------+-----------------+-------------+-----------------+-----------------------+-----------+------------+
+ |                              |                                           **Solvers**                                            |
+ +------------------------------+-------------+-----------------+-----------------+-----------------------+-----------+------------+
+ | **Penalties**                | **'lbfgs'** | **'liblinear'** | **'newton-cg'** | **'newton-cholesky'** | **'sag'** | **'saga'** |
+ +------------------------------+-------------+-----------------+-----------------+-----------------------+-----------+------------+
+ | Multinomial + L2 penalty     |     yes     |       no        |       yes       |          no           |    yes    |    yes     |
+ +------------------------------+-------------+-----------------+-----------------+-----------------------+-----------+------------+
+ | OVR + L2 penalty             |     yes     |       yes       |       yes       |          yes          |    yes    |    yes     |
+ +------------------------------+-------------+-----------------+-----------------+-----------------------+-----------+------------+
+ | Multinomial + L1 penalty     |     no      |       no        |       no        |          no           |    no     |    yes     |
+ +------------------------------+-------------+-----------------+-----------------+-----------------------+-----------+------------+
+ | OVR + L1 penalty             |     no      |       yes       |       no        |          no           |    no     |    yes     |
+ +------------------------------+-------------+-----------------+-----------------+-----------------------+-----------+------------+
+ | Elastic-Net                  |     no      |       no        |       no        |          no           |    no     |    yes     |
+ +------------------------------+-------------+-----------------+-----------------+-----------------------+-----------+------------+
+ | No penalty ('none')          |     yes     |       no        |       yes       |          yes          |    yes    |    yes     |
+ +------------------------------+-------------+-----------------+-----------------+-----------------------+-----------+------------+
+ | **Behaviors**                |                                                                                                  |
+ +------------------------------+-------------+-----------------+-----------------+-----------------------+-----------+------------+
+ | Penalize the intercept (bad) |     no      |       yes       |       no        |          no           |    no     |    no      |
+ +------------------------------+-------------+-----------------+-----------------+-----------------------+-----------+------------+
+ | Faster for large datasets    |     no      |       no        |       no        |          no           |    yes    |    yes     |
+ +------------------------------+-------------+-----------------+-----------------+-----------------------+-----------+------------+
+ | Robust to unscaled datasets  |     yes     |       yes       |       yes       |          yes          |    no     |    no      |
+ +------------------------------+-------------+-----------------+-----------------+-----------------------+-----------+------------+

The "lbfgs" solver is used by default for its robustness. For large datasets
the "saga" solver is usually faster.
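
Reading aid for the revised table: the penalty rows on the "+" side of this diff can be treated as a small lookup from a penalty/multiclass setting to the solvers marked "yes". The sketch below is plain Python and does not call scikit-learn; `SOLVERS`, `SUPPORT`, `solvers_supporting` and `dense_hessian_bytes` are hypothetical names introduced only for illustration. The memory helper is a rough estimate that assumes a single dense float64 Hessian of shape `n_features` x `n_features` and ignores the intercept and any solver workspace.

```python
# Support matrix transcribed from the "+" side of the table in this diff.
# Column order follows the new header row.
SOLVERS = ("lbfgs", "liblinear", "newton-cg", "newton-cholesky", "sag", "saga")

SUPPORT = {
    "Multinomial + L2 penalty": (True, False, True, False, True, True),
    "OVR + L2 penalty": (True, True, True, True, True, True),
    "Multinomial + L1 penalty": (False, False, False, False, False, True),
    "OVR + L1 penalty": (False, True, False, False, False, True),
    "Elastic-Net": (False, False, False, False, False, True),
    "No penalty ('none')": (True, False, True, True, True, True),
}


def solvers_supporting(setting):
    """Return the solvers marked "yes" for one row of the table.

    Hypothetical helper, not part of scikit-learn.
    """
    return [name for name, ok in zip(SOLVERS, SUPPORT[setting]) if ok]


def dense_hessian_bytes(n_features, bytes_per_value=8):
    """Rough size in bytes of one dense n_features x n_features Hessian.

    Assumption: float64 entries; the intercept column and any extra
    workspace used by the solver are ignored.
    """
    return n_features * n_features * bytes_per_value


print(solvers_supporting("OVR + L1 penalty"))  # ['liblinear', 'saga']
print(dense_hessian_bytes(10_000))             # 800000000
```

Under these assumptions, 10,000 features already imply roughly 0.8 GB for the Hessian alone, which is consistent with the added paragraph recommending "newton-cholesky" when `n_samples` >> `n_features`.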