@@ -49,6 +49,10 @@ class LinearSVC(LinearClassifierMixin, SparseCoefMixin, BaseEstimator):
     penalties and loss functions and should scale better to large numbers of
     samples.
 
+    The main differences between :class:`~sklearn.svm.LinearSVC` and
+    :class:`~sklearn.svm.SVC` lie in the loss function used by default, and in
+    the handling of intercept regularization between those two implementations.
+
     This class supports both dense and sparse input and the multiclass support
     is handled according to a one-vs-the-rest scheme.
 
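To make the new paragraph concrete, a minimal sketch (not part of the commit; data and parameter choices are illustrative) contrasting the two estimators might look like this:

```python
# LinearSVC defaults to the squared_hinge loss and, via liblinear,
# regularizes the intercept like any other coefficient; SVC uses the
# hinge loss and libsvm leaves the intercept unpenalized. So even with
# kernel="linear" the two fits generally differ slightly.
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC, SVC

X, y = make_classification(n_samples=200, n_features=5, random_state=0)

linear_svc = LinearSVC(C=1.0, random_state=0).fit(X, y)
svc = SVC(kernel="linear", C=1.0).fit(X, y)

# Coefficients and intercepts are close but usually not identical.
print(linear_svc.coef_, linear_svc.intercept_)
print(svc.coef_, svc.intercept_)
```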
@@ -99,20 +103,26 @@ class LinearSVC(LinearClassifierMixin, SparseCoefMixin, BaseEstimator):
         will be ignored.
 
     fit_intercept : bool, default=True
-        Whether to calculate the intercept for this model. If set
-        to false, no intercept will be used in calculations
-        (i.e. data is expected to be already centered).
+        Whether or not to fit an intercept. If set to True, the feature vector
+        is extended to include an intercept term: `[x_1, ..., x_n, 1]`, where
+        1 corresponds to the intercept. If set to False, no intercept will be
+        used in calculations (i.e. data is expected to be already centered).
 
     intercept_scaling : float, default=1.0
-        When self.fit_intercept is True, instance vector x becomes
-        ``[x, self.intercept_scaling]``,
-        i.e. a "synthetic" feature with constant value equals to
-        intercept_scaling is appended to the instance vector.
-        The intercept becomes intercept_scaling * synthetic feature weight
-        Note! the synthetic feature weight is subject to l1/l2 regularization
-        as all other features.
-        To lessen the effect of regularization on synthetic feature weight
-        (and therefore on the intercept) intercept_scaling has to be increased.
+        When `fit_intercept` is True, the instance vector x becomes ``[x_1,
+        ..., x_n, intercept_scaling]``, i.e. a "synthetic" feature with a
+        constant value equal to `intercept_scaling` is appended to the instance
+        vector. The intercept becomes intercept_scaling * synthetic feature
+        weight. Note that liblinear internally penalizes the intercept,
+        treating it like any other term in the feature vector. To reduce the
+        impact of the regularization on the intercept, the `intercept_scaling`
+        parameter can be set to a value greater than 1; the higher the value of
+        `intercept_scaling`, the lower the impact of regularization on it.
+        Then, the weights become `[w_x_1, ..., w_x_n,
+        w_intercept*intercept_scaling]`, where `w_x_1, ..., w_x_n` represent
+        the feature weights and the intercept weight is scaled by
+        `intercept_scaling`. This scaling allows the intercept term to have a
+        different regularization behavior compared to the other features.
 
     class_weight : dict or 'balanced', default=None
         Set the parameter C of class i to ``class_weight[i]*C`` for
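A small sketch of the `intercept_scaling` behavior the rewritten docstring describes (illustrative values only; the `shift` just makes the intercept matter):

```python
# With heavy regularization (small C), liblinear shrinks the intercept
# along with the other weights; raising intercept_scaling lessens that
# shrinkage. The reported intercept_ already includes the scaling, i.e.
# it is w_intercept * intercept_scaling.
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=200, n_features=5, shift=5.0,
                           random_state=0)

for scaling in (1.0, 100.0):
    clf = LinearSVC(C=0.01, intercept_scaling=scaling,
                    random_state=0).fit(X, y)
    print(scaling, clf.intercept_)
```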
@@ -362,6 +372,10 @@ class LinearSVR(RegressorMixin, LinearModel):
     penalties and loss functions and should scale better to large numbers of
     samples.
 
+    The main differences between :class:`~sklearn.svm.LinearSVR` and
+    :class:`~sklearn.svm.SVR` lie in the loss function used by default, and in
+    the handling of intercept regularization between those two implementations.
+
     This class supports both dense and sparse input.
 
     Read more in the :ref:`User Guide <svm_regression>`.
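The regression side of the same contrast, as a hedged sketch (again illustrative, not part of the commit):

```python
# LinearSVR defaults to the epsilon_insensitive (L1) loss and uses
# liblinear, which regularizes the intercept; SVR solves the kernelized
# problem with libsvm and leaves the intercept unpenalized.
from sklearn.datasets import make_regression
from sklearn.svm import LinearSVR, SVR

X, y = make_regression(n_samples=200, n_features=5, noise=0.5,
                       random_state=0)

lsvr = LinearSVR(C=1.0, random_state=0).fit(X, y)
svr = SVR(kernel="linear", C=1.0).fit(X, y)

print(lsvr.coef_, lsvr.intercept_)
print(svr.coef_, svr.intercept_)
```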
@@ -389,20 +403,26 @@ class LinearSVR(RegressorMixin, LinearModel):
         loss ('squared_epsilon_insensitive') is the L2 loss.
 
     fit_intercept : bool, default=True
-        Whether to calculate the intercept for this model. If set
-        to false, no intercept will be used in calculations
-        (i.e. data is expected to be already centered).
+        Whether or not to fit an intercept. If set to True, the feature vector
+        is extended to include an intercept term: `[x_1, ..., x_n, 1]`, where
+        1 corresponds to the intercept. If set to False, no intercept will be
+        used in calculations (i.e. data is expected to be already centered).
 
     intercept_scaling : float, default=1.0
-        When self.fit_intercept is True, instance vector x becomes
-        [x, self.intercept_scaling],
-        i.e. a "synthetic" feature with constant value equals to
-        intercept_scaling is appended to the instance vector.
-        The intercept becomes intercept_scaling * synthetic feature weight
-        Note! the synthetic feature weight is subject to l1/l2 regularization
-        as all other features.
-        To lessen the effect of regularization on synthetic feature weight
-        (and therefore on the intercept) intercept_scaling has to be increased.
+        When `fit_intercept` is True, the instance vector x becomes `[x_1, ...,
+        x_n, intercept_scaling]`, i.e. a "synthetic" feature with a constant
+        value equal to `intercept_scaling` is appended to the instance vector.
+        The intercept becomes intercept_scaling * synthetic feature weight.
+        Note that liblinear internally penalizes the intercept, treating it
+        like any other term in the feature vector. To reduce the impact of the
+        regularization on the intercept, the `intercept_scaling` parameter can
+        be set to a value greater than 1; the higher the value of
+        `intercept_scaling`, the lower the impact of regularization on it.
+        Then, the weights become `[w_x_1, ..., w_x_n,
+        w_intercept*intercept_scaling]`, where `w_x_1, ..., w_x_n` represent
+        the feature weights and the intercept weight is scaled by
+        `intercept_scaling`. This scaling allows the intercept term to have a
+        different regularization behavior compared to the other features.
 
     dual : "auto" or bool, default=True
         Select the algorithm to either solve the dual or primal
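The `[x_1, ..., x_n, intercept_scaling]` augmentation in the rewritten docstring can be reproduced by hand, which makes the "synthetic feature" wording tangible. A sketch, assuming the default `intercept_scaling=1`:

```python
# fit_intercept=True with intercept_scaling=1 appends a constant column
# of ones and fits without a separate intercept, so doing the same
# augmentation manually recovers the intercept as the weight of that
# synthetic column (up to solver tolerance).
import numpy as np
from sklearn.datasets import make_regression
from sklearn.svm import LinearSVR

X, y = make_regression(n_samples=200, n_features=5, noise=0.5,
                       random_state=0)

with_intercept = LinearSVR(random_state=0).fit(X, y)

X_aug = np.hstack([X, np.ones((X.shape[0], 1))])  # synthetic feature = 1
no_intercept = LinearSVR(fit_intercept=False, random_state=0).fit(X_aug, y)

print(with_intercept.intercept_)  # fitted intercept
print(no_intercept.coef_[-1])     # weight of the synthetic column
```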
@@ -462,8 +482,8 @@ class LinearSVR(RegressorMixin, LinearModel):
         same library as this class (liblinear).
 
     SVR : Implementation of Support Vector Machine regression using libsvm:
-        the kernel can be non-linear but its SMO algorithm does not
-        scale to large number of samples as LinearSVC does.
+        the kernel can be non-linear but its SMO algorithm does not scale to
+        large number of samples as :class:`~sklearn.svm.LinearSVR` does.
 
     sklearn.linear_model.SGDRegressor : SGDRegressor can optimize the same cost
         function as LinearSVR
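For the SGDRegressor cross-reference in the See Also entry above, a rough correspondence can be sketched as follows (the `alpha ~ 1 / (C * n_samples)` mapping is an approximation, not an exact equivalence, and all values here are illustrative):

```python
# SGDRegressor can optimize the same epsilon-insensitive objective as
# LinearSVR by matching the loss and penalty; alpha roughly plays the
# role of 1 / (C * n_samples). Expect similar, not identical, fits.
from sklearn.datasets import make_regression
from sklearn.linear_model import SGDRegressor
from sklearn.svm import LinearSVR

X, y = make_regression(n_samples=500, n_features=5, noise=0.5,
                       random_state=0)

lsvr = LinearSVR(C=1.0, epsilon=0.1, random_state=0).fit(X, y)
sgd = SGDRegressor(loss="epsilon_insensitive", penalty="l2",
                   epsilon=0.1, alpha=1.0 / (1.0 * X.shape[0]),
                   random_state=0).fit(X, y)

print(lsvr.coef_)
print(sgd.coef_)
```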