8000 DOC Add dropdowns to Module 1.5 SGD (#26647) · scikit-learn/scikit-learn@bdf36ea · GitHub

Commit bdf36ea

Tech-Netiums authored and jeremiedbb committed
DOC Add dropdowns to Module 1.5 SGD (#26647)
1 parent b1485a1 commit bdf36ea

File tree: 1 file changed, +12 -0 lines changed

doc/modules/sgd.rst

Lines changed: 12 additions & 0 deletions
@@ -249,6 +249,10 @@ quadratic in the number of samples.
 with a large number of training samples (> 10,000) for which the SGD
 variant can be several orders of magnitude faster.

+|details-start|
+**Mathematical details**
+|details-split|
+
 Its implementation is based on the implementation of the stochastic
 gradient descent. Indeed, the original optimization problem of the One-Class
 SVM is given by
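For context, and not part of this commit's diff: the primal problem that "is given by" introduces in the full file is the standard One-Class SVM formulation (Schölkopf et al.), which in the file's RST notation reads

.. math::

    \begin{aligned}
    \min_{w, \rho, \xi} \quad & \frac{1}{2}\Vert w \Vert^2 - \rho + \frac{1}{\nu n} \sum_{i=1}^n \xi_i \\
    \text{s.t.} \quad & \langle w, x_i \rangle \geq \rho - \xi_i \quad 1 \leq i \leq n \\
    & \xi_i \geq 0 \quad 1 \leq i \leq n
    \end{aligned}

where :math:`\nu \in (0, 1]` controls the trade-off between the margin and the fraction of training errors.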
@@ -282,6 +286,8 @@ This is similar to the optimization problems studied in section
 being the L2 norm. We just need to add the term :math:`b\nu` in the
 optimization loop.

+|details-end|
+
 As :class:`SGDClassifier` and :class:`SGDRegressor`, :class:`SGDOneClassSVM`
 supports averaged SGD. Averaging can be enabled by setting ``average=True``.

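As an illustrative aside, not part of the commit: a minimal sketch of the ``average=True`` option the hunk above mentions, in the doctest style sgd.rst itself uses; the toy data and parameter values are assumptions for illustration.

    >>> import numpy as np
    >>> from sklearn.linear_model import SGDOneClassSVM
    >>> rng = np.random.RandomState(42)
    >>> X = 0.3 * rng.randn(100, 2)  # toy inliers clustered near the origin
    >>> clf = SGDOneClassSVM(nu=0.1, average=True, random_state=42)
    >>> clf.fit(X)
    SGDOneClassSVM(average=True, nu=0.1, random_state=42)
    >>> pred = clf.predict(X)  # +1 for predicted inliers, -1 for outliers

With ``average=True``, the coefficients exposed after fitting are the average of the SGD iterates rather than the final iterate, which typically reduces the variance of the solution.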
@@ -410,6 +416,10 @@ where :math:`L` is a loss function that measures model (mis)fit and
 complexity; :math:`\alpha > 0` is a non-negative hyperparameter that controls
 the regularization strength.

+|details-start|
+**Loss functions details**
+|details-split|
+
 Different choices for :math:`L` entail different classifiers or regressors:

 - Hinge (soft-margin): equivalent to Support Vector Classification.
@@ -431,6 +441,8 @@ Different choices for :math:`L` entail different classifiers or regressors:
 - Epsilon-Insensitive: (soft-margin) equivalent to Support Vector Regression.
   :math:`L(y_i, f(x_i)) = \max(0, |y_i - f(x_i)| - \varepsilon)`.

+|details-end|
+
 All of the above loss functions can be regarded as an upper bound on the
 misclassification error (Zero-one loss) as shown in the Figure below.

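As another illustrative aside, not part of the diff: selecting these losses through the public API, with toy data assumed for the example. ``loss="hinge"`` gives the linear-SVC-like classifier and ``loss="epsilon_insensitive"`` the linear-SVR-like regressor described in the list above.

    >>> from sklearn.linear_model import SGDClassifier, SGDRegressor
    >>> X = [[0.0, 0.0], [1.0, 1.0]]
    >>> # Hinge loss: soft-margin, equivalent to linear Support Vector Classification.
    >>> clf = SGDClassifier(loss="hinge").fit(X, [0, 1])
    >>> # Epsilon-insensitive loss: residuals smaller than epsilon are not penalized,
    >>> # matching L(y_i, f(x_i)) = max(0, |y_i - f(x_i)| - epsilon).
    >>> reg = SGDRegressor(loss="epsilon_insensitive", epsilon=0.2).fit(X, [0.0, 1.0])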
0 commit comments