fix: move directives out of dropdown, make pros cons a topic · scikit-learn/scikit-learn@f8132ca · GitHub


Commit f8132ca

fix: move directives out of dropdown, make pros cons a topic
1 parent 4288734 commit f8132ca

File tree

1 file changed, +45 -49 lines changed


doc/modules/mixture.rst

Lines changed: 45 additions & 49 deletions
@@ -72,27 +72,25 @@ full covariance.
 **Pros and cons of class GaussianMixture**
 |details-split|
 
-Pros
-....
+.. topic:: Pros:
 
-:Speed: It is the fastest algorithm for learning mixture models
+  :Speed: It is the fastest algorithm for learning mixture models
 
-:Agnostic: As this algorithm maximizes only the likelihood, it
-  will not bias the means towards zero, or bias the cluster sizes to
-  have specific structures that might or might not apply.
+  :Agnostic: As this algorithm maximizes only the likelihood, it
+    will not bias the means towards zero, or bias the cluster sizes to
+    have specific structures that might or might not apply.
 
-Cons
-....
+.. topic:: Cons:
 
-:Singularities: When one has insufficiently many points per
-  mixture, estimating the covariance matrices becomes difficult,
-  and the algorithm is known to diverge and find solutions with
-  infinite likelihood unless one regularizes the covariances artificially.
+  :Singularities: When one has insufficiently many points per
+    mixture, estimating the covariance matrices becomes difficult,
+    and the algorithm is known to diverge and find solutions with
+    infinite likelihood unless one regularizes the covariances artificially.
 
-:Number of components: This algorithm will always use all the
-  components it has access to, needing held-out data
-  or information theoretical criteria to decide how many components to use
-  in the absence of external cues.
+  :Number of components: This algorithm will always use all the
+    components it has access to, needing held-out data
+    or information theoretical criteria to decide how many components to use
+    in the absence of external cues.
 
 |details-end|
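The two cons above map onto concrete options of the estimator. As an illustrative sketch only (not part of this commit or of mixture.rst, with invented toy data), the existing ``reg_covar`` parameter regularizes the covariances against the singularities described, and the ``bic`` method provides an information criterion for choosing ``n_components``:

    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.RandomState(0)
    # Toy data: two well-separated 2-D blobs (illustrative only).
    X = np.vstack([rng.randn(200, 2), rng.randn(200, 2) + 5])

    # ``reg_covar`` adds a small constant to the covariance diagonals,
    # guarding against the degenerate, infinite-likelihood solutions
    # mentioned under "Singularities".
    gm = GaussianMixture(n_components=2, reg_covar=1e-6, random_state=0).fit(X)

    # With no external cue for the number of components, compare an
    # information criterion (here BIC, lower is better) across candidates.
    bic = {k: GaussianMixture(n_components=k, random_state=0).fit(X).bic(X)
           for k in range(1, 6)}
    print(min(bic, key=bic.get))  # 2 is expected for this toy data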

@@ -119,10 +117,10 @@ model.
 * See :ref:`sphx_glr_auto_examples_mixture_plot_gmm_selection.py` for an example
   of model selection performed with classical Gaussian mixture.
 
-.. _expectation_maximization:
-
 |details-end|
 
+.. _expectation_maximization:
+
 |details-start|
 **Estimation algorithm expectation-maximization**
 |details-split|
@@ -183,10 +181,10 @@ random
 * See :ref:`sphx_glr_auto_examples_mixture_plot_gmm_init.py` for an example of
   using different initializations in Gaussian Mixture.
 
-.. _bgmm:
-
 |details-end|
 
+.. _bgmm:
+
 Variational Bayesian Gaussian Mixture
 =====================================
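The linked plot_gmm_init example compares initialization strategies; a minimal sketch of the same idea (not part of this change, with made-up data) uses the existing ``init_params`` option of GaussianMixture:

    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.RandomState(0)
    X = np.vstack([rng.randn(150, 2), rng.randn(150, 2) + 4])

    # The k-means-based default initialization versus a purely random one;
    # iteration count and the reached lower bound typically differ.
    for init in ("kmeans", "random"):
        gm = GaussianMixture(n_components=2, init_params=init,
                             n_init=1, random_state=0).fit(X)
        print(init, gm.n_iter_, gm.lower_bound_)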

@@ -298,43 +296,41 @@ from the two resulting mixtures.
 **Pros and cons of variational inference with BayesianGaussianMixture**
 |details-split|
 
-Pros
-.....
+.. topic:: Pros:
 
-:Automatic selection: when ``weight_concentration_prior`` is small enough and
-  ``n_components`` is larger than what is found necessary by the model, the
-  Variational Bayesian mixture model has a natural tendency to set some mixture
-  weights values close to zero. This makes it possible to let the model choose
-  a suitable number of effective components automatically. Only an upper bound
-  of this number needs to be provided. Note however that the "ideal" number of
-  active components is very application specific and is typically ill-defined
-  in a data exploration setting.
+  :Automatic selection: when ``weight_concentration_prior`` is small enough and
+    ``n_components`` is larger than what is found necessary by the model, the
+    Variational Bayesian mixture model has a natural tendency to set some mixture
+    weights values close to zero. This makes it possible to let the model choose
+    a suitable number of effective components automatically. Only an upper bound
+    of this number needs to be provided. Note however that the "ideal" number of
+    active components is very application specific and is typically ill-defined
+    in a data exploration setting.
 
-:Less sensitivity to the number of parameters: unlike finite models, which will
-  almost always use all components as much as they can, and hence will produce
-  wildly different solutions for different numbers of components, the
-  variational inference with a Dirichlet process prior
-  (``weight_concentration_prior_type='dirichlet_process'``) won't change much
-  with changes to the parameters, leading to more stability and less tuning.
+  :Less sensitivity to the number of parameters: unlike finite models, which will
+    almost always use all components as much as they can, and hence will produce
+    wildly different solutions for different numbers of components, the
+    variational inference with a Dirichlet process prior
+    (``weight_concentration_prior_type='dirichlet_process'``) won't change much
+    with changes to the parameters, leading to more stability and less tuning.
 
-:Regularization: due to the incorporation of prior information,
-  variational solutions have less pathological special cases than
-  expectation-maximization solutions.
+  :Regularization: due to the incorporation of prior information,
+    variational solutions have less pathological special cases than
+    expectation-maximization solutions.
 
 
-Cons
-.....
+.. topic:: Cons:
 
-:Speed: the extra parametrization necessary for variational inference makes
-  inference slower, although not by much.
+  :Speed: the extra parametrization necessary for variational inference makes
+    inference slower, although not by much.
 
-:Hyperparameters: this algorithm needs an extra hyperparameter
-  that might need experimental tuning via cross-validation.
+  :Hyperparameters: this algorithm needs an extra hyperparameter
+    that might need experimental tuning via cross-validation.
 
-:Bias: there are many implicit biases in the inference algorithms (and also in
-  the Dirichlet process if used), and whenever there is a mismatch between
-  these biases and the data it might be possible to fit better models using a
-  finite mixture.
+  :Bias: there are many implicit biases in the inference algorithms (and also in
+    the Dirichlet process if used), and whenever there is a mismatch between
+    these biases and the data it might be possible to fit better models using a
+    finite mixture.
 
 |details-end|
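The "Automatic selection" behaviour described above can be observed directly on the estimator. The snippet below is a hedged sketch rather than part of this commit: it assumes only the documented ``n_components``, ``weight_concentration_prior`` and ``weight_concentration_prior_type`` parameters plus the fitted ``weights_`` attribute, and the toy data is invented:

    import numpy as np
    from sklearn.mixture import BayesianGaussianMixture

    rng = np.random.RandomState(0)
    # Data drawn from only two components.
    X = np.vstack([rng.randn(300, 2), rng.randn(300, 2) + 6])

    bgm = BayesianGaussianMixture(
        n_components=10,                  # deliberately generous upper bound
        weight_concentration_prior=1e-3,  # small prior favours sparse weights
        weight_concentration_prior_type="dirichlet_process",
        max_iter=500,
        random_state=0,
    ).fit(X)

    # Most of the ten weights collapse towards zero; roughly two remain large.
    print(np.round(bgm.weights_, 3))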

0 commit comments