@@ -72,27 +72,25 @@ full covariance.
**Pros and cons of class GaussianMixture**
|details-split|

- Pros
- ....
+ .. topic:: Pros:

- :Speed: It is the fastest algorithm for learning mixture models
+ :Speed: It is the fastest algorithm for learning mixture models

- :Agnostic: As this algorithm maximizes only the likelihood, it
- will not bias the means towards zero, or bias the cluster sizes to
- have specific structures that might or might not apply.
+ :Agnostic: As this algorithm maximizes only the likelihood, it
+ will not bias the means towards zero, or bias the cluster sizes to
+ have specific structures that might or might not apply.

- Cons
- ....
+ .. topic:: Cons:

- :Singularities: When one has insufficiently many points per
- mixture, estimating the covariance matrices becomes difficult,
- and the algorithm is known to diverge and find solutions with
- infinite likelihood unless one regularizes the covariances artificially.
+ :Singularities: When one has insufficiently many points per
+ mixture, estimating the covariance matrices becomes difficult,
+ and the algorithm is known to diverge and find solutions with
+ infinite likelihood unless one regularizes the covariances artificially.

- :Number of components: This algorithm will always use all the
- components it has access to, needing held-out data
- or information theoretical criteria to decide how many components to use
- in the absence of external cues.
+ :Number of components: This algorithm will always use all the
+ components it has access to, needing held-out data
+ or information theoretical criteria to decide how many components to use
+ in the absence of external cues.

|details-end|

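The ``:Singularities:`` caveat in the hunk above is what covariance regularization guards against in practice. A minimal sketch, not taken from the patch itself, assuming the ``reg_covar`` parameter of :class:`GaussianMixture`::

    # Hedged sketch: reg_covar (assumed parameter) adds a small constant to the
    # diagonal of every estimated covariance matrix, keeping EM from collapsing a
    # component onto a single point with infinite likelihood.
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.RandomState(0)
    # Deliberately few points per component, so covariance estimates are fragile.
    X = np.vstack([rng.randn(5, 2) + 3, rng.randn(5, 2) - 3])

    gmm = GaussianMixture(
        n_components=2,
        covariance_type="full",
        reg_covar=1e-4,  # larger than the usual tiny default, for illustration
        random_state=0,
    ).fit(X)
    print(gmm.means_)
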
@@ -119,10 +117,10 @@ model.
* See :ref:`sphx_glr_auto_examples_mixture_plot_gmm_selection.py` for an example
  of model selection performed with classical Gaussian mixture.

- .. _expectation_maximization:
-
|details-end|

+ .. _expectation_maximization:
+
|details-start|
**Estimation algorithm expectation-maximization**
|details-split|
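The model-selection example referenced in the hunk above, like the ``:Number of components:`` caveat earlier, boils down to scoring candidate component counts with an information criterion. A hedged sketch, assuming the ``bic`` method of :class:`GaussianMixture`::

    # Hedged sketch: fit a mixture for several candidate component counts and
    # keep the one with the lowest BIC (an information-theoretic criterion).
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.RandomState(0)
    X = np.vstack([rng.randn(200, 2), rng.randn(200, 2) + 5])  # two true clusters

    bic = {
        k: GaussianMixture(n_components=k, random_state=0).fit(X).bic(X)
        for k in range(1, 7)
    }
    best_k = min(bic, key=bic.get)
    print(best_k, bic[best_k])
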
@@ -183,10 +181,10 @@ random
* See :ref:`sphx_glr_auto_examples_mixture_plot_gmm_init.py` for an example of
  using different initializations in Gaussian Mixture.

- .. _bgmm:
-
|details-end|

+ .. _bgmm:
+
Variational Bayesian Gaussian Mixture
=====================================

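For the initialization example referenced in the bullet of the hunk above, the relevant knobs are the initialization strategy and the number of restarts. A small sketch, assuming the ``init_params`` and ``n_init`` parameters of :class:`GaussianMixture` (the ``'k-means++'`` and ``'random_from_data'`` options exist only in recent scikit-learn releases)::

    # Hedged sketch: compare how different (assumed) initialization strategies
    # affect the log-likelihood lower bound reached by EM on the same data.
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.RandomState(0)
    X = np.vstack([rng.randn(100, 2), rng.randn(100, 2) + 4])

    for init in ("kmeans", "k-means++", "random", "random_from_data"):
        gmm = GaussianMixture(
            n_components=2, init_params=init, n_init=3, random_state=0
        ).fit(X)
        print(init, gmm.lower_bound_)
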
@@ -298,43 +296,41 @@ from the two resulting mixtures.
**Pros and cons of variational inference with BayesianGaussianMixture**
|details-split|

- Pros
- .....
+ .. topic:: Pros:

- :Automatic selection: when ``weight_concentration_prior`` is small enough and
- ``n_components`` is larger than what is found necessary by the model, the
- Variational Bayesian mixture model has a natural tendency to set some mixture
- weights values close to zero. This makes it possible to let the model choose
- a suitable number of effective components automatically. Only an upper bound
- of this number needs to be provided. Note however that the "ideal" number of
- active components is very application specific and is typically ill-defined
- in a data exploration setting.
+ :Automatic selection: when ``weight_concentration_prior`` is small enough and
+ ``n_components`` is larger than what is found necessary by the model, the
+ Variational Bayesian mixture model has a natural tendency to set some mixture
+ weights values close to zero. This makes it possible to let the model choose
+ a suitable number of effective components automatically. Only an upper bound
+ of this number needs to be provided. Note however that the "ideal" number of
+ active components is very application specific and is typically ill-defined
+ in a data exploration setting.

- :Less sensitivity to the number of parameters: unlike finite models, which will
- almost always use all components as much as they can, and hence will produce
- wildly different solutions for different numbers of components, the
- variational inference with a Dirichlet process prior
- (``weight_concentration_prior_type='dirichlet_process'``) won't change much
- with changes to the parameters, leading to more stability and less tuning.
+ :Less sensitivity to the number of parameters: unlike finite models, which will
+ almost always use all components as much as they can, and hence will produce
+ wildly different solutions for different numbers of components, the
+ variational inference with a Dirichlet process prior
+ (``weight_concentration_prior_type='dirichlet_process'``) won't change much
+ with changes to the parameters, leading to more stability and less tuning.

- :Regularization: due to the incorporation of prior information,
- variational solutions have less pathological special cases than
- expectation-maximization solutions.
+ :Regularization: due to the incorporation of prior information,
+ variational solutions have less pathological special cases than
+ expectation-maximization solutions.


- Cons
- .....
+ .. topic:: Cons:

- :Speed: the extra parametrization necessary for variational inference makes
- inference slower, although not by much.
+ :Speed: the extra parametrization necessary for variational inference makes
+ inference slower, although not by much.

- :Hyperparameters: this algorithm needs an extra hyperparameter
- that might need experimental tuning via cross-validation.
+ :Hyperparameters: this algorithm needs an extra hyperparameter
+ that might need experimental tuning via cross-validation.

- :Bias: there are many implicit biases in the inference algorithms (and also in
- the Dirichlet process if used), and whenever there is a mismatch between
- these biases and the data it might be possible to fit better models using a
- finite mixture.
+ :Bias: there are many implicit biases in the inference algorithms (and also in
+ the Dirichlet process if used), and whenever there is a mismatch between
+ these biases and the data it might be possible to fit better models using a
+ finite mixture.

|details-end|

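The ``:Automatic selection:`` behaviour described in the hunk above can be seen directly by inspecting the fitted weights. A hedged sketch, assuming :class:`BayesianGaussianMixture` accepts the parameters named in the text::

    # Hedged sketch: give the model a deliberately generous n_components upper
    # bound and a small weight_concentration_prior; unneeded components should
    # end up with weights close to zero.
    import numpy as np
    from sklearn.mixture import BayesianGaussianMixture

    rng = np.random.RandomState(0)
    X = np.vstack([rng.randn(300, 2), rng.randn(300, 2) + 6])  # two true clusters

    bgmm = BayesianGaussianMixture(
        n_components=10,  # upper bound only
        weight_concentration_prior_type="dirichlet_process",
        weight_concentration_prior=1e-2,
        max_iter=500,
        random_state=0,
    ).fit(X)
    print(bgmm.weights_.round(3))
    print("effective components:", int(np.sum(bgmm.weights_ > 1e-2)))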