@@ -68,8 +68,9 @@ full covariance.
* See :ref:`sphx_glr_auto_examples_mixture_plot_gmm_pdf.py` for an example on plotting the
  density estimation.

+ |details-start|
Pros and cons of class :class:`GaussianMixture`
- -----------------------------------------------
+ |details-split|

Pros
....
@@ -92,9 +93,12 @@
  components it has access to, needing held-out data
  or information theoretical criteria to decide how many components to use
  in the absence of external cues.
+ |details-end|

- Selecting the number of components in a classical Gaussian Mixture Model
- ------------------------------------------------------------------------
+
+ |details-start|
+ Selecting the number of components in a classical Gaussian Mixture model
+ |details-split|

The BIC criterion can be used to select the number of components in a Gaussian
Mixture in an efficient way. In theory, it recovers the true number of
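
A minimal sketch of the BIC-based selection described above, assuming scikit-learn's ``GaussianMixture`` API (the toy data and the candidate range of component counts are illustrative assumptions)::

    import numpy as np
    from sklearn.mixture import GaussianMixture

    # Toy data: two well separated blobs (purely illustrative).
    rng = np.random.RandomState(0)
    X = np.vstack([rng.randn(200, 2), rng.randn(200, 2) + 5])

    # Fit one model per candidate count and keep the one with the lowest BIC.
    candidates = list(range(1, 7))  # arbitrary search range
    bics = [
        GaussianMixture(n_components=k, random_state=0).fit(X).bic(X)
        for k in candidates
    ]
    best_k = candidates[int(np.argmin(bics))]
    print(best_k)  # expected to recover 2 on this toy data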
@@ -115,9 +119,12 @@ model.
  of model selection performed with classical Gaussian mixture.

.. _expectation_maximization:
+ |details-end|
+

- Estimation algorithm Expectation-maximization
- -----------------------------------------------
+ |details-start|
+ Estimation algorithm expectation-maximization
+ |details-split|

The main difficulty in learning Gaussian mixture models from unlabeled
data is that one usually doesn't know which points came from
@@ -134,9 +141,11 @@ each component of the model. Then, one tweaks the
parameters to maximize the likelihood of the data given those
assignments. Repeating this process is guaranteed to always converge
to a local optimum.
+ |details-end|

- Choice of the Initialization Method
- -----------------------------------
+ |details-start|
+ Choice of the Initialization method
+ |details-split|

There is a choice of four initialization methods (as well as inputting user defined
initial means) to generate the initial centers for the model components:
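
As a side note to the hunk above: at the API level, both the EM fit and the initialization choice reduce to constructor arguments. A minimal sketch, assuming scikit-learn's ``GaussianMixture`` (the ``'k-means++'`` value, ``n_init=3`` and the toy data are illustrative assumptions)::

    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.RandomState(0)
    X = np.vstack([rng.randn(150, 2), rng.randn(150, 2) + 4])

    # fit() runs EM; init_params selects how the initial centers are generated.
    gmm = GaussianMixture(
        n_components=2,
        init_params="k-means++",  # one of the available initialization methods
        n_init=3,                 # restart EM from several initializations, keep the best
        random_state=0,
    ).fit(X)

    print(gmm.converged_, gmm.n_iter_)  # converged to a local optimum, and in how many EM steps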
@@ -173,6 +182,8 @@ random
  using different initializations in Gaussian Mixture.

.. _bgmm:
+ |details-end|
+

Variational Bayesian Gaussian Mixture
=====================================
@@ -183,8 +194,9 @@ similar to the one defined by :class:`GaussianMixture`.

.. _variational_inference:

+ |details-start|
Estimation algorithm: variational inference
- ---------------------------------------------
+ |details-split|

Variational inference is an extension of expectation-maximization that
maximizes a lower bound on model evidence (including
@@ -281,10 +293,12 @@ from the two resulting mixtures.
  :class:`BayesianGaussianMixture` with different
  ``weight_concentration_prior_type`` for different values of the parameter
  ``weight_concentration_prior``.
+ |details-end|


+ |details-start|
Pros and cons of variational inference with :class:`BayesianGaussianMixture`
- ----------------------------------------------------------------------------
+ |details-split|

Pros
.....
@@ -323,12 +337,14 @@ Cons
  the Dirichlet process if used), and whenever there is a mismatch between
  these biases and the data it might be possible to fit better models using a
  finite mixture.
+ |details-end|


.. _dirichlet_process:

+ |details-start|
The Dirichlet Process
- ---------------------
+ |details-split|

Here we describe variational inference algorithms on Dirichlet process
mixture. The Dirichlet process is a prior probability distribution on
@@ -361,3 +377,4 @@ use, one just specifies the concentration parameter and an upper bound
on the number of mixture components (this upper bound, assuming it is
higher than the "true" number of components, affects only algorithmic
complexity, not the actual number of components used).
+ |details-end|
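
The usage pattern described above, specifying only a concentration parameter and an upper bound on the number of components, can be sketched as follows, assuming scikit-learn's ``BayesianGaussianMixture`` (the toy data and the prior value are illustrative assumptions)::

    import numpy as np
    from sklearn.mixture import BayesianGaussianMixture

    rng = np.random.RandomState(0)
    X = np.vstack([rng.randn(200, 2), rng.randn(200, 2) + 6])  # two actual clusters

    # n_components is only an upper bound; the Dirichlet process prior governs
    # how many components end up with non-negligible weight.
    bgmm = BayesianGaussianMixture(
        n_components=10,                                   # generous upper bound
        weight_concentration_prior_type="dirichlet_process",
        weight_concentration_prior=0.01,                   # small value favours fewer components
        random_state=0,
    ).fit(X)

    print(np.round(bgmm.weights_, 2))

Most entries of ``weights_`` should come out close to zero, so the effective number of components stays well below the upper bound.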