@@ -68,8 +68,9 @@ full covariance.
* See :ref:`sphx_glr_auto_examples_mixture_plot_gmm_pdf.py` for an example on plotting the
  density estimation.

+ |details-start|
Pros and cons of class :class:`GaussianMixture`
- -----------------------------------------------
+ |details-split|

Pros
....
@@ -92,9 +93,12 @@
  components it has access to, needing held-out data
  or information theoretical criteria to decide how many components to use
  in the absence of external cues.
+ |details-end|

- Selecting the number of components in a classical Gaussian Mixture Model
- ------------------------------------------------------------------------
+
+ |details-start|
+ Selecting the number of components in a classical Gaussian Mixture model
+ |details-split|

The BIC criterion can be used to select the number of components in a Gaussian
Mixture in an efficient way. In theory, it recovers the true number of
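
A minimal sketch of the BIC-based selection described above, assuming scikit-learn's ``GaussianMixture`` API (the toy data and the candidate range of component counts are illustrative assumptions)::

    import numpy as np
    from sklearn.mixture import GaussianMixture

    # Toy data: two well separated blobs (purely illustrative).
    rng = np.random.RandomState(0)
    X = np.vstack([rng.randn(200, 2), rng.randn(200, 2) + 5])

    # Fit one model per candidate count and keep the one with the lowest BIC.
    candidates = list(range(1, 7))  # arbitrary search range
    bics = [
        GaussianMixture(n_components=k, random_state=0).fit(X).bic(X)
        for k in candidates
    ]
    best_k = candidates[int(np.argmin(bics))]
    print(best_k)  # expected to recover 2 on this toy data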
@@ -115,9 +119,12 @@ model.
  of model selection performed with classical Gaussian mixture.

.. _expectation_maximization:
+ |details-end|
+

- Estimation algorithm Expectation-maximization
- -----------------------------------------------
+ |details-start|
+ Estimation algorithm expectation-maximization
+ |details-split|

The main difficulty in learning Gaussian mixture models from unlabeled
data is that one usually doesn't know which points came from
@@ -134,9 +141,11 @@ each component of the model. Then, one tweaks the
parameters to maximize the likelihood of the data given those
assignments. Repeating this process is guaranteed to always converge
to a local optimum.
+ |details-end|

- Choice of the Initialization Method
- -----------------------------------
+ |details-start|
+ Choice of the Initialization method
+ |details-split|

There is a choice of four initialization methods (as well as inputting user defined
initial means) to generate the initial centers for the model components:
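
As a side note to the hunk above: at the API level, both the EM fit and the initialization choice reduce to constructor arguments. A minimal sketch, assuming scikit-learn's ``GaussianMixture`` (the ``'k-means++'`` value, ``n_init=3`` and the toy data are illustrative assumptions)::

    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.RandomState(0)
    X = np.vstack([rng.randn(150, 2), rng.randn(150, 2) + 4])

    # fit() runs EM; init_params selects how the initial centers are generated.
    gmm = GaussianMixture(
        n_components=2,
        init_params="k-means++",  # one of the available initialization methods
        n_init=3,                 # restart EM from several initializations, keep the best
        random_state=0,
    ).fit(X)

    print(gmm.converged_, gmm.n_iter_)  # converged to a local optimum, and in how many EM steps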
@@ -173,6 +182,8 @@ random
  using different initializations in Gaussian Mixture.

.. _bgmm:
+ |details-end|
+

Variational Bayesian Gaussian Mixture
=====================================
@@ -183,8 +194,9 @@ similar to the one defined by :class:`GaussianMixture`.

.. _variational_inference:

+ |details-start|
Estimation algorithm: variational inference
- ---------------------------------------------
+ |details-split|

Variational inference is an extension of expectation-maximization that
maximizes a lower bound on model evidence (including
@@ -281,10 +293,12 @@ from the two resulting mixtures.
  :class:`BayesianGaussianMixture` with different
  ``weight_concentration_prior_type`` for different values of the parameter
  ``weight_concentration_prior``.
+ |details-end|


+ |details-start|
Pros and cons of variational inference with :class:`BayesianGaussianMixture`
- ----------------------------------------------------------------------------
+ |details-split|

Pros
.....
@@ -323,12 +337,14 @@ Cons
  the Dirichlet process if used), and whenever there is a mismatch between
  these biases and the data it might be possible to fit better models using a
  finite mixture.
+ |details-end|


.. _dirichlet_process:

+ |details-start|
The Dirichlet Process
- ---------------------
+ |details-split|

Here we describe variational inference algorithms on Dirichlet process
mixture. The Dirichlet process is a prior probability distribution on
@@ -361,3 +377,4 @@ use, one just specifies the concentration parameter and an upper bound
on the number of mixture components (this upper bound, assuming it is
higher than the "true" number of components, affects only algorithmic
complexity, not the actual number of components used).
+ |details-end|
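
The usage pattern described above, specifying only a concentration parameter and an upper bound on the number of components, can be sketched as follows, assuming scikit-learn's ``BayesianGaussianMixture`` (the toy data and the prior value are illustrative assumptions)::

    import numpy as np
    from sklearn.mixture import BayesianGaussianMixture

    rng = np.random.RandomState(0)
    X = np.vstack([rng.randn(200, 2), rng.randn(200, 2) + 6])  # two actual clusters

    # n_components is only an upper bound; the Dirichlet process prior governs
    # how many components end up with non-negligible weight.
    bgmm = BayesianGaussianMixture(
        n_components=10,                                   # generous upper bound
        weight_concentration_prior_type="dirichlet_process",
        weight_concentration_prior=0.01,                   # small value favours fewer components
        random_state=0,
    ).fit(X)

    print(np.round(bgmm.weights_, 2))

Most entries of ``weights_`` should come out close to zero, so the effective number of components stays well below the upper bound.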