Incorrect initialization of GaussianMixture from precisions_init in the _initialize method

Describe the bug

When passing precisions_init to a GaussianMixture model, a user expects to resume training the model from the provided precision matrices, which is done by calculating the precisions_cholesky_ from precisions_init in the _initialize method and continuing EM iterations from there. However, the code is not correct in the _initialize method when the covariance_type is full or tied.

In an _m_step, the precisions_cholesky_ is calculated from the covariances_ $\Sigma$ by the _compute_precision_cholesky method. In particular, the precisions_ $\Lambda$ can decomposed as:

$$\Lambda=\Sigma^{-1}=(LL^{T})^{-1}=(L^{-1})^{T}L^{-1}=UU^{T}$$

Given the covariance matrix $\Sigma$, applying the Cholesky decomposition to it gives rise to a lower-triangular matrix $L$, and then we use back-substitution to calculate $L^{-1}$ from $L$, and finally the precisions_cholesky_ can be calculated from $U=(L^{-1})^{T}$, which is an upper-triangular matrix $U$. This is correct for calculating precisions_cholesky_ from covariances_.

However, when resuming training, the _initialize method calculates precisions_cholesky_ from precisions_init by directly conducting the Cholesky decomposition of precisions_init, which is not correct. The error can be simply verified by the fact that the resultant precisions_cholesky_ is a lower-triangular matrix which should be an upper-triangular matrix. In fact, what we need is a $UU^{T}$ decomposition. The correct math to do so is to first apply a similarity transformation to the precisions_init $\Lambda$ by an exchange matrix $J$ and then the Cholesky decomposition to the transformed $\Lambda$. In particular, the decomposition can be expressed as:

$$J\Lambda J=\tilde{L}\tilde{L}^{T}=JUJ(JUJ)^{T}$$

Finally, the precisions_cholesky_ $U$ can be calculated as $J\tilde{L}J$. It is noted that we've taken advantage of the property of $J$ that $J=J^{-1}=J^{T}$.

Steps/Code to Reproduce

import numpy as np
from sklearn.mixture i
A9F7
mport GaussianMixture
from sklearn.mixture._gaussian_mixture import (
    _estimate_gaussian_parameters,
    _compute_precision_cholesky,
)
from sklearn.utils._testing import assert_allclose

def test_gaussian_mixture_precisions_init():
    def _generate_data(n_samples, n_features, n_components):
        """Randomly generate samples and responsibilities"""
        rs = np.random.RandomState(12345)
        X = rs.random_sample((n_samples, n_features))
        resp = rs.random_sample((n_samples, n_components))
        resp /= resp.sum(axis=1)[:, np.newaxis]
        return X, resp

    def _calculate_precisions(X, resp, covariance_type):
        """Calculate precision matrix and its Cholesky decomposition"""
        reg_covar = 1e-6
        weights, means, covariances = _estimate_gaussian_parameters(
            X, resp, reg_covar, covariance_type
        )
        precisions_cholesky = _compute_precision_cholesky(covariances, covariance_type)

        _, n_components = resp.shape
        # Instantiate a `GaussianMixture` model in order to use its
        # `_set_parameters` method to compute `precisions_` from
        # `precisions_cholesky_`
        gmm = GaussianMixture(
            n_components=n_components, covariance_type=covariance_type
        )
        params = (weights, means, covariances, precisions_cholesky)
        # pylint: disable-next=protected-access
        gmm._set_parameters(params)
        return gmm.precisions_, gmm.precisions_cholesky_

    X, resp = _generate_data(n_samples=100, n_features=3, n_components=4)

    for covariance_type in ("full", "tied", "diag", "spherical"):
        # Arrange
        precisions_init, precisions_cholesky = _calculate_precisions(
            X, resp, covariance_type
        )
        desired_precisions_cholesky = precisions_cholesky

        # Act
        gmm = GaussianMixture(
            covariance_type=covariance_type, precisions_init=precisions_init
        )
        # pylint: disable-next=protected-access
        gmm._initialize(X, resp)
        actual_precisions_cholesky = gmm.precisions_cholesky_

        # Assert
        assert_allclose(actual_precisions_cholesky, desired_precisions_cholesky)

Expected Results

=================================================================== test session starts ===================================================================
platform win32 -- Python 3.10.1, pytest-7.3.1, pluggy-1.0.0 -- C:\Users\Documents\VSCode\scikit-learn\sklearn-env\Scripts\python.exe
cachedir: .pytest_cache
rootdir: C:\Users\Documents\VSCode\scikit-learn
configfile: setup.cfg
plugins: cov-4.0.0
collected 1 item

sklearn/mixture/tests/test_gaussian_mixture.py::test_gaussian_mixture_precisions_init PASSED                                                         [100%]

==================================================================== 1 passed in 0.15s ====================================================================

Actual Results

=================================================================== test session starts ===================================================================
platform win32 -- Python 3.10.1, pytest-7.3.1, pluggy-1.0.0 -- C:\Users\Documents\VSCode\scikit-learn\sklearn-env\Scripts\python.exe
cachedir: .pytest_cache
rootdir: C:\Users\Documents\VSCode\scikit-learn
configfile: setup.cfg
plugins: cov-4.0.0
collected 1 item

sklearn/mixture/tests/test_gaussian_mixture.py::test_gaussian_mixture_precisions_init FAILED                                                         [100%]

======================================================================== FAILURES =========================================================================
__________________________________________________________ test_gaussian_mixture_precisions_init __________________________________________________________

    def test_gaussian_mixture_precisions_init():
        def _generate_data(n_samples, n_features, n_components):
            """Randomly generate samples and responsibilities"""
            rs = np.random.RandomState(12345)
            X = rs.random_sample((n_samples, n_features))
            resp = rs.random_sample((n_samples, n_components))
            resp /= resp.sum(axis=1)[:, np.newaxis]
            return X, resp

        def _calculate_precisions(X, resp, covariance_type):
            """Calculate precision matrix and its Cholesky decomposition"""
            reg_covar = 1e-6
            weights, means, covariances = _estimate_gaussian_parameters(
                X, resp, reg_covar, covariance_type
            )
            precisions_cholesky = _compute_precision_cholesky(covariances, covariance_type)

            _, n_components = resp.shape
            # Instantiate a `GaussianMixture` model in order to use its
            # `_set_parameters` method to compute `precisions_` from
            # `precisions_cholesky_`
            gmm = GaussianMixture(
                n_components=n_components, covariance_type=covariance_type
            )
            params = (weights, means, covariances, precisions_cholesky)
            # pylint: disable-next=protected-access
            gmm._set_parameters(params)
            return gmm.precisions_, gmm.precisions_cholesky_

        X, resp = _generate_data(n_samples=100, n_features=3, n_components=4)

        for covariance_type in ("full", "tied", "diag", "spherical"):
            # Arrange
            precisions_init, precisions_cholesky = _calculate_precisions(
                X, resp, covariance_type
            )
            desired_precisions_cholesky = precisions_cholesky

            # Act
            gmm = GaussianMixture(
                covariance_type=covariance_type, precisions_init=precisions_init
            )
            # pylint: disable-next=protected-access
            gmm._initialize(X, resp)
            actual_precisions_cholesky = gmm.precisions_cholesky_

            # Assert
>           assert_allclose(actual_precisions_cholesky, desired_precisions_cholesky)

sklearn\mixture\tests\test_gaussian_mixture.py:1376:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
sklearn\utils\_testing.py:323: in assert_allclose
    np_assert_allclose(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

args = (<function assert_allclose.<locals>.compare at 0x000002AD2C937400>, array([[[ 3.72597723,  0.        ,  0.        ],
 ...521,  0.65580101],
        [ 0.        ,  3.59562675, -0.25208462],
        [ 0.        ,  0.        ,  3.53699227]]]))
kwds = {'equal_nan': True, 'err_msg': '', 'header': 'Not equal to tolerance rtol=1e-07, atol=0', 'verbose': True}

    @wraps(func)
    def inner(*args, **kwds):
        with self._recreate_cm():
>           return func(*args, **kwds)
E           AssertionError: 
E           Not equal to tolerance rtol=1e-07, atol=0
E
E           Mismatched elements: 36 / 36 (100%)
E           Max absolute difference: 0.88302534
E           Max relative difference: 1.
E            x: array([[[ 3.725977,  0.      ,  0.      ],
E                   [-0.595192,  3.614806,  0.      ],
E                   [ 0.841309, -0.033688,  3.448654]],...
E            y: array([[[ 3.575666, -0.563724,  0.883025],
E                   [ 0.      ,  3.659279, -0.175359],
E                   [ 0.      ,  0.      ,  3.549951]],...

C:\Program Files\Python310\lib\contextlib.py:79: AssertionError
==================================================================== 1 failed in 0.44s ====================================================================

Versions

System:
    python: 3.10.1 (tags/v3.10.1:2cd268a, Dec  6 2021, 19:10:37) [MSC v.1929 64 bit (AMD64)]
executable: C:\Users\Documents\VSCode\scikit-learn\sklearn-env\Scripts\python.exe
   machine: Windows-10-10.0.19044-SP0

Python dependencies:
      sklearn: 1.3.dev0
          pip: 21.2.4
   setuptools: 58.1.0
        numpy: 1.24.3
        scipy: 1.10.1
       Cython: 0.29.34
       pandas: None
   matplotlib: None
       joblib: 1.2.0
threadpoolctl: 3.1.0

Built with OpenMP: True

threadpoolctl info:
       user_api: blas
   internal_api: openblas
         prefix: libopenblas
       filepath: C:\Users\Documents\VSCode\scikit-learn\sklearn-env\Lib\site-packages\numpy\.libs\libopenblas64__v0.3.21-gcc_10_3_0.dll
        version: 0.3.21
threading_layer: pthreads
   architecture: SkylakeX
    num_threads: 16

       user_api: blas
   internal_api: openblas
         prefix: libopenblas
       filepath: C:\Users\Documents\VSCode\scikit-learn\sklearn-env\Lib\site-packages\scipy.libs\libopenblas-802f9ed1179cb9c9b03d67ff79f48187.dll    
        version: 0.3.18
threading_layer: pthreads
   architecture: Prescott
    num_threads: 16

       user_api: openmp
   internal_api: openmp
         prefix: vcomp
       filepath: C:\Windows\System32\vcomp140.dll
        version: None
    num_threads: 16

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Describe the bug

Steps/Code to Reproduce

Expected Results

Actual Results

Versions

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

Description

Describe the bug

Steps/Code to Reproduce

Expected Results

Actual Results

Versions

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions