10000 Incorrect initialization of `GaussianMixture` from `precisions_init` in the `_initialize` method · Issue #26415 · scikit-learn/scikit-learn · GitHub
[go: up one dir, main page]

Skip to content

Incorrect initialization of GaussianMixture from precisions_init in the _initialize method #26415

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
mchikyt3 opened this issue May 22, 2023 · 3 comments · Fixed by #26416
Closed

Comments

@mchikyt3
Copy link
Contributor
mchikyt3 commented May 22, 2023

Describe the bug

When passing precisions_init to a GaussianMixture model, a user expects to resume training the model from the provided precision matrices, which is done by calculating the precisions_cholesky_ from precisions_init in the _initialize method and continuing EM iterations from there. However, the code is not correct in the _initialize method when the covariance_type is full or tied.

In an _m_step, the precisions_cholesky_ is calculated from the covariances_ $\Sigma$ by the _compute_precision_cholesky method. In particular, the precisions_ $\Lambda$ can decomposed as:

$$\Lambda=\Sigma^{-1}=(LL^{T})^{-1}=(L^{-1})^{T}L^{-1}=UU^{T}$$

Given the covariance matrix $\Sigma$, applying the Cholesky decomposition to it gives rise to a lower-triangular matrix $L$, and then we use back-substitution to calculate $L^{-1}$ from $L$, and finally the precisions_cholesky_ can be calculated from $U=(L^{-1})^{T}$, which is an upper-triangular matrix $U$. This is correct for calculating precisions_cholesky_ from covariances_.

However, when resuming training, the _initialize method calculates precisions_cholesky_ from precisions_init by directly conducting the Cholesky decomposition of precisions_init, which is not correct. The error can be simply verified by the fact that the resultant precisions_cholesky_ is a lower-triangular matrix which should be an upper-triangular matrix. In fact, what we need is a $UU^{T}$ decomposition. The correct math to do so is to first apply a similarity transformation to the precisions_init $\Lambda$ by an exchange matrix $J$ and then the Cholesky decomposition to the transformed $\Lambda$. In particular, the decomposition can be expressed as:

$$J\Lambda J=\tilde{L}\tilde{L}^{T}=JUJ(JUJ)^{T}$$

Finally, the precisions_cholesky_ $U$ can be calculated as $J\tilde{L}J$. It is noted that we've taken advantage of the property of $J$ that $J=J^{-1}=J^{T}$.

Steps/Code to Reproduce

import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.mixture._gaussian_mixture import (
    _estimate_gaussian_parameters,
    _compute_precision_cholesky,
)
from sklearn.utils._testing import assert_allclose

def test_gaussian_mixture_precisions_init():
    def _generate_data(n_samples, n_features, n_components):
        """Randomly generate samples and responsibilities"""
        rs = np.random.RandomState(12345)
        X = rs.random_sample((n_samples, n_features))
        resp = rs.random_sample((n_samples, n_components))
        resp /= resp.sum(axis=1)[:, np.newaxis]
        return X, resp

    def _calculate_precisions(X, resp, covariance_type):
        """Calculate precision matrix and its Cholesky decomposition"""
        reg_covar = 1e-6
        weights, means, covariances = _estimate_gaussian_parameters(
            X, resp, reg_covar, covariance_type
        )
        precisions_cholesky = _compute_precision_cholesky(covariances, covariance_type)

        _, n_components = resp.shape
        # Instantiate a `GaussianMixture` model in order to use its
        # `_set_parameters` method to compute `precisions_` from
        # `precisions_cholesky_`
        gmm = GaussianMixture(
            n_components=n_components, covariance_type=covariance_type
        )
        params = (weights, means, covariances, precisions_cholesky)
        # pylint: disable-next=protected-access
        gmm._set_parameters(params)
        return gmm.precisions_, gmm.precisions_cholesky_

    X, resp = _generate_data(n_samples=100, n_features=3, n_components=4)

    for covariance_type in ("full", "tied", "diag", "spherical"):
        # Arrange
        precisions_init, precisions_cholesky = _calculate_precisions(
            X, resp, covariance_type
        )
        desired_precisions_cholesky = precisions_cholesky

        # Act
        gmm = GaussianMixture(
            covariance_type=covariance_type, precisions_init=
8000
precisions_init
        )
        # pylint: disable-next=protected-access
        gmm._initialize(X, resp)
        actual_precisions_cholesky = gmm.precisions_cholesky_

        # Assert
        assert_allclose(actual_precisions_cholesky, desired_precisions_cholesky)

Expected Results

=================================================================== test session starts ===================================================================
platform win32 -- Python 3.10.1, pytest-7.3.1, pluggy-1.0.0 -- C:\Users\Documents\VSCode\scikit-learn\sklearn-env\Scripts\python.exe
cachedir: .pytest_cache
rootdir: C:\Users\Documents\VSCode\scikit-learn
configfile: setup.cfg
plugins: cov-4.0.0
collected 1 item

sklearn/mixture/tests/test_gaussian_mixture.py::test_gaussian_mixture_precisions_init PASSED                                                         [100%]

==================================================================== 1 passed in 0.15s ====================================================================

Actual Results

=================================================================== test session starts ===================================================================
platform win32 -- Python 3.10.1, pytest-7.3.1, pluggy-1.0.0 -- C:\Users\Documents\VSCode\scikit-learn\sklearn-env\Scripts\python.exe
cachedir: .pytest_cache
rootdir: C:\Users\Documents\VSCode\scikit-learn
configfile: setup.cfg
plugins: cov-4.0.0
collected 1 item

sklearn/mixture/tests/test_gaussian_mixture.py::test_gaussian_mixture_precisions_init FAILED                                                         [100%]

======================================================================== FAILURES =========================================================================
__________________________________________________________ test_gaussian_mixture_precisions_init __________________________________________________________

    def test_gaussian_mixture_precisions_init():
        def _generate_data(n_samples, n_features, n_components):
            """Randomly generate samples and responsibilities"""
            rs = np.random.RandomState(12345)
            X = rs.random_sample((n_samples, n_features))
            resp = rs.random_sample((n_samples, n_components))
            resp /= resp.sum(axis=1)[:, np.newaxis]
            return X, resp

        def _calculate_precisions(X, resp, covariance_type):
            """Calculate precision matrix and its Cholesky decomposition"""
            reg_covar = 1e-6
            weights, means, covariances = _estimate_gaussian_parameters(
                X, resp, reg_covar, covariance_type
            )
            precisions_cholesky = _compute_precision_cholesky(covariances, covariance_type)

            _, n_components = resp.shape
            # Instantiate a `GaussianMixture` model in order to use its
            # `_set_parameters` method to compute `precisions_` from
            # `precisions_cholesky_`
            gmm = GaussianMixture(
                n_components=n_components, covariance_type=covariance_type
            )
            params = (weights, means, covariances, precisions_cholesky)
            # pylint: disable-next=protected-access
            gmm._set_parameters(params)
            return gmm.precisions_, gmm.precisions_cholesky_

        X, resp = _generate_data(n_samples=100, n_features=3, n_components=4)

        for covariance_type in ("full", "tied", "diag", "spherical"):
            # Arrange
            precisions_init, precisions_cholesky = _calculate_precisions(
                X, resp, covariance_type
            )
            desired_precisions_cholesky = precisions_cholesky

            # Act
            gmm = GaussianMixture(
                covariance_type=covariance_type, precisions_init=precisions_init
            )
            # pylint: disable-next=protected-access
            gmm._initialize(X, resp)
            actual_precisions_cholesky = gmm.precisions_cholesky_

            # Assert
>           assert_allclose(actual_precisions_cholesky, desired_precisions_cholesky)

sklearn\mixture\tests\test_gaussian_mixture.py:1376:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
sklearn\utils\_testing.py:323: in assert_allclose
    np_assert_allclose(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

args = (<function assert_allclose.<locals>.compare at 0x000002AD2C937400>, array([[[ 3.72597723,  0.        ,  0.        ],
 ...521,  0.65580101],
        [ 0.        ,  3.59562675, -0.25208462],
        [ 0.        ,  0.        ,  3.53699227]]]))
kwds = {'equal_nan': True, 'err_msg': '', 'header': 'Not equal to tolerance rtol=1e-07, atol=0', 'verbose': True}

    @wraps(func)
    def inner(*args, **kwds):
        with self._recreate_cm():
>           return func(*args, **kwds)
E           AssertionError: 
E           Not equal to tolerance rtol=1e-07, atol=0
E
E           Mismatched elements: 36 / 36 (100%)
E           Max absolute difference: 0.88302534
E           Max relative difference: 1.
E            x: array([[[ 3.725977,  0.      ,  0.      ],
E                   [-0.595192,  3.614806,  0.      ],
E                   [ 0.841309, -0.033688,  3.448654]],...
E            y: array([[[ 3.575666, -0.563724,  0.883025],
E                   [ 0.      ,  3.659279, -0.175359],
E                   [ 0.      ,  0.      ,  3.549951]],...

C:\Program Files\Python310\lib\contextlib.py:79: AssertionError
==================================================================== 1 failed in 0.44s ====================================================================

Versions

System:
    python: 3.10.1 (tags/v3.10.1:2cd268a, Dec  6 2021, 19:10:37) [MSC v.1929 64 bit (AMD64)]
executable: C:\Users\Documents\VSCode\scikit-learn\sklearn-env\Scripts\python.exe
   machine: Windows-10-10.0.19044-SP0

Python dependencies:
      sklearn: 1.3.dev0
          pip: 21.2.4
   setuptools: 58.1.0
        numpy: 1.24.3
        scipy: 1.10.1
       Cython: 0.29.34
       pandas: None
   matplotlib: None
       joblib: 1.2.0
threadpoolctl: 3.1.0

Built with OpenMP: True

threadpoolctl info:
       user_api: blas
   internal_api: openblas
         prefix: libopenblas
       filepath: C:\Users\Documents\VSCode\scikit-learn\sklearn-env\Lib\site-packages\numpy\.libs\libopenblas64__v0.3.21-gcc_10_3_0.dll
        version: 0.3.21
threading_layer: pthreads
   architecture: SkylakeX
    num_threads: 16

       user_api: blas
   internal_api: openblas
         prefix: libopenblas
       filepath: C:\Users\Documents\VSCode\scikit-learn\sklearn-env\Lib\site-packages\scipy.libs\libopenblas-802f9ed1179cb9c9b03d67ff79f48187.dll    
        version: 0.3.18
threading_layer: pthreads
   architecture: Prescott
    num_threads: 16

       user_api: openmp
   internal_api: openmp
         prefix: vcomp
       filepath: C:\Windows\System32\vcomp140.dll
        version: None
    num_threads: 16
@mchikyt3 mchikyt3 added Bug Needs Triage Issue requires triage labels May 22, 2023
@adrinjalali adrinjalali added Needs Investigation Issue requires investigation and removed Needs Triage Issue requires triage labels Jun 1, 2023
@adrinjalali
Copy link
Member

Thanks for the issue and the PR. If you could also point out at somebody else who can also help us with the review, it'd be nice.

@jjerphan
Copy link
Member
jjerphan commented Jun 1, 2023

Thanks for reporting this implementation error.

I confirm that the mathematical reformulation of the problem is correct, see #26416 (review) for some comments on its implementation.

@jjerphan jjerphan added module:gaussian_process and removed Needs Investigation Issue requires investigation labels Jun 1, 2023
@mchikyt3
Copy link
Contributor Author
mchikyt3 commented Jun 5, 2023

In addition, the description of the precisions_cholesky_ attribute is not correct in the docstring of GaussianMixture. In fact, precisions_cholesky_ is obtained from the $UU^T$ decomposition of the precisions_ for each Gaussian components. It is not the Cholesky decomposition. However, I'm not aware of a formal name for $UU^T$ decomposition.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants
0