FIX online updates in MiniBatchDictionaryLearning (#25354) · dolfly/scikit-learn@cfd428a · GitHub
Commit cfd428a

jeremiedbb and ogrisel authored
FIX online updates in MiniBatchDictionaryLearning (scikit-learn#25354)
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
1 parent d431d7e commit cfd428a

File tree

2 files changed: +23 -5 lines changed

doc/whats_new/v1.2.rst

Lines changed: 18 additions & 0 deletions
@@ -9,6 +9,19 @@ Version 1.2.1
 
 **In Development**
 
+Changed models
+--------------
+
+The following estimators and functions, when fit with the same data and
+parameters, may produce different models from the previous version. This often
+occurs due to changes in the modelling logic (bug fixes or enhancements), or in
+random sampling procedures.
+
+- |Fix| The fitted components in :class:`MiniBatchDictionaryLearning` might differ. The
+  online updates of the sufficient statistics now properly take the sizes of the batches
+  into account.
+  :pr:`25354` by :user:`Jérémie du Boisberranger <jeremiedbb>`.
+
 Changelog
 ---------
 
@@ -33,6 +46,11 @@ Changelog
 :mod:`sklearn.decomposition`
 ............................
 
+- |Fix| Fixed a bug in :class:`decomposition.MiniBatchDictionaryLearning` where the
+  online updates of the sufficient statistics were not correct when calling
+  `partial_fit` on batches of different sizes.
+  :pr:`25354` by :user:`Jérémie du Boisberranger <jeremiedbb>`.
+
 - |Fix| :class:`decomposition.DictionaryLearning` better supports readonly NumPy
   arrays. In particular, it better supports large datasets which are memory-mapped
   when it is used with coordinate descent algorithms (i.e. when `fit_algorithm='cd'`).
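The changelog entry above concerns calling `partial_fit` with batches of different sizes, which is the scenario the fix addresses. A short usage sketch of that pattern (the data and hyperparameters here are illustrative, not from the commit):

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.RandomState(0)
X = rng.randn(100, 8)

est = MiniBatchDictionaryLearning(n_components=5, random_state=0)

# Batches of different sizes: before the fix, larger batches were
# implicitly over-weighted in the running sufficient statistics.
for batch in (X[:10], X[10:40], X[40:]):
    est.partial_fit(batch)

codes = est.transform(X)  # sparse codes of shape (n_samples, n_components)
```

With the fix applied, the fitted `components_` no longer depend on how the same stream of samples happens to be chunked into batches.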

sklearn/decomposition/_dict_learning.py

Lines changed: 5 additions & 5 deletions
@@ -2053,16 +2053,16 @@ class MiniBatchDictionaryLearning(_BaseSparseCoding, BaseEstimator):
 
     We can check the level of sparsity of `X_transformed`:
 
-    >>> np.mean(X_transformed == 0)
-    0.38...
+    >>> np.mean(X_transformed == 0) < 0.5
+    True
 
     We can compare the average squared euclidean norm of the reconstruction
     error of the sparse coded signal relative to the squared euclidean norm of
     the original signal:
 
     >>> X_hat = X_transformed @ dict_learner.components_
     >>> np.mean(np.sum((X_hat - X) ** 2, axis=1) / np.sum(X ** 2, axis=1))
-    0.059...
+    0.057...
     """
 
     _parameter_constraints: dict = {
 
@@ -2196,9 +2196,9 @@ def _update_inner_stats(self, X, code, batch_size, step):
         beta = (theta + 1 - batch_size) / (theta + 1)
 
         self._A *= beta
-        self._A += code.T @ code
+        self._A += code.T @ code / batch_size
         self._B *= beta
-        self._B += X.T @ code
+        self._B += X.T @ code / batch_size
 
     def _minibatch_step(self, X, dictionary, random_state, step):
         """Perform the update on the dictionary for one minibatch."""

0 commit comments

Comments
 (0)
0