[MRG+1] Ensures that partial_fit for sklearn.decomposition.Incrementa… · AishwaryaRK/scikit-learn@579ac17 · GitHub

Commit 579ac17

jrbourbeau authored and AishwaryaRK committed
[MRG+1] Ensures that partial_fit for sklearn.decomposition.IncrementalPCA uses float division (scikit-learn#9492)
* Ensures that partial_fit uses float division
* Switches to using future division for float division
* Adds non-regression test for issue scikit-learn#9489
* Updates test to remove dependence on a "known answer"
* Updates doc/whats_new.rst with entry for this PR
* Specifies bug fix is for Python 2 versions in doc/whats_new.rst
1 parent bc367a4 commit 579ac17

3 files changed, 48 additions and 1 deletion
doc/whats_new.rst

Lines changed: 23 additions & 1 deletion
@@ -11,6 +11,18 @@ Version 0.20 (under development)
 Changed models
 --------------

+The following estimators and functions, when fit with the same data and
+parameters, may produce different models from the previous version. This often
+occurs due to changes in the modelling logic (bug fixes or enhancements), or in
+random sampling procedures.
+
+- :class:`decomposition.IncrementalPCA` in Python 2 (bug fix)
+
+Details are listed in the changelog below.
+
+(While we are trying to better inform users by providing this information, we
+cannot assure that this list is complete.)
+
 Changelog
 ---------

@@ -24,6 +36,16 @@ Classifiers and regressors
   via ``n_iter_no_change``, ``validation_fraction`` and ``tol``. :issue:`7071`
   by `Raghav RV`_

+Bug fixes
+.........
+
+Decomposition, manifold learning and clustering
+
+- Fixed a bug where the ``partial_fit`` method of
+  :class:`decomposition.IncrementalPCA` used integer division instead of float
+  division on Python 2 versions. :issue:`9492` by
+  :user:`James Bourbeau <jrbourbeau>`.
+

 Version 0.19
 ============
@@ -160,7 +182,7 @@ Model selection and evaluation
   :issue:`8120` by `Neeraj Gangwar`_.

 - Added a scorer based on :class:`metrics.explained_variance_score`.
-  :issue:`9259` by `Hanmin Qin <https://github.com/qinhanmin2014>`_.
+  :issue:`9259` by `Hanmin Qin <https://github.com/qinhanmin2014>`_.

 Miscellaneous

sklearn/decomposition/incremental_pca.py

Lines changed: 1 addition & 0 deletions
@@ -4,6 +4,7 @@
 # Giorgio Patrini
 # License: BSD 3 clause

+from __future__ import division
 import numpy as np
 from scipy import linalg

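As a rough illustration of why this one-line import matters, the sketch below compares true division with floor division on a ratio of sample counts. The helper names and the expression `sqrt(n_seen * n_batch / n_total)` are assumptions chosen for illustration; the diff does not show the exact arithmetic inside `partial_fit`. The counts 5 and 7 match the row counts of the two batches in the accompanying test.

```python
import math

# On Python 2 the fix adds `from __future__ import division` at the top of the
# module so that `/` between two ints is true division. On Python 3 that is
# already the default, so floor division is written explicitly as `//` here.


def scale_true_division(n_samples_seen, n_samples):
    """Hypothetical helper: sqrt(n_seen * n_batch / n_total) with true division."""
    n_total_samples = n_samples_seen + n_samples
    return math.sqrt((n_samples_seen * n_samples) / n_total_samples)


def scale_floor_division(n_samples_seen, n_samples):
    """Same expression with floor division, mimicking Python 2 `int / int`."""
    n_total_samples = n_samples_seen + n_samples
    return math.sqrt((n_samples_seen * n_samples) // n_total_samples)


true_scale = scale_true_division(5, 7)      # sqrt(35 / 12)
floored_scale = scale_floor_division(5, 7)  # sqrt(35 // 12), i.e. sqrt(2)
print(true_scale, floored_scale)
```

The two results differ because `35 // 12` truncates to `2`, which is the kind of silent precision loss the changelog entry describes for Python 2.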
sklearn/decomposition/tests/test_incremental_pca.py

Lines changed: 24 additions & 0 deletions
@@ -273,3 +273,27 @@ def test_whitening():
         assert_almost_equal(X, Xinv_ipca, decimal=prec)
         assert_almost_equal(X, Xinv_pca, decimal=prec)
         assert_almost_equal(Xinv_pca, Xinv_ipca, decimal=prec)
+
+
+def test_incremental_pca_partial_fit_float_division():
+    # Test to ensure float division is used in all versions of Python
+    # (non-regression test for issue #9489)
+
+    rng = np.random.RandomState(0)
+    A = rng.randn(5, 3) + 2
+    B = rng.randn(7, 3) + 5
+
+    pca = IncrementalPCA(n_components=2)
+    pca.partial_fit(A)
+    # Set n_samples_seen_ to be a floating point number instead of an int
+    pca.n_samples_seen_ = float(pca.n_samples_seen_)
+    pca.partial_fit(B)
+    singular_vals_float_samples_seen = pca.singular_values_
+
+    pca2 = IncrementalPCA(n_components=2)
+    pca2.partial_fit(A)
+    pca2.partial_fit(B)
+    singular_vals_int_samples_seen = pca2.singular_values_
+
+    np.testing.assert_allclose(singular_vals_float_samples_seen,
+                               singular_vals_int_samples_seen)
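The test avoids a hard-coded "known answer" by comparing a run where `n_samples_seen_` is a float against a run where it stays an int. A minimal sketch of why that comparison catches the bug follows. `py2_style_div`, `true_div` and `ratio` are hypothetical helpers, not scikit-learn code, and the ratio itself is an assumed stand-in for the arithmetic in `partial_fit`.

```python
def py2_style_div(a, b):
    """Hypothetical helper emulating Python 2 `/` without the __future__ import."""
    if isinstance(a, int) and isinstance(b, int):
        return a // b  # int / int floors on Python 2
    return a / b       # any float operand gives true division


def true_div(a, b):
    """Division as it behaves with `from __future__ import division`."""
    return a / b


def ratio(n_samples_seen, n_samples, div):
    """Assumed stand-in for a sample-count ratio computed during partial_fit."""
    n_total = n_samples_seen + n_samples
    return div(n_samples_seen * n_samples, n_total)


# Without the fix, int and float sample counts disagree.
buggy_int = ratio(5, 7, py2_style_div)
buggy_float = ratio(5.0, 7, py2_style_div)

# With true division, both paths agree, which is what the test asserts.
fixed_int = ratio(5, 7, true_div)
fixed_float = ratio(5.0, 7, true_div)
print(buggy_int, buggy_float, fixed_int, fixed_float)
```

Under the emulated Python 2 semantics the int path floors while the float path does not, so the two runs diverge. With true division they are identical, so the assertion passes on every Python version.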
