[MRG] IterativeImputer: n_iter->max_iter by sergeyf · Pull Request #13061 · scikit-learn/scikit-learn · GitHub

[MRG] IterativeImputer: n_iter->max_iter #13061


Merged

Conversation

Contributor
@sergeyf commented Jan 28, 2019

This PR addresses a number of discussions, most recently in #11977.

The main purpose of this PR is to provide automatic early stopping. We are going to be using the early stopping rule that is used in the missForest R package:

The stopping criterion γ is met as soon as the difference between the newly imputed data matrix and the previous one increases for the first time with respect to both variable types, if present.

This is a sensible criterion, and has the added benefit of not needing to specify a tol.

Note that this criterion is not applied when sample_posterior=True because there is no steady state.
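
For illustration, a minimal sketch of that stopping rule for purely continuous data; `impute_until_divergence` and `impute_once` are hypothetical names, not code from this PR, and missForest additionally tracks categorical variables, which this sketch ignores:

    import numpy as np

    def impute_until_divergence(impute_once, X_init, max_iter=10):
        # Iterate until the difference between consecutive imputed matrices
        # increases for the first time, then return the previous round's
        # result. ``impute_once`` is a hypothetical callable performing one
        # round of per-feature imputation.
        Xt_previous = X_init.copy()
        previous_diff = np.inf
        for i in range(max_iter):
            Xt = impute_once(Xt_previous)
            diff = np.linalg.norm(Xt - Xt_previous)
            if diff > previous_diff:
                return Xt_previous, i  # back off to the earlier, better imputation
            previous_diff = diff
            Xt_previous = Xt.copy()
        return Xt_previous, max_iter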

This PR does a few more things:

  1. Reorders the parameters list to be a bit more intuitive.
  2. Fixes a bug when sample_posterior=True by using truncnorm to sample from the posterior.

Contributor Author
@sergeyf commented Jan 28, 2019

@jnothman @glemaitre

Note that the new test test_iterative_imputer_clip_truncnorm may currently fail at the fit_transform line. This is because the mus are ALREADY outside of the min and max clip values, so truncnorm samples infinities.

I've pushed a proposed fix, but I don't know if it's statistically valid. Opinion appreciated.
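
For context, a minimal reproduction of that failure mode with made-up numbers (truncnorm's a/b parameters are expressed in standard deviations from loc, so a predicted mean far outside the clip range pushes both bounds into the extreme tail):

    from scipy.stats import truncnorm

    mu, sigma = 25.0, 0.1          # hypothetical predicted mean and std
    clip_min, clip_max = 0.0, 1.0  # hypothetical min_value / max_value

    # standardized truncation bounds: both end up around -250 to -240
    a = (clip_min - mu) / sigma
    b = (clip_max - mu) / sigma

    samples = truncnorm.rvs(a, b, loc=mu, scale=sigma, size=5, random_state=0)
    print(samples)  # depending on the SciPy version, this can come out as inf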

Member
@jnothman left a comment

So this early stopping criterion avoids dangerous chaos, but does not avoid unnecessary fits and possible minor perturbations when, say, the result has converged but is not about to blow out.

'reached. Using result of round %d' % i_rnd)
break
else:
Xt_previous = np.copy(Xt)
Member

I don't get why we're copying in both the if and the else here. I think maybe this one is needed if Xt keeps being updated, but the other is not.

Contributor Author

You're right. We can do this in the if branch: Xt = Xt_previous.

# stop early if difference between consecutive imputations goes up.
# if so, back off to previous imputation
if not self.sample_posterior:
norm_diff = np.linalg.norm(Xt - Xt_previous) \
Member

Use parentheses rather than \ for line continuation.

Is this an obvious definition of "the difference between imputations"?

Contributor Author

It's not obvious but it's documented here: https://academic.oup.com/bioinformatics/article/28/1/112/219101

I'll add it in a comment.
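
For reference, the continuous-variable form of that difference from the linked missForest paper, as a small sketch:

    import numpy as np

    def missforest_delta_continuous(X_new, X_old):
        # Stekhoven & Buehlmann (2012): sum of squared changes in the imputed
        # matrix, normalized by the sum of squares of the new imputation.
        return np.sum((X_new - X_old) ** 2) / np.sum(X_new ** 2)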

imputed_values[~good_sigmas] = mus[~good_sigmas]
mus = mus[good_sigmas]
sigmas = sigmas[good_sigmas]
# two types of problems: (1) non-positive sigmas, (2) mus outside
Contributor Author

@jnothman @benlawson This is my proposed solution to the problem of having mus outside of the legal min_value and max_value. However, after implementing this, now this test fails: test_iterative_imputer_truncated_normal_posterior. So something is off here...

Contributor Author

OK, what's happening is that the first prediction via mus, sigmas = predictor.predict(X_test, return_std=True) is outside of the min/max so it never gets to the truncated draw, and thus the imputations from imputations = np.array([imputer.transform(X)[0][0] for _ in range(1000)]) are all 0.5.

Contributor Author

I changed the random seed, which allows the first draw to be between the min/max bounds, and thus results in the test passing. I also turned the 1000 draws down to 100, as it seemed unnecessarily slow.

Contributor Author
@sergeyf commented Jan 29, 2019

I agree that "this early stopping criterion avoids dangerous chaos, but does not avoid unnecessary fits and possible minor perturbations when, say, the result has converged but is not about to blow out", but I haven't seen that kind of issue anywhere. All of the current examples now stop earlier. Given that it's in use by another popular imputation package, that I haven't seen empirical evidence of what you're describing, and that it conveniently has no tol to tune - I think it's a good approach.

Contributor Author
@sergeyf commented Jan 29, 2019

One argument against this early stopping criterion is that it broke the impute.rst test. It used to be:

    >>> import numpy as np
    >>> from sklearn.impute import IterativeImputer
    >>> imp = IterativeImputer(max_iter=10, random_state=0)
    >>> imp.fit([[1, 2], [3, 6], [4, 8], [np.nan, 3], [7, np.nan]])  # doctest: +NORMALIZE_WHITESPACE
    IterativeImputer(imputation_order='ascending', initial_strategy='mean',
                     max_iter=10, max_value=None, min_value=None,
                     missing_values=nan, n_nearest_features=None, predictor=None,
                     random_state=0, sample_posterior=False, verbose=0)
    >>> X_test = [[np.nan, 2], [6, np.nan], [np.nan, 6]]
    >>> # the model learns that the second feature is double the first
    >>> print(np.round(imp.transform(X_test)))
    [[ 1.  2.]
     [ 6. 12.]
     [ 3.  6.]]

But now, on my machine, early stopping happens after the 2nd iteration and you get:

    [[1. 2.]
     [6. 7.]
     [4. 6.]]

That is clearly wrong given the training data. OK, I'm open to considering a tol-based solution again!

Member
@jnothman commented Jan 29, 2019 via email

Member
@jnothman left a comment

Do you think we should issue a ConvergenceWarning if we reach max_iter, or is it not really important for imputation?

I hope to give this a bigger look later/tomorrow.

@@ -511,7 +514,7 @@ class IterativeImputer(BaseEstimator, TransformerMixin):
         ``feat_idx`` is the current feature to be imputed,
         ``neighbor_feat_idx`` is the array of other features used to impute the
         current feature, and ``predictor`` is the trained predictor used for
-        the imputation. Length is ``self.n_features_with_missing_ * n_iter``.
+        the imputation. Length is ``self.n_features_with_missing_ * max_iter``.

     n_features_with_missing_ : int
         Number of features with missing values.
Member

Please document n_iter_

Member

This should be another list item rather than a note in the above attribute description

@@ -511,7 +514,7 @@ class IterativeImputer(BaseEstimator, TransformerMixin):
         ``feat_idx`` is the current feature to be imputed,
         ``neighbor_feat_idx`` is the array of other features used to impute the
         current feature, and ``predictor`` is the trained predictor used for
-        the imputation. Length is ``self.n_features_with_missing_ * n_iter``.
+        the imputation. Length is ``self.n_features_with_missing_ * max_iter``.
Member

max_iter -> n_iter_

Member
@jnothman commented Jan 29, 2019 via email

Contributor Author
@sergeyf commented Jan 29, 2019

Haha, now you've got me convinced about tol =)

I'll try that for now - it's what you had originally: max(abs(X - Xprev)) < tol

Setting tol as a default of 1e-3 fixes the impute.rst example immediately. Let's see if all the tests pass as well.
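
For reference, a sketch of the check this translates into (the expressions mirror snippets quoted elsewhere in this thread, but this is not the exact merged code):

    import numpy as np

    def has_converged(Xt, Xt_previous, X, mask_missing_values, tol=1e-3):
        # scale tol by the magnitude of the observed values, then compare it
        # to the infinity norm of the change between consecutive rounds
        normalized_tol = tol * np.max(np.abs(X[~mask_missing_values]))
        inf_norm = np.linalg.norm(Xt - Xt_previous, ord=np.inf, axis=None)
        return inf_norm < normalized_tol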

Contributor Author
@sergeyf commented Jan 29, 2019

Re: benchmarks. I'll do some tomorrow (it's 9pm here).

Re: convergence warning. Can't hurt. I'll find an example and stick it in here.

Member
@jnothman commented Jan 29, 2019 via email

Member
@jnothman left a comment

It would be nice if the max_iter change here was pulled out from the rest..?

@@ -511,7 +514,7 @@ class IterativeImputer(BaseEstimator, TransformerMixin):
         ``feat_idx`` is the current feature to be imputed,
         ``neighbor_feat_idx`` is the array of other features used to impute the
         current feature, and ``predictor`` is the trained predictor used for
-        the imputation. Length is ``self.n_features_with_missing_ * n_iter``.
+        the imputation. Length is ``self.n_features_with_missing_ * max_iter``.

     n_features_with_missing_ : int
         Number of features with missing values.
Member

This should be another list item rather than a note in the above attribute description

verbose=1,
random_state=rng)
X_filled_100 = imputer.fit_transform(X_missing)
assert(len(imputer.imputation_sequence_) == d * 5)
Member

So that the code is clear you can check n_iter_ directly

Can also check that tol=0 results in n_iter_ == max_iter?

Member

and remove the outer brackets

-        the imputation. Length is ``self.n_features_with_missing_ * n_iter``.
+        the imputation. Length is ``self.n_features_with_missing_ *
+        self.n_iter_``. ``self.n_iter_`` is the number of iteration rounds that
+        actually occurred, taking early stopping into account.
Member

remove actually

Maximum number of imputation rounds to perform before returning the
imputations computed during the final round. A round is a single
imputation of each feature with missing values. The stopping criterion
is met once abs(max(X_i - X_{i-1})) < tol. Note that early stopping is
Member

Put the backticks around the formulation of the stopping criterion

mus_too_high = mus > self._max_value
imputed_values[mus_too_high] = self._max_value
# the rest can be sampled without statistical issues
sample_flag = positive_sigmas & ~mus_too_low & ~mus_too_high
Member

I would use the term mask instead of flag. Maybe inrange_mask
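
A sketch of how the masked handling might read with that naming (the variable names follow the surrounding diff; the standalone helper itself is only illustrative):

    import numpy as np
    from scipy.stats import truncnorm

    def sample_clipped_posterior(mus, sigmas, min_value, max_value, random_state):
        imputed_values = np.empty_like(mus)
        positive_sigmas = sigmas > 0
        imputed_values[~positive_sigmas] = mus[~positive_sigmas]
        mus_too_low = mus < min_value
        imputed_values[mus_too_low] = min_value
        mus_too_high = mus > max_value
        imputed_values[mus_too_high] = max_value
        # only draw truncated-normal samples where the mean is inside the range
        inrange_mask = positive_sigmas & ~mus_too_low & ~mus_too_high
        mus_in, sigmas_in = mus[inrange_mask], sigmas[inrange_mask]
        a = (min_value - mus_in) / sigmas_in
        b = (max_value - mus_in) / sigmas_in
        imputed_values[inrange_mask] = truncnorm.rvs(
            a, b, loc=mus_in, scale=sigmas_in, random_state=random_state)
        return imputed_values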

self.n_iter_ = self.max_iter
if not self.sample_posterior:
Xt_previous = np.copy(Xt)
for i_rnd in range(self.max_iter):
Member

I would prefer a while loop

while criterion > self.tol or self.n_iter_ < self.max_iter

The reasoning is that we already see that we will have an early stopping mechanism

Contributor Author

You also need if not self.sample_posterior: so it gets messier. I would prefer to leave as is.

Member

The idiom I have come to enjoy is:

for self.n_iter_ in range(1, self.max_iter + 1):
    update model

    if stopping condition is met:
        break
else:
    warnings.warn(...)

(We have something similar in label_propagation.py, but there the stopping condition applies at the start of the loop)
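
To make the for...else idiom concrete, a self-contained toy sketch (the update rule and the warning message are illustrative, not this PR's code):

    import warnings
    from sklearn.exceptions import ConvergenceWarning

    max_iter, tol = 100, 1e-6
    x = 10.0
    for n_iter_ in range(1, max_iter + 1):
        x_new = 0.5 * x            # toy update standing in for one imputation round
        if abs(x_new - x) < tol:   # stopping condition checked after the update
            break
        x = x_new
    else:
        # only runs if the loop finished without ever hitting ``break``
        warnings.warn('max_iter reached without meeting the stopping criterion.',
                      ConvergenceWarning)
    print(n_iter_, x)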

for i_rnd in range(self.n_iter):
self.n_iter_ = self.max_iter
if not self.sample_posterior:
Xt_previous = np.copy(Xt)
Member

np.copy(Xt) -> Xt.copy()

print('[IterativeImputer] Early stopping criterion '
'reached.')
break
else:
Member

you can remove the else:

Contributor Author

I don't think that's true. I need to specify that Xt_previous = Xt.copy() otherwise the subsequent diff will be 0. Try it out.

Member

what I mean is

            if not self.sample_posterior:
                inf_norm = np.linalg.norm(Xt - Xt_previous, ord=np.inf,
                                          axis=None)
                if inf_norm < normalized_tol:
                    self.n_iter_ = i_rnd + 1
                    if self.verbose > 0:
                        print('[IterativeImputer] Early stopping criterion '
                              'reached.')
                    break
                Xt_previous = Xt.copy()

You don't need else since you will break in the if branch

Contributor Author

Oh! I thought you meant that the entire else branch was not needed. Yup, you're right. I'll fix it.

             return Xt

-        imputations_per_round = len(self.imputation_sequence_) // self.n_iter
+        imps_per_round = len(self.imputation_sequence_) // self.n_iter_
Member

Keep the full name; it is more explicit. Split the line into two lines.

verbose=1,
random_state=rng)
X_filled_5 = imputer.fit_transform(X_missing)
assert_allclose(X_filled_100, X_filled_5, atol=1e-7)
Member

You can add an assert on n_iter_. The reason is that we were missing a (+1) in some estimators when reporting the number of iterations. It could be quite useful.

Contributor Author

Sorry, what does this mean? I added this:

    imputer = IterativeImputer(max_iter=100,
                               tol=0,
                               sample_posterior=False,
                               verbose=1,
                               random_state=rng)
    imputer.fit(X_missing)
    assert imputer.max_iter == imputer.n_iter_

Member

Yes, that is what I meant. Invert the assert order; usually the expected value comes last:
assert imputer.n_iter_ == imputer.max_iter

Contributor Author
@sergeyf commented Jan 31, 2019

@jnothman You asked for some convergence plots. I'm plotting inf norm and l2 norm

inf norm is np.linalg.norm(Xt - Xt_previous, ord=np.inf) / np.max(np.abs(X[~mask_missing_values]))

l2 norm is np.linalg.norm(Xt - Xt_previous, ord='fro') / np.linalg.norm(Xt, ord='fro')

Here are two plots. Convergence is quite rapid in both cases, but there are minor improvements in the case of California, while Boston stays flat.

[Two plots: IterativeImputer convergence (infinity norm and L2 norm of X_i - X_{i-1}) for the Boston and California Housing data.]

The code that generates these is below, but you can't run it because I had to define custom class variables.

import numpy as np

from sklearn.datasets import load_boston, fetch_california_housing
from sklearn.impute import IterativeImputer

import matplotlib.pyplot as plt


def plot_convergence(X, dataname=''):
    X_missing = X.copy()
    X_missing[np.random.randn(*X.shape) < 0.5] = np.nan
    
    imputer = IterativeImputer(max_iter=25, tol=0).fit(X_missing)
    
    plt.figure(figsize=(12, 6))
    plt.title('IterativeImputer Convergence for %s Data' % dataname)
    plt.semilogy(imputer.tol_inf)
    plt.semilogy(imputer.tol_l2)
    plt.legend(['Infinity norm of X_i - X_{i-1}', 'L2 norm of X_i - X_{i-1}'])
    plt.show()

X, y = load_boston(return_X_y=True)
plot_convergence(X, 'Boston')

X, y = fetch_california_housing(return_X_y=True)
plot_convergence(X, 'California Housing')
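
If you want to reproduce a similar plot without those custom attributes, a rough, runnable approximation is to refit with an increasing max_iter and compare consecutive results; the dataset, missingness fraction, and the experimental import (needed on released scikit-learn versions, not on this PR's branch) are my assumptions:

    import numpy as np
    from sklearn.experimental import enable_iterative_imputer  # noqa: F401
    from sklearn.impute import IterativeImputer
    from sklearn.datasets import fetch_california_housing

    X, _ = fetch_california_housing(return_X_y=True)
    rng = np.random.RandomState(0)
    X_missing = X.copy()
    X_missing[rng.rand(*X.shape) < 0.3] = np.nan  # illustrative missingness

    previous = None
    for i in range(1, 11):
        # deterministic trajectory: the first i rounds are identical for every fit
        Xt = IterativeImputer(max_iter=i, tol=0, sample_posterior=False,
                              random_state=0).fit_transform(X_missing)
        if previous is not None:
            change = np.linalg.norm(Xt - previous, ord=np.inf, axis=None)
            print('round %d: inf norm of change = %.3g' % (i, change))
        previous = Xt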

Contributor Author
@sergeyf commented Jan 31, 2019

Also, I know we aren't doing convergence when sample_posterior=True, but I plotted it for that case anyway just to see what's going on. It's weird:
[Two plots: the same convergence curves with sample_posterior=True.]

Member
@jnothman commented Feb 4, 2019 via email

Contributor Author
@sergeyf commented Feb 4, 2019

Great. I'll say this is ready for MRG reviews then.

@sergeyf sergeyf changed the title [WIP] IterativeImputer: n_iter->max_iter [MRG] IterativeImputer: n_iter->max_iter Feb 4, 2019
Member
@jnothman left a comment

This looks very good otherwise!

Maximum number of imputation rounds to perform before returning the
imputations computed during the final round. A round is a single
imputation of each feature with missing values. The stopping criterion
is met once `abs(max(X_i - X_{i-1}))/abs(max(X[known_vals]))` < tol.
Member

you might need to clarify that X_i is X at iteration i. Might be better to use t to index time?

self.n_iter_ = self.max_iter
if not self.sample_posterior:
Xt_previous = np.copy(Xt)
for i_rnd in range(self.max_iter):
Member

The idiom I have come to enjoy is:

for self.n_iter_ in range(1, self.max_iter + 1):
    update model

    if stopping condition is met:
        break
else:
    warnings.warn(...)

(We have something similar in label_propagation.py, but there the stopping condition applies at the start of the loop)

Contributor Author
@sergeyf commented Feb 6, 2019

@jnothman I've addressed your two comments. @glemaitre perhaps you'll take a second look?

@glemaitre
Member

The idiom I have come to enjoy is:

I thought about that pattern but gave up, thinking that the else with the for is not so usual. Since somebody else thought of it, I'm happy to introduce it :)

Member
@glemaitre left a comment

All good. Just three little tests for coverage.

@@ -879,7 +907,7 @@ def fit_transform(self, X, y=None):
         self.initial_imputer_ = None
         X, Xt, mask_missing_values = self._initial_imputation(X)

-        if self.n_iter == 0:
+        if self.max_iter == 0:
             return Xt
Member

This line is not covered by a test (as previously). Could you add this case to the unit test?
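
A sketch of what such a test could look like (the test name and data are illustrative; the experimental import is only needed on released scikit-learn versions):

    import numpy as np
    from numpy.testing import assert_allclose
    from sklearn.experimental import enable_iterative_imputer  # noqa: F401
    from sklearn.impute import SimpleImputer, IterativeImputer

    def test_iterative_imputer_zero_max_iter_sketch():
        rng = np.random.RandomState(0)
        X = rng.rand(20, 4)
        X[rng.rand(20, 4) < 0.3] = np.nan
        # with max_iter=0 the result should just be the initial mean imputation
        result = IterativeImputer(max_iter=0, random_state=0).fit_transform(X)
        expected = SimpleImputer(strategy='mean').fit_transform(X)
        assert_allclose(result, expected)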

axis=None)
if inf_norm < normalized_tol:
if self.verbose > 0:
print('[IterativeImputer] Early stopping criterion '
Member

We also do not test the printing. Turning verbose to 1 could be enough for a couple of tests. That would ensure that printing does not trigger an error.

@@ -942,10 +985,10 @@ def transform(self, X):

         X, Xt, mask_missing_values = self._initial_imputation(X)

-        if self.n_iter == 0:
+        if self.n_iter_ == 0:
Member

This is also not tested. Could we force n_iter_ and check that we have X == Xt?

Contributor Author
@sergeyf commented Feb 8, 2019

@glemaitre I've added some tests to cover the cases you mentioned.

@glemaitre
Member

That's the error:

[00:15:01] ================================== FAILURES ===================================
[00:15:01] _______________________ test_iterative_imputer_verbose ________________________
[00:15:01] 
[00:15:01]     def test_iterative_imputer_verbose():
[00:15:01]         n = 10
[00:15:01]         d = 3
[00:15:01]         X = sparse_random_matrix(n, d, density=0.10).toarray()
[00:15:01]         imputer = IterativeImputer(missing_values=0, max_iter=1, verbose=1)
[00:15:01] >       imputer.fit(X)
[00:15:01] 
[00:15:01] X          = array([[0., 0., 0.],
[00:15:01]        [0., 0., 0.],
[00:15:01]        [0., 0., 0.],
[00:15:01]        [0., 0., 0.],
[00:15:01]        [0., 0., 0.],
[00:15:01]        [0., 0., 0.],
[00:15:01]        [0., 0., 0.],
[00:15:01]        [0., 0., 0.],
[00:15:01]        [0., 0., 0.],
[00:15:01]        [0., 0., 0.]])
[00:15:01] d          = 3
[00:15:01] imputer    = IterativeImputer(imputation_order='ascending', initial_strategy='mean',
[00:15:01]                  max_iter=1, max_value=None, m...earest_features=None, predictor=None, random_state=None,
[00:15:01]                  sample_posterior=False, tol=0.001, verbose=1)
[00:15:01] n          = 10
[00:15:01] 
[00:15:01] c:\python37-x64\lib\site-packages\sklearn\tests\test_impute.py:542: 
[00:15:01] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
[00:15:01] c:\python37-x64\lib\site-packages\sklearn\impute.py:1032: in fit
[00:15:01]     self.fit_transform(X)
[00:15:01] c:\python37-x64\lib\site-packages\sklearn\impute.py:930: in fit_transform
[00:15:01]     normalized_tol = self.tol * np.max(np.abs(X[~mask_missing_values]))
[00:15:01] c:\python37-x64\lib\site-packages\numpy\core\fromnumeric.py:2505: in amax
[00:15:01]     initial=initial)
[00:15:01] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
[00:15:01] 
[00:15:01] obj = array([], dtype=float64), ufunc = <ufunc 'maximum'>, method = 'max'
[00:15:01] axis = None, dtype = None, out = None
[00:15:01] kwargs = {'initial': <no value>, 'keepdims': <no value>}, passkwargs = {}
[00:15:01] 
[00:15:01]     def _wrapreduction(obj, ufunc, method, axis, dtype, out, **kwargs):
[00:15:01]         passkwargs = {k: v for k, v in kwargs.items()
[00:15:01]                       if v is not np._NoValue}
[00:15:01]     
[00:15:01]         if type(obj) is not mu.ndarray:
[00:15:01]             try:
[00:15:01]                 reduction = getattr(obj, method)
[00:15:01]             except AttributeError:
[00:15:01]                 pass
[00:15:01]             else:
[00:15:01]                 # This branch is needed for reductions like any which don't
[00:15:01]                 # support a dtype.
[00:15:01]                 if dtype is not None:
[00:15:01]                     return reduction(axis=axis, dtype=dtype, out=out, **passkwargs)
[00:15:01]                 else:
[00:15:01]                     return reduction(axis=axis, out=out, **passkwargs)
[00:15:01]     
[00:15:01] >       return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
[00:15:01] E       ValueError: zero-size array to reduction operation maximum which has no identity
[00:15:01] 
[00:15:01] axis       = None
[00:15:01] dtype      = None
[00:15:01] kwargs     = {'initial': <no value>, 'keepdims': <no value>}
[00:15:01] method     = 'max'
[00:15:01] obj        = array([], dtype=float64)
[00:15:01] out        = None
[00:15:01] passkwargs = {}
[00:15:01] ufunc      = <ufunc 'maximum'>

Contributor Author
@sergeyf commented Feb 8, 2019 via email

Contributor Author
@sergeyf commented Feb 8, 2019

Fascinating - it accidentally made a test matrix that had no known values. I guess I should add a check for that and return without doing anything, because you can't make an inference with zero information?
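
A minimal reproduction of that situation, plus the kind of guard being discussed (the guard is illustrative, not necessarily the exact fix that was pushed):

    import numpy as np

    X = np.full((10, 3), np.nan)          # every entry is "missing"
    mask_missing_values = np.isnan(X)

    observed = np.abs(X[~mask_missing_values])  # empty array
    # np.max(observed) would raise the "zero-size array to reduction operation"
    # error from the traceback above, so guard before computing normalized_tol
    if observed.size == 0:
        normalized_tol = 0.0  # nothing observed: skip the iterative rounds
    else:
        normalized_tol = 1e-3 * np.max(observed)  # tol * max absolute observed value
    print(normalized_tol)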

Contributor Author
@sergeyf commented Feb 8, 2019

@glemaitre I fixed that issue and made a few more very small changes to prevent other problems. Plus one more test.

jnothman and others added 2 commits February 10, 2019 15:49
Co-Authored-By: sergeyf <sergeyfeldman@gmail.com>
Co-Authored-By: sergeyf <sergeyfeldman@gmail.com>
Contributor Author
@sergeyf commented Feb 10, 2019

Thanks, applied.

Contributor Author
@sergeyf commented Feb 12, 2019

@jnothman @glemaitre I renamed predictor to estimator. Nothing else is happening in the latest commit. I'm still waiting on approval from @glemaitre.

Member
@glemaitre left a comment

Only one nitpick, otherwise LGTM.

     d = 3
-    X = sparse_random_matrix(n, d, density=0.10).toarray()
+    X = sparse_random_matrix(n, d, density=0.10, random_state=rng).toarray()
     imputer = IterativeImputer(missing_values=0, max_iter=1, verbose=1)
Member

Could you set verbose to 2? I think we want to have a case where verbose > 1.

Contributor Author

I did this two lines down (545):

imputer.verbose = 2
imputer.transform(X)

Contributor Author

Nevermind, I'll just do it twice.

Contributor Author
@sergeyf commented Feb 12, 2019

@glemaitre Ready to merge!

@jnothman jnothman merged commit 92e7316 into scikit-learn:iterativeimputer Feb 12, 2019
@jnothman
Member

Thanks once more @sergeyf
