[MRG] IterativeImputer: n_iter->max_iter #13061
So that's a problem of stopping too early rather than too late! Hmm... If
it's stopping too early, that basically means that it needs more burn-in
for the imputed values to find some equilibrium? So maybe what you want is
not tol, but something more like n_iter_no_decrease.
|
Haha, now you've got me convinced. I'll try that for now - it's what you had originally: setting tol with a default of 1e-3 fixes the |
Re: benchmarks. I'll do some tomorrow (it's 9pm here). Re: convergence warning. Can't hurt. I'll find an example and stick it in here. |
Might want to have that max (which is an infty-norm so can maybe use the
scipy norm implementation) divided by the norm of the non-missing values of
the original X to be invariant to scale.
… |
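A minimal sketch of that scale-invariant check, assuming the array names used in the PR (`Xt` for the current imputed matrix, `Xt_previous` for the previous round, `X` and `mask_missing_values` for the original data); this is an illustration, not the exact implementation:

import numpy as np

def has_converged(Xt, Xt_previous, X, mask_missing_values, tol=1e-3):
    # Largest absolute change between two imputation rounds
    # (the infinity norm of the flattened difference).
    inf_norm = np.max(np.abs(Xt - Xt_previous))
    # Normalize by the largest absolute observed (non-missing) value of the
    # original X so the tolerance does not depend on the scale of the data.
    scale = np.max(np.abs(X[~mask_missing_values]))
    return inf_norm < tol * scale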
@@ -511,7 +514,7 @@ class IterativeImputer(BaseEstimator, TransformerMixin):
``feat_idx`` is the current feature to be imputed,
``neighbor_feat_idx`` is the array of other features used to impute the
current feature, and ``predictor`` is the trained predictor used for
the imputation. Length is ``self.n_features_with_missing_ * n_iter``.
the imputation. Length is ``self.n_features_with_missing_ * max_iter``.

n_features_with_missing_ : int
    Number of features with missing values.
This should be another list item rather than a note in the above attribute description
sklearn/tests/test_impute.py
Outdated
verbose=1,
random_state=rng)
X_filled_100 = imputer.fit_transform(X_missing)
assert(len(imputer.imputation_sequence_) == d * 5)
So that the code is clear, you can check `n_iter_` directly.
Can also check that tol=0 results in n_iter_ == max_iter?
and remove the outer brackets
sklearn/impute.py
Outdated
the imputation. Length is ``self.n_features_with_missing_ * n_iter``.
the imputation. Length is ``self.n_features_with_missing_ *
self.n_iter_``. ``self.n_iter_`` is the number of iteration rounds that
actually occurred, taking early stopping into account.
sklearn/impute.py
Outdated
Maximum number of imputation rounds to perform before returning the
imputations computed during the final round. A round is a single
imputation of each feature with missing values. The stopping criterion
is met once abs(max(X_i - X_{i-1})) < tol. Note that early stopping is
Put the backticks around the formulation of the stopping criterion
sklearn/impute.py
Outdated
mus_too_high = mus > self._max_value
imputed_values[mus_too_high] = self._max_value
# the rest can be sampled without statistical issues
sample_flag = positive_sigmas & ~mus_too_low & ~mus_too_high
I would use the term mask instead of flag. Maybe inrange_mask
sklearn/impute.py
Outdated
self.n_iter_ = self.max_iter
if not self.sample_posterior:
    Xt_previous = np.copy(Xt)
for i_rnd in range(self.max_iter):
I would prefer a `while` loop: `while criterion > self.tol or self.n_iter_ < self.max_iter`. The reasoning is that we already see that we will have an early stopping mechanism.
You also need `if not self.sample_posterior:` so it gets messier. I would prefer to leave it as is.
The idiom I have come to enjoy is:

for self.n_iter_ in range(1, self.max_iter + 1):
    update model
    if stopping condition is met:
        break
else:
    warnings.warn(...)
(We have something similar in label_propagation.py, but there the stopping condition applies at the start of the loop)
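As a self-contained illustration of that for/else idiom (the per-round diffs and tolerance below are toy numbers standing in for the imputer's convergence check, not the PR's actual code):

import warnings

max_iter = 5
tol = 1e-3
diffs = [0.9, 0.5, 0.2, 5e-4, 1e-4]  # pretend change between rounds

for n_iter_ in range(1, max_iter + 1):
    # ... update the model for round n_iter_ here ...
    if diffs[n_iter_ - 1] < tol:  # stopping condition met
        break
else:
    # Runs only if the loop finished without hitting `break`,
    # i.e. the stopping condition was never met.
    warnings.warn('Early stopping criterion not reached.')

print(n_iter_)  # -> 4: the loop variable keeps the round at which we stopped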
sklearn/impute.py
Outdated
for i_rnd in range(self.n_iter):
self.n_iter_ = self.max_iter
if not self.sample_posterior:
    Xt_previous = np.copy(Xt)
sklearn/impute.py
Outdated
print('[IterativeImputer] Early stopping criterion '
      'reached.')
break
else:
I don't think that's true. I need to specify that Xt_previous = Xt.copy()
otherwise the subsequent diff will be 0. Try it out.
what I mean is

if not self.sample_posterior:
    inf_norm = np.linalg.norm(Xt - Xt_previous, ord=np.inf,
                              axis=None)
    if inf_norm < normalized_tol:
        self.n_iter_ = i_rnd + 1
        if self.verbose > 0:
            print('[IterativeImputer] Early stopping criterion '
                  'reached.')
        break
    Xt_previous = Xt.copy()

You don't need else since you will break in the if branch.
Oh! I thought you meant that the entire else branch was not needed. Yup, you're right. I'll fix it.
sklearn/impute.py
Outdated
return Xt

imputations_per_round = len(self.imputation_sequence_) // self.n_iter
imps_per_round = len(self.imputation_sequence_) // self.n_iter_
Leave the full name, it is more explicit. Split the line into two lines.
sklearn/tests/test_impute.py
Outdated
verbose=1,
random_state=rng)
X_filled_5 = imputer.fit_transform(X_missing)
assert_allclose(X_filled_100, X_filled_5, atol=1e-7)
You can add an assert on `n_iter_`. The reason is that we were missing a (+1) in some estimators when reporting the number of iterations. It could be quite useful.
Sorry, what does this mean? I added this:

imputer = IterativeImputer(max_iter=100,
                           tol=0,
                           sample_posterior=False,
                           verbose=1,
                           random_state=rng)
imputer.fit(X_missing)
assert imputer.max_iter == imputer.n_iter_
Yes, it is what I meant. Invert the assert order; usually the expected value comes last.
assert imputer.n_iter_ == imputer.max_iter
@jnothman You asked for some convergence plots. I'm plotting the inf norm and the l2 norm of the change between successive imputed matrices: the inf norm is `max(abs(X_t - X_{t-1}))` and the l2 norm is `sqrt(sum((X_t - X_{t-1})**2))`. Here are two plots. Convergence is quite rapid in both cases, but there are minor improvements in the case of California, while Boston stays flat. The code that generates these is below, but you can't run it because I had to define custom class variables.
|
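The plotting code referenced above is not reproduced here. As a rough, hypothetical way to approximate such convergence curves from the outside (without the custom class variables), one can refit with increasing max_iter, using tol=0 so early stopping never triggers, and compare consecutive results; the dataset and settings below are made up for illustration:

import numpy as np
from sklearn.experimental import enable_iterative_imputer  # needed in released versions
from sklearn.impute import IterativeImputer

rng = np.random.RandomState(0)
X = rng.randn(200, 5)
X[rng.rand(*X.shape) < 0.3] = np.nan  # knock out ~30% of the entries

previous = None
for k in range(1, 11):
    imputed = IterativeImputer(max_iter=k, tol=0, sample_posterior=False,
                               random_state=0).fit_transform(X)
    if previous is not None:
        inf_norm = np.max(np.abs(imputed - previous))   # change in round k
        l2_norm = np.linalg.norm(imputed - previous)
        print(k, inf_norm, l2_norm)
    previous = imputed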
I don't think it's so weird with sample_posterior... it's maybe even
converging on a reasonably stable distribution.
|
Great. I'll say this is ready for MRG reviews then. |
sklearn/impute.py
Outdated
Maximum number of imputation rounds to perform before returning the
imputations computed during the final round. A round is a single
imputation of each feature with missing values. The stopping criterion
is met once `abs(max(X_i - X_{i-1}))/abs(max(X[known_vals]))` < tol.
you might need to clarify that X_i is X at iteration i. Might be better to use t to index time?
@jnothman I've addressed your two comments. @glemaitre perhaps you'll take a second look? |
I thought about that pattern and gave up thinking that the |
@@ -879,7 +907,7 @@ def fit_transform(self, X, y=None):
self.initial_imputer_ = None
X, Xt, mask_missing_values = self._initial_imputation(X)

if self.n_iter == 0:
if self.max_iter == 0:
    return Xt
This line is not covered by a test (as previously). Could you add this case to the unit test?
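A hypothetical sketch of what such a test could look like (the data and names are assumptions, not the committed test):

import numpy as np
from sklearn.experimental import enable_iterative_imputer  # needed in released versions
from sklearn.impute import IterativeImputer

rng = np.random.RandomState(0)
X = rng.randn(20, 3)
X[0, 0] = np.nan

# With max_iter=0, only the initial (mean) imputation should be applied.
Xt = IterativeImputer(max_iter=0, initial_strategy='mean').fit_transform(X)
assert np.isclose(Xt[0, 0], np.nanmean(X[:, 0]))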
axis=None)
if inf_norm < normalized_tol:
    if self.verbose > 0:
        print('[IterativeImputer] Early stopping criterion '
We also do not test the printing. Turning `verbose` to 1 could be enough for a couple of tests. We will ensure that printing does not trigger an error.
sklearn/impute.py
Outdated
@@ -942,10 +985,10 @@ def transform(self, X):

X, Xt, mask_missing_values = self._initial_imputation(X)

if self.n_iter == 0:
if self.n_iter_ == 0:
This is also not tested. Could we force `n_iter_` and check that we have `X == Xt`?
@glemaitre I've added some tests to cover the cases you mentioned. |
That's the error:
|
Weird this passed locally. Will investigate...
…On Fri, Feb 8, 2019, 10:36 AM Guillaume Lemaitre ***@***.*** wrote:
That's the error:
[00:15:01] ================================== FAILURES ===================================
[00:15:01] _______________________ test_iterative_imputer_verbose ________________________
[00:15:01]
[00:15:01] def test_iterative_imputer_verbose():
[00:15:01] n = 10
[00:15:01] d = 3
[00:15:01] X = sparse_random_matrix(n, d, density=0.10).toarray()
[00:15:01] imputer = IterativeImputer(missing_values=0, max_iter=1, verbose=1)
[00:15:01] > imputer.fit(X)
[00:15:01]
[00:15:01] X = array([[0., 0., 0.],
[00:15:01] [0., 0., 0.],
[00:15:01] [0., 0., 0.],
[00:15:01] [0., 0., 0.],
[00:15:01] [0., 0., 0.],
[00:15:01] [0., 0., 0.],
[00:15:01] [0., 0., 0.],
[00:15:01] [0., 0., 0.],
[00:15:01] [0., 0., 0.],
[00:15:01] [0., 0., 0.]])
[00:15:01] d = 3
[00:15:01] imputer = IterativeImputer(imputation_order='ascending', initial_strategy='mean',
[00:15:01] max_iter=1, max_value=None, m...earest_features=None, predictor=None, random_state=None,
[00:15:01] sample_posterior=False, tol=0.001, verbose=1)
[00:15:01] n = 10
[00:15:01]
[00:15:01] c:\python37-x64\lib\site-packages\sklearn\tests\test_impute.py:542:
[00:15:01] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
[00:15:01] c:\python37-x64\lib\site-packages\sklearn\impute.py:1032: in fit
[00:15:01] self.fit_transform(X)
[00:15:01] c:\python37-x64\lib\site-packages\sklearn\impute.py:930: in fit_transform
[00:15:01] normalized_tol = self.tol * np.max(np.abs(X[~mask_missing_values]))
[00:15:01] c:\python37-x64\lib\site-packages\numpy\core\fromnumeric.py:2505: in amax
[00:15:01] initial=initial)
[00:15:01] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
[00:15:01]
[00:15:01] obj = array([], dtype=float64), ufunc = <ufunc 'maximum'>, method = 'max'
[00:15:01] axis = None, dtype = None, out = None
[00:15:01] kwargs = {'initial': <no value>, 'keepdims': <no value>}, passkwargs = {}
[00:15:01]
[00:15:01] def _wrapreduction(obj, ufunc, method, axis, dtype, out, **kwargs):
[00:15:01] passkwargs = {k: v for k, v in kwargs.items()
[00:15:01] if v is not np._NoValue}
[00:15:01]
[00:15:01] if type(obj) is not mu.ndarray:
[00:15:01] try:
[00:15:01] reduction = getattr(obj, method)
[00:15:01] except AttributeError:
[00:15:01] pass
[00:15:01] else:
[00:15:01] # This branch is needed for reductions like any which don't
[00:15:01] # support a dtype.
[00:15:01] if dtype is not None:
[00:15:01] return reduction(axis=axis, dtype=dtype, out=out, **passkwargs)
[00:15:01] else:
[00:15:01] return reduction(axis=axis, out=out, **passkwargs)
[00:15:01]
[00:15:01] > return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
[00:15:01] E ValueError: zero-size array to reduction operation maximum which has no identity
[00:15:01]
[00:15:01] axis = None
[00:15:01] dtype = None
[00:15:01] kwargs = {'initial': <no value>, 'keepdims': <no value>}
[00:15:01] method = 'max'
[00:15:01] obj = array([], dtype=float64)
[00:15:01] out = None
[00:15:01] passkwargs = {}
[00:15:01] ufunc = <ufunc 'maximum'>
|
Fascinating - it accidentally made a test matrix that had no known values. I guess I should add a check for that and return without doing anything because you can't make an inference if you have 0 info? |
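A minimal sketch of such a guard, factored as a standalone helper; the names follow the traceback above and are assumptions, not the committed fix:

import numpy as np

def normalized_tolerance(X, mask_missing_values, tol):
    # tol scaled by the largest absolute observed value; returns None when
    # every entry is missing, signalling the caller to skip the iterative
    # rounds and just return the initial imputation.
    observed = np.abs(X[~mask_missing_values])
    if observed.size == 0:
        return None
    return tol * np.max(observed)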
@glemaitre I fixed that issue and made a few more very small changes to prevent other problems. Plus one more test. |
Co-Authored-By: sergeyf <sergeyfeldman@gmail.com>
Co-Authored-By: sergeyf <sergeyfeldman@gmail.com>
Thanks, applied. |
@jnothman @glemaitre I renamed |
d = 3
X = sparse_random_matrix(n, d, density=0.10).toarray()
X = sparse_random_matrix(n, d, density=0.10, random_state=rng).toarray()
imputer = IterativeImputer(missing_values=0, max_iter=1, verbose=1)
Could you put verbose at 2? I think that we have a case where we want verbose > 1.
I did this two lines down (545):
imputer.verbose = 2
imputer.transform(X)
Nevermind, I'll just do it twice.
@glemaitre Ready to merge! |
Thanks once more @sergeyf |
This PR addresses a number of discussions, most recently in #11977.
The main purpose of this PR is to provide automatic early stopping. We are going to be using the early stopping rule that is used in the missForest R package: stop as soon as the difference between the newly imputed data matrix and the previous one increases for the first time.
This is a sensible criterion, and has the added benefit of not needing to specify a `tol`. Note that this criterion is not applied when `sample_posterior=True` because there is no steady state.
This PR does a few more things:
- `sample_posterior=True` now uses `truncnormal` to sample from the posterior.
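For reference, a truncated normal draw that respects user-supplied bounds can be made with scipy's truncnorm; this is a sketch with made-up example values, not the imputer's internals, and scipy.stats.truncnorm is assumed to be the helper in question:

import numpy as np
from scipy.stats import truncnorm

mu, sigma = 0.3, 0.8    # posterior mean and std for one missing entry
low, high = 0.0, 1.0    # user-supplied min_value / max_value bounds

# scipy expects the truncation points expressed in standard-normal units.
a, b = (low - mu) / sigma, (high - mu) / sigma
sample = truncnorm.rvs(a, b, loc=mu, scale=sigma,
                       random_state=np.random.RandomState(0))
print(sample)  # a draw guaranteed to lie inside [low, high]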