DOC Give local recommendations about IterativeImputer in docstrings #23701

aperezlebel · 2022-06-20T16:06:15Z

Reference Issues/PRs

Addresses first task of #21967.

What does this implement/fix? Explain your changes.

Adapt the docstrings to give local recommendations for IterativeImputer.

Any other comments?

I refer to KNNImputer as a "multivariate imputer" in the "see also" part because it uses the observed features to impute the missing ones (rather than using only the feature being imputed, as the univariate SimpleImputer does). However, in the user guide it sits outside the "multivariate imputation" part. Is it ok to refer to KNNImputer as a multivariate imputer as I did?
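To make the univariate/multivariate distinction concrete, here is a minimal sketch on a hypothetical toy array (the data and variable names are illustrative only and are not part of the PR). It contrasts how SimpleImputer and KNNImputer fill the same missing entry.

```python
import numpy as np
from sklearn.impute import KNNImputer, SimpleImputer

# Hypothetical toy data: the second column is 10x the first; one value is missing.
X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, np.nan], [8.0, 80.0]])

# Univariate: fills the gap with the mean of the observed values of the same column.
simple_filled = SimpleImputer(strategy="mean").fit_transform(X)

# Multivariate: finds the nearest row using the other, observed feature
# and copies that neighbor's value for the missing column.
knn_filled = KNNImputer(n_neighbors=1).fit_transform(X)

print(simple_filled[2, 1], knn_filled[2, 1])
```

Here SimpleImputer ignores the first column entirely and uses the column mean, whereas KNNImputer picks the row whose first feature is closest (the row with value 2) and reuses its second feature.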

glemaitre · 2022-06-21T09:02:52Z

I assume that we can solve the SyntaxError: invalid escape sequence \m using a raw string:

r"""Multivariate ....
"""
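As a rough illustration of why the raw string helps, the sketch below compiles two hypothetical one-line sources, one with a plain docstring and one with a raw docstring, while treating warnings as errors (assumed here to mimic a strict test configuration; the actual CI setup is not shown in this thread).

```python
import warnings

# Hypothetical docstring fragments; each source contains a single backslash
# before "mathcal", which is not a valid escape in a plain string literal.
plain_src = 'DOC = """Scales in :math:`\\mathcal{O}(n)`."""'
raw_src = 'DOC = r"""Scales in :math:`\\mathcal{O}(n)`."""'


def compiles_cleanly(src):
    """Return True if src compiles when warnings are treated as errors."""
    with warnings.catch_warnings():
        warnings.simplefilter("error")
        try:
            compile(src, "<docstring-demo>", "exec")
        except (SyntaxError, Warning):
            return False
    return True


plain_ok = compiles_cleanly(plain_src)
raw_ok = compiles_cleanly(raw_src)

# The raw docstring keeps the backslash verbatim for Sphinx to render.
namespace = {}
exec(compile(raw_src, "<docstring-demo>", "exec"), namespace)
print(plain_ok, raw_ok, namespace["DOC"])
```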

sklearn/impute/_iterative.py

glemaitre · 2022-06-21T09:07:37Z

> Is it ok to refer to KNNImputer as a multivariate imputer as I did?

Yes, I think this is indeed a multivariate imputer. We should be updating the user guide accordingly in another PR.

aperezlebel · 2022-06-21T12:02:14Z

> We should be updating the user guide accordingly in another PR.

Ok, I will do this in task 6 of #21967.

For the record, I derived the complexity as follows: $\mathcal{O}(np\min(n,p))$ for a single fit of BayesianRidge, which is fitted $\mathcal{O}(kp^2)$ times inside IterativeImputer.
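Spelled out, and assuming the two bounds above simply multiply, with $k$ = `max_iter`, $n$ the number of samples and $p$ the number of features:

$$
\underbrace{\mathcal{O}(kp^2)}_{\text{number of fits}} \times \underbrace{\mathcal{O}(np\min(n,p))}_{\text{cost of one BayesianRidge fit}} = \mathcal{O}(knp^3\min(n,p))
$$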

sklearn/impute/_iterative.py

glemaitre · 2022-06-23T14:29:00Z

Trying to solve the doc issue. It seems weird to me. Let's see if the warning is still up.
We could add the information about the missing indicator as well.

sklearn/impute/_iterative.py

glemaitre · 2022-06-23T15:04:01Z

I just escaped the backslashes because I don't get where the warning is raised from.

jjerphan · 2022-06-23T19:45:23Z

@ArturoAmorQ: I think you might be interested in reviewing this PR. 🙂

ArturoAmorQ

Thanks for the PR, @aperezlebel. I am looking forward to the rest of the related PRs. For the moment, here are a couple of comments.

ArturoAmorQ · 2022-06-24T09:48:47Z

sklearn/impute/_knn.py

+    IterativeImputer : Multivariate imputer that estimates values to impute for
+        each feature with missing values from all the others.


For the sake of consistency, please make the same change in the first line of the IterativeImputer docstring and in the other references to it in sklearn/impute/_base.py.

Since it does not fit on a single line, it would not be compliant with numpydoc. I think that we can leave it as is.

ArturoAmorQ · 2022-06-24T09:56:37Z

sklearn/impute/_iterative.py

+    where :math:`k` = `max_iter`, :math:`n` the number of samples and
+    :math:`p` the number of features. It thus becomes prohibitively costly when
+    the number of features increases. Setting
+    `n_nearest_features << n_features`, `skip_complete=True` or increasing `tol`


I noticed that

`n_nearest_features` << `n_features`, `skip_complete` = `True`

was changed to

`n_nearest_features << n_features`, `skip_complete=True`

in f5a1ebd (#23701). I agree that skip_complete=True is the correct markdown as the = operator is part of the user syntax, but I am wondering if the notation n_nearest_features << n_features is more correct for such a mathematical (and not user) syntax. What do you think?

I think that this is fine. We are not super consistent in this regard. At least, the HTML rendering should be on the same line then.
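For readers who want to try the recommendation from the docstring excerpt above, here is a minimal sketch. The synthetic data, the missing rate, and the specific values chosen for `n_nearest_features`, `tol` and `max_iter` are illustrative assumptions, not values suggested in this PR.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

# Synthetic data (assumption): 200 samples, 30 features, about 10% missing.
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 30))
X[rng.uniform(size=X.shape) < 0.1] = np.nan

imputer = IterativeImputer(
    n_nearest_features=5,  # far fewer than n_features=30
    skip_complete=True,    # skip features without missing values seen during fit
    tol=1e-2,              # looser than the default stopping tolerance
    max_iter=10,
    random_state=0,
)
X_imputed = imputer.fit_transform(X)
print(X_imputed.shape)
```

Each of the three settings reduces the amount of work done per imputation round or the number of rounds, usually at some cost in imputation quality, so the values are worth tuning on real data.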

glemaitre · 2022-07-27T13:17:39Z

Thanks @aperezlebel LGTM

aperezlebel added 4 commits June 20, 2022 17:39

Add KNNImputer in "see also" part

24ef7f9

Update "see also" part for consistency between Iterative and KNN imputer

730466e

Give complexity and recommendations for IterativeImputer

7d57404

Merge remote-tracking branch 'upstream/main' into mv_docstrings

6dd2cc2

github-actions bot added the module:impute label Jun 20, 2022

glemaitre changed the title ~~Give local recommendations about IterativeImputer in docstrings~~ DOC Give local recommendations about IterativeImputer in docstrings Jun 21, 2022

github-actions bot added the Documentation label Jun 21, 2022

glemaitre reviewed Jun 21, 2022

sklearn/impute/_iterative.py (outdated, resolved)

glemaitre reviewed Jun 21, 2022

sklearn/impute/_iterative.py (outdated, resolved)

aperezlebel and others added 3 commits June 21, 2022 13:39

Update sklearn/impute/_iterative.py

f5a1ebd

Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>

Use raw string to solve failing test

9cb6153

Move complexity comment to Notes section

d18eba6

aperezlebel mentioned this pull request Jun 21, 2022

Documenting missing-values practices #21967

Open

7 tasks

glemaitre reviewed Jun 23, 2022

sklearn/impute/_iterative.py (outdated, resolved)

glemaitre added 2 commits June 23, 2022 16:28

Update sklearn/impute/_iterative.py

b67f81b

Merge branch 'main' into mv_docstrings

862ce63

glemaitre self-requested a review June 23, 2022 14:44

glemaitre reviewed Jun 23, 2022

sklearn/impute/_iterative.py (outdated, resolved)

sklearn/impute/_iterative.py (outdated, resolved)

Apply suggestions from code review

f47ada3

Update "see also" part to be consistent with scikit-learn#23714

001ff6c

ArturoAmorQ reviewed Jun 24, 2022

glemaitre self-requested a review July 27, 2022 12:44

glemaitre merged commit 8a01010 into scikit-learn:main Jul 27, 2022

glemaitre added a commit to glemaitre/scikit-learn that referenced this pull request Aug 4, 2022

DOC Give local recommendations about IterativeImputer in docstrings (scikit-learn#23701)

f6fc92b

Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>

glemaitre added a commit that referenced this pull request Aug 5, 2022

DOC Give local recommendations about IterativeImputer in docstrings (#23701)

67044ce

Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
