-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
DOC Give local recommendations about IterativeImputer in docstrings #23701
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I assume that we can solve the r"""Multivariate ....
""" |
Yes, I think this is indeed a multivariate imputer. We should be updating the user guide accordingly in another PR. |
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Ok, I will do this in task 6 of #21967. For the record, I derived the complexity as follows: |
Trying to solve the doc issue. It seems weird to me. Let see if the warning is still up. |
I just escaped the backslashes because I don't get where the warning is raised from. |
@ArturoAmorQ: I think you might be interested in reviewing this PR. 🙂 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR, @aperezlebel, I am looking forward for the rest of the related PRs. For the moment here are a couple of comments.
IterativeImputer : Multivariate imputer that estimates values to impute for | ||
each feature with missing values from all the others. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the sake of consistency, please make the same change in the first line of the IterativeImputer
docstring and other references to it in sklearn/impute/_base.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it does not fit on a single line, it will not be compliant with numpydoc
. I think that we can let it as-is
where :math:`k` = `max_iter`, :math:`n` the number of samples and | ||
:math:`p` the number of features. It thus becomes prohibitively costly when | ||
the number of features increases. Setting | ||
`n_nearest_features << n_features`, `skip_complete=True` or increasing `tol` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed that
`n_nearest_features` << `n_features`, `skip_complete` = `True`
was changed to
`n_nearest_features << n_features`, `skip_complete=True`
in f5a1ebd
(#23701). I agree that skip_complete=True
is the correct markdown as the =
operator is part of the user syntax, but I am wondering if the notation n_nearest_features
<< n_features
is more correct for such a mathematical (and not user) syntax. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that this is fine. We are not super consistent in this regard. At least, the HTML rendering should be on the same line then.
Thanks @aperezlebel LGTM |
…cikit-learn#23701) Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
…23701) Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Reference Issues/PRs
Addresses first task of #21967.
What does this implement/fix? Explain your changes.
Adapt the docstrings to give local recommendations for IterativeImputer.
Any other comments?
I refer to KNNImputer as a "multivariate imputer" in the "see also" part because it uses observed features to impute the missing ones (and does not use only the missing ones as does the univariate SimpleImputer). However, in the user guide it is outside of the "multivariate imputation" part. Is it ok to refer to KNNImputer as a multivariate imputer as I did?