add_indicator switch in imputers · Issue #11886 · scikit-learn/scikit-learn · GitHub

add_indicator switch in imputers #11886

Closed
jnothman opened this issue Aug 21, 2018 · 6 comments · Fixed by #12583
Labels
Easy (Well-defined and straightforward way to resolve), Enhancement
Milestone

Comments

@jnothman
Member

For whatever imputers we have, but especially SimpleImputer, we should have an add_indicator parameter, which simply stacks a MissingIndicator transform onto the output of the imputer's transform.
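As a rough illustration of the proposed behaviour, the sketch below stacks the output of `MissingIndicator` onto the output of `SimpleImputer` by hand. It is not the implementation that was eventually merged; the toy array `X` and the choice of `strategy="mean"` are assumptions made only for this example.

```python
import numpy as np
from sklearn.impute import MissingIndicator, SimpleImputer

# Toy data (assumed for illustration): both columns contain a missing value.
X = np.array([[1.0, np.nan],
              [np.nan, 4.0],
              [5.0, 6.0]])

# Impute as usual.
imputed = SimpleImputer(strategy="mean").fit_transform(X)

# Boolean mask with one column per feature that had missing values at fit time
# (MissingIndicator's default is features="missing-only").
mask = MissingIndicator().fit_transform(X)

# "Stacking" here means appending the mask columns to the imputed features.
stacked = np.hstack([imputed, mask.astype(float)])
print(stacked)
```

The first two columns of `stacked` hold the imputed features and the last two flag which entries were originally missing.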

@jnothman added the Easy, Enhancement, and help wanted labels on Aug 21, 2018
@jnothman
Member Author

This allows downstream models to adjust for the fact that a value was imputed, rather than observed.

@amueller added this to the 0.21 milestone on Aug 22, 2018
@prathusha94

Can I take this up if no one else is working on it yet @jnothman ?

@jnothman
Member Author

Go for it

@datajanko
Contributor

@prathusha94 are you still working on this?

@jnothman
Member Author
jnothman commented Feb 12, 2019

I've realised that if we add a parameter append to MissingIndicator which stacks the indicator matrix into X, its output can simply be piped into any imputer. It will potentially enhance iterative imputation, too, but could affect KNN imputation in arbitrary ways.
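The alternative described here could look roughly like the sketch below. `MissingIndicator` has no `append` parameter in this example; the helper `append_indicator` is a hypothetical stand-in for it, and the toy data and mean strategy are likewise assumptions. `features="all"` is used so the helper emits one mask column per input feature regardless of which features happen to have missing values.

```python
import numpy as np
from sklearn.impute import MissingIndicator, SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer


def append_indicator(X):
    """Hypothetical stand-in for the proposed MissingIndicator(append=True)."""
    mask = MissingIndicator(features="all").fit_transform(X)
    # Keep the original features (NaNs included) and append the mask columns.
    return np.hstack([X, mask.astype(float)])


# Toy data (assumed for illustration).
X = np.array([[1.0, np.nan],
              [np.nan, 4.0],
              [5.0, 6.0]])

# The indicator step runs first; its output is then piped into the imputer.
pipe = make_pipeline(
    FunctionTransformer(append_indicator, validate=False),
    SimpleImputer(strategy="mean"),
)
out = pipe.fit_transform(X)
print(out)
```

With a column-wise imputer such as `SimpleImputer`, the appended mask columns contain no missing values and pass through unchanged. An imputer that uses other columns to fill each feature would also see the mask columns as inputs.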

@jnothman
Member Author

@sergeyf points out that for IterativeImputer, stacking an indicator before passing X in will lead the estimator to learn bad imputation functions, so... maybe that's not such a good idea.

5 participants