Exchanging Boston for california dataset in plot missing values #16513
Maybe the first paragraph could mention the list of strategies investigated in this example:
- imputation by the constant value 0
- imputation by the mean value of each feature, combined with a missingness indicator auxiliary variable
- k-nearest-neighbor imputation
- iterative imputation
But otherwise LGTM. Thanks very much.
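For readers following the thread, the four strategies listed above can be sketched roughly as follows. This is only an illustration on a tiny made-up array, not the code from the example under review; the dictionary keys, `n_neighbors=2`, and `random_state=0` are assumptions chosen for the sketch.

```python
import numpy as np
# IterativeImputer is still experimental and needs this explicit opt-in import.
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer, KNNImputer, SimpleImputer

# Tiny illustrative matrix (made-up values); each column has one missing entry.
X = np.array([
    [1.0, 2.0, np.nan],
    [3.0, np.nan, 6.0],
    [5.0, 6.0, 9.0],
    [np.nan, 8.0, 12.0],
    [9.0, 10.0, 15.0],
])

# One imputer per strategy named in the review comment.
imputers = {
    "zero": SimpleImputer(strategy="constant", fill_value=0),
    # add_indicator=True appends one binary column per feature that had
    # missing values at fit time.
    "mean + indicator": SimpleImputer(strategy="mean", add_indicator=True),
    "knn": KNNImputer(n_neighbors=2),
    "iterative": IterativeImputer(random_state=0),
}

imputed = {name: imp.fit_transform(X) for name, imp in imputers.items()}
for name, X_imp in imputed.items():
    print(f"{name}: shape={X_imp.shape}, any NaN left={np.isnan(X_imp).any()}")
```

With the indicator enabled, the output gains extra columns (here three, one per feature with missing values), which is why its shape differs from the other strategies.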
The https://85-171681175-gh.circle-artifacts.com/0/doc/auto_examples/impute/plot_missing_values.html
@scikit-learn/core-devs time to merge this one? Thanks!
In a follow-up PR, it would be nice to rename get_scores_for_imputer into get_mse_for_imputer and do the negation in the function. This way, we would not need to run:
mses_diabetes = mses_diabetes * -1
mses_california = mses_california * -1
before plotting, which can be a little confusing.
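A rough sketch of what such a helper might look like is given here, assuming the scores come from `cross_val_score` with `scoring="neg_mean_squared_error"` (which returns negated MSE values). The function name follows the suggestion above; the regressor, the `cv` default, and the synthetic data are assumptions made for illustration and are not taken from the example itself.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline


def get_mse_for_imputer(imputer, X_missing, y, cv=3):
    """Return per-fold MSE (non-negative) for an imputer + regressor pipeline."""
    # Hypothetical regressor choice, kept small so the sketch runs quickly.
    regressor = RandomForestRegressor(n_estimators=10, random_state=0)
    estimator = make_pipeline(imputer, regressor)
    neg_mse = cross_val_score(
        estimator, X_missing, y, scoring="neg_mean_squared_error", cv=cv
    )
    # Negate once here so callers never need a `* -1` before plotting.
    return -neg_mse


# Synthetic data (made up) with roughly 20% of entries set to NaN.
rng = np.random.RandomState(0)
X = rng.normal(size=(60, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + rng.normal(scale=0.1, size=60)
X_missing = X.copy()
X_missing[rng.rand(*X.shape) < 0.2] = np.nan

mses = get_mse_for_imputer(SimpleImputer(strategy="mean"), X_missing, y)
print(f"MSE: {mses.mean():.3f} +/- {mses.std():.3f}")
```

Doing the sign flip inside the helper keeps the plotting code free of the `* -1` lines and makes the returned values match the function's name.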
…16513) * first few comments * added new california dataset * removed boston dataset from the file * updating the DOCs * adding a DOC for calculating the error * exchanged the order started writing functions on scoring the imputers * finished writing functions for imputers * finished writing functions and started on DOcs * working on the DOCs for imputers * cleaning up * flake8 * cleaning up * cleaning up * restructuring the document * further text restructuring * text restructuring * flake8 * reformatting * flake8 * Update examples/impute/plot_missing_values.py Co-Authored-By: Olivier Grisel <olivier.grisel@ensta.org> * updated the intro * improve bullet point rendering * spelling * Update examples/impute/plot_missing_values.py Co-Authored-By: Alexandre Gramfort <alexandre.gramfort@m4x.org> * Update examples/impute/plot_missing_values.py Co-Authored-By: Alexandre Gramfort <alexandre.gramfort@m4x.org> * Update examples/impute/plot_missing_values.py Co-Authored-By: Alexandre Gramfort <alexandre.gramfort@m4x.org> * Update examples/impute/plot_missing_values.py Co-Authored-By: Alexandre Gramfort <alexandre.gramfort@m4x.org> * Update examples/impute/plot_missing_values.py Co-Authored-By: Alexandre Gramfort <alexandre.gramfort@m4x.org> * Update examples/impute/plot_missing_values.py Co-Authored-By: Alexandre Gramfort <alexandre.gramfort@m4x.org> * changed the naming * restructuring text * Update examples/impute/plot_missing_values.py Co-Authored-By: Alexandre Gramfort <alexandre.gramfort@m4x.org> * flake8 * Update examples/impute/plot_missing_values.py Co-Authored-By: Alexandre Gramfort <alexandre.gramfort@m4x.org> * Update examples/impute/plot_missing_values.py Co-Authored-By: Alexandre Gramfort <alexandre.gramfort@m4x.org> * Update examples/impute/plot_missing_values.py Co-Authored-By: Alexandre Gramfort <alexandre.gramfort@m4x.org> * Update examples/impute/plot_missing_values.py Co-Authored-By: Alexandre Gramfort <alexandre.gramfort@m4x.org> * Update 
examples/impute/plot_missing_values.py Co-Authored-By: Alexandre Gramfort <alexandre.gramfort@m4x.org> * changing missing values from 0 to nan * Update examples/impute/plot_missing_values.py Co-Authored-By: Alexandre Gramfort <alexandre.gramfort@m4x.org> * REGRESSOR to regressor * Update examples/impute/plot_missing_values.py Co-Authored-By: Lucy Liu <jliu176@gmail.com> * Update examples/impute/plot_missing_values.py Co-Authored-By: Lucy Liu <jliu176@gmail.com> * Update examples/impute/plot_missing_values.py Co-Authored-By: Lucy Liu <jliu176@gmail.com> * Update examples/impute/plot_missing_values.py Co-Authored-By: Lucy Liu <jliu176@gmail.com> * Update examples/impute/plot_missing_values.py Co-Authored-By: Lucy Liu <jliu176@gmail.com> * Update examples/impute/plot_missing_values.py Co-Authored-By: Lucy Liu <jliu176@gmail.com> * Update examples/impute/plot_missing_values.py Co-Authored-By: Lucy Liu <jliu176@gmail.com> * Update examples/impute/plot_missing_values.py Co-Authored-By: Lucy Liu <jliu176@gmail.com> * Update examples/impute/plot_missing_values.py Co-Authored-By: Lucy Liu <jliu176@gmail.com> * Update examples/impute/plot_missing_values.py Co-Authored-By: Lucy Liu <jliu176@gmail.com> * Update examples/impute/plot_missing_values.py Co-Authored-By: Lucy Liu <jliu176@gmail.com> * Update examples/impute/plot_missing_values.py Co-Authored-By: Lucy Liu <jliu176@gmail.com> * flake8 * Update examples/impute/plot_missing_values.py Co-Authored-By: Thomas J Fan <thomasjpfan@gmail.com> * Update examples/impute/plot_missing_values.py Co-Authored-By: Thomas J Fan <thomasjpfan@gmail.com> * Update examples/impute/plot_missing_values.py Co-Authored-By: Thomas J Fan <thomasjpfan@gmail.com> * Update examples/impute/plot_missing_values.py Co-Authored-By: Thomas J Fan <thomasjpfan@gmail.com> * flake8 * reducting number of samples used from california dataset * CLN Removes the need for MissingIndicator * FIX Unrelated bug but is stopping the CI from passing Co-authored-by: 
Olivier Grisel <olivier.grisel@ensta.org> Co-authored-by: Alexandre Gramfort <alexandre.gramfort@m4x.org> Co-authored-by: Lucy Liu <jliu176@gmail.com> Co-authored-by: Thomas J Fan <thomasjpfan@gmail.com>
Reference Issues/PRs
towards #16155
This PR works towards removing the Boston dataset from scikit-learn.
It replaces the Boston dataset with the California housing dataset in the missing-values example.
What does this implement/fix? Explain your changes.
Any other comments?