Exchanging Boston for california dataset in plot missing values by maikia · Pull Request #16513 · scikit-learn/scikit-learn · GitHub

Exchanging Boston for california dataset in plot missing values #16513

Merged
merged 73 commits into from
Apr 28, 2020

Conversation

maikia
Contributor
@maikia maikia commented Feb 21, 2020

Reference Issues/PRs

towards #16155

This PR works towards removing the Boston dataset from scikit-learn.
It replaces the Boston dataset with the California housing dataset in the missing values example.

Before:
before

After:
after

What does this implement/fix? Explain your changes.

Any other comments?

@maikia maikia changed the title [WIP] Exchanging Boston for california dataset in plot missing values Exchanging Boston for california dataset in plot missing values Feb 24, 2020
@maikia maikia requested a review from ogrisel February 24, 2020 15:19
Member
@ogrisel ogrisel left a comment

Maybe the first paragraph could mention the list of strategies investigated in this example:

  • imputation by the constant 0 value
  • imputation by the mean value of each feature combined with a missing-ness indicator auxiliary variable
  • k nearest neighbor imputation
  • iterative imputation

But otherwise LGTM. Thanks very much.
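
For readers following along, the four strategies listed above can be sketched with scikit-learn's imputers roughly as follows. This is a minimal illustration on a small synthetic array, not the code from `plot_missing_values.py`: the array shape, the deterministic missing-value pattern, and the names `X_missing`, `imputers`, and `imputed` are assumptions made for the sketch.

```python
import numpy as np
# IterativeImputer is experimental and must be enabled explicitly.
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer, KNNImputer, SimpleImputer

# Hypothetical toy data: 50 samples, 4 features.
rng = np.random.RandomState(0)
X = rng.normal(size=(50, 4))

# Assumed missing pattern: row i loses feature i % 4, so every row has
# exactly one missing entry and every column has some missing entries.
rows = np.arange(X.shape[0])
X_missing = X.copy()
X_missing[rows, rows % 4] = np.nan

imputers = {
    # imputation by the constant 0 value
    "zero": SimpleImputer(strategy="constant", fill_value=0),
    # mean imputation combined with a missing-ness indicator
    "mean+indicator": SimpleImputer(strategy="mean", add_indicator=True),
    # k nearest neighbor imputation
    "knn": KNNImputer(n_neighbors=5),
    # iterative imputation
    "iterative": IterativeImputer(random_state=0),
}

imputed = {name: imp.fit_transform(X_missing) for name, imp in imputers.items()}

for name, X_imp in imputed.items():
    print(name, X_imp.shape, "NaNs left:", int(np.isnan(X_imp).sum()))
```

With `add_indicator=True`, the mean imputer appends one binary column per feature that had missing values during `fit`, which is why its output is wider than the input.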

@ogrisel
Member
ogrisel commented Feb 26, 2020

The doc artifact link is broken but the rendered example is still online:

https://85-171681175-gh.circle-artifacts.com/0/doc/auto_examples/impute/plot_missing_values.html

maikia and others added 3 commits March 11, 2020 16:08
Co-Authored-By: Olivier Grisel <olivier.grisel@ensta.org>
maikia and others added 9 commits April 16, 2020 18:26
Co-Authored-By: Lucy Liu <jliu176@gmail.com>
Co-Authored-By: Lucy Liu <jliu176@gmail.com>
Co-Authored-By: Lucy Liu <jliu176@gmail.com>
Co-Authored-By: Lucy Liu <jliu176@gmail.com>
Co-Authored-By: Lucy Liu <jliu176@gmail.com>
Co-Authored-By: Lucy Liu <jliu176@gmail.com>
Co-Authored-By: Lucy Liu <jliu176@gmail.com>
@cmarmo
Contributor
cmarmo commented Apr 24, 2020

@scikit-learn/core-devs time to merge this one? Thanks!

Member
@thomasjpfan thomasjpfan left a comment

In a follow-up PR, it would be nice to rename get_scores_for_imputer to get_mse_for_imputer and do the negation inside the function. This way, we would not need to run:

mses_diabetes = mses_diabetes * -1
mses_california = mses_california * -1

before plotting, which can be a little confusing.
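
A rough sketch of what such a follow-up could look like is given below. Only the two function names come from the comment above. The body, the `N_SPLITS` constant, the `Ridge` regressor, and the synthetic data are assumptions chosen to keep the sketch small and fast, and the helper in the actual example may differ.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.impute import SimpleImputer
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

N_SPLITS = 5  # assumed number of cross-validation folds


def get_mse_for_imputer(imputer, X_missing, y_missing, regressor=None):
    """Return the per-fold mean squared error (positive values)."""
    if regressor is None:
        regressor = Ridge()  # assumption: any regressor could be used here
    estimator = make_pipeline(imputer, regressor)
    # scikit-learn scorers follow "greater is better", so the MSE scorer
    # returns negated values; flip the sign once, inside the function.
    neg_mse = cross_val_score(
        estimator,
        X_missing,
        y_missing,
        scoring="neg_mean_squared_error",
        cv=N_SPLITS,
    )
    return -neg_mse


# Hypothetical usage on synthetic data with one missing entry per row.
X, y = make_regression(n_samples=100, n_features=5, noise=1.0, random_state=0)
X_missing = X.copy()
rows = np.arange(X.shape[0])
X_missing[rows, rows % 5] = np.nan

mses = get_mse_for_imputer(SimpleImputer(strategy="mean"), X_missing, y)
print(mses.mean(), mses.std())
```

Because the sign flip happens in one place, callers can plot the returned values directly without multiplying by -1 first.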

@thomasjpfan thomasjpfan merged commit fb76de7 into scikit-learn:master Apr 28, 2020
@thomasjpfan
Member

Thank you @maikia !

For release #17010

@maikia maikia deleted the boston_plot_missing_values branch April 28, 2020 06:43
adrinjalali pushed a commit that referenced this pull request Apr 30, 2020
…16513)

* first few comments

* added new california dataset

* removed boston dataset from the file

* updating the DOCs

* adding a DOC for calculating the error

* exchanged the order started writing functions on scoring the imputers

* finished writing functions for imputers

* finished writing functions and started on DOcs

* working on the DOCs for imputers

* cleaning up

* flake8

* cleaning up

* cleaning up

* restructuring the document

* further text restructuring

* text restructuring

* flake8

* reformatting

* flake8

* Update examples/impute/plot_missing_values.py

Co-Authored-By: Olivier Grisel <olivier.grisel@ensta.org>

* updated the intro

* improve bullet point rendering

* spelling

* Update examples/impute/plot_missing_values.py

Co-Authored-By: Alexandre Gramfort <alexandre.gramfort@m4x.org>

* Update examples/impute/plot_missing_values.py

Co-Authored-By: Alexandre Gramfort <alexandre.gramfort@m4x.org>

* Update examples/impute/plot_missing_values.py

Co-Authored-By: Alexandre Gramfort <alexandre.gramfort@m4x.org>

* Update examples/impute/plot_missing_values.py

Co-Authored-By: Alexandre Gramfort <alexandre.gramfort@m4x.org>

* Update examples/impute/plot_missing_values.py

Co-Authored-By: Alexandre Gramfort <alexandre.gramfort@m4x.org>

* Update examples/impute/plot_missing_values.py

Co-Authored-By: Alexandre Gramfort <alexandre.gramfort@m4x.org>

* changed the naming

* restructuring text

* Update examples/impute/plot_missing_values.py

Co-Authored-By: Alexandre Gramfort <alexandre.gramfort@m4x.org>

* flake8

* Update examples/impute/plot_missing_values.py

Co-Authored-By: Alexandre Gramfort <alexandre.gramfort@m4x.org>

* Update examples/impute/plot_missing_values.py

Co-Authored-By: Alexandre Gramfort <alexandre.gramfort@m4x.org>

* Update examples/impute/plot_missing_values.py

Co-Authored-By: Alexandre Gramfort <alexandre.gramfort@m4x.org>

* Update examples/impute/plot_missing_values.py

Co-Authored-By: Alexandre Gramfort <alexandre.gramfort@m4x.org>

* Update examples/impute/plot_missing_values.py

Co-Authored-By: Alexandre Gramfort <alexandre.gramfort@m4x.org>

* changing missing values from 0 to nan

* Update examples/impute/plot_missing_values.py

Co-Authored-By: Alexandre Gramfort <alexandre.gramfort@m4x.org>

* REGRESSOR to regressor

* Update examples/impute/plot_missing_values.py

Co-Authored-By: Lucy Liu <jliu176@gmail.com>

* Update examples/impute/plot_missing_values.py

Co-Authored-By: Lucy Liu <jliu176@gmail.com>

* Update examples/impute/plot_missing_values.py

Co-Authored-By: Lucy Liu <jliu176@gmail.com>

* Update examples/impute/plot_missing_values.py

Co-Authored-By: Lucy Liu <jliu176@gmail.com>

* Update examples/impute/plot_missing_values.py

Co-Authored-By: Lucy Liu <jliu176@gmail.com>

* Update examples/impute/plot_missing_values.py

Co-Authored-By: Lucy Liu <jliu176@gmail.com>

* Update examples/impute/plot_missing_values.py

Co-Authored-By: Lucy Liu <jliu176@gmail.com>

* Update examples/impute/plot_missing_values.py

Co-Authored-By: Lucy Liu <jliu176@gmail.com>

* Update examples/impute/plot_missing_values.py

Co-Authored-By: Lucy Liu <jliu176@gmail.com>

* Update examples/impute/plot_missing_values.py

Co-Authored-By: Lucy Liu <jliu176@gmail.com>

* Update examples/impute/plot_missing_values.py

Co-Authored-By: Lucy Liu <jliu176@gmail.com>

* Update examples/impute/plot_missing_values.py

Co-Authored-By: Lucy Liu <jliu176@gmail.com>

* flake8

* Update examples/impute/plot_missing_values.py

Co-Authored-By: Thomas J Fan <thomasjpfan@gmail.com>

* Update examples/impute/plot_missing_values.py

Co-Authored-By: Thomas J Fan <thomasjpfan@gmail.com>

* Update examples/impute/plot_missing_values.py

Co-Authored-By: Thomas J Fan <thomasjpfan@gmail.com>

* Update examples/impute/plot_missing_values.py

Co-Authored-By: Thomas J Fan <thomasjpfan@gmail.com>

* flake8

* reducting number of samples used from california dataset

* CLN Removes the need for MissingIndicator

* FIX Unrelated bug but is stopping the CI from passing

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Alexandre Gramfort <alexandre.gramfort@m4x.org>
Co-authored-by: Lucy Liu <jliu176@gmail.com>
Co-authored-by: Thomas J Fan <thomasjpfan@gmail.com>
gio8tisu pushed a commit to gio8tisu/scikit-learn that referenced this pull request May 15, 2020
viclafargue pushed a commit to viclafargue/scikit-learn that referenced this pull request Jun 26, 2020
6 participants