8000 FEA Support missing-values in `ExtraTrees*` by adam2392 · Pull Request #28268 · scikit-learn/scikit-learn · GitHub
[go: up one dir, main page]

Skip to content

FEA Support missing-values in ExtraTrees* #28268

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 171 commits into from
Jul 10, 2024

Conversation

adam2392
Copy link
Member
@adam2392 adam2392 commented Jan 25, 2024

Reference Issues/PRs

Follow-up to: #27966

Closes: #27931

What does this implement/fix? Explain your changes.

  • Adds support and tests for missing-values in the ExtraTrees*

Any other comments?

This should be a very easy review as this just activates the usage of missing value support and adds unit-test parameterizations for the ExtraTree Forest. Just requesting reviews from @thomasjpfan, @glemaitre and @OmarManzoor since you all were able to review the code in #27966. Hope this is easy :)

Waiting for #27966 to be merged first. This just shows the relevant code changes that introduces the missing-value support overall for ExtraTrees. Note the bulk of the work for going from tree->forest missing-value support was added in #26391. The main change here is to introduce certain unit tests.

  • benchmark script ran to demonstrate ExtraTrees* perform much better on data with missing values compared to naive imputation on main
  • benchmark script to compare ExtraTree* compared to naive imputation on main
Benchmarks on ExtraTrees demonstrating there is no runtime regression

results_image

Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Copy link
github-actions bot commented Jan 25, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 963de46. Link to the linter CI: here

adam2392 added 13 commits July 6, 2024 11:40
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Copy link
Contributor
@OmarManzoor OmarManzoor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @adam2392. Would it be suitable for ExtraTreesRegressor to be added to test_non_supported_criterion_raises_error_with_missing_values?

adam2392 and others added 4 commits July 9, 2024 08:09
Co-authored-by: Omar Salman <omar.salman2007@gmail.com>
Co-authored-by: Omar Salman <omar.salman2007@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
@adam2392
Copy link
Member Author
adam2392 commented Jul 9, 2024

Good point @OmarManzoor!

Added in 1441925

Signed-off-by: Adam Li <adam2392@gmail.com>
Copy link
Member
@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise, LGTM

@OmarManzoor OmarManzoor enabled auto-merge (squash) July 10, 2024 11:49
@OmarManzoor OmarManzoor merged commit 775587b into scikit-learn:main Jul 10, 2024
28 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ENH support for missing values in ExtraTrees
4 participants
0