ENH support for missing values in ExtraTrees · Issue #27931 · scikit-learn/scikit-learn · GitHub
ENH support for missing values in ExtraTrees  #27931
Closed
@mglowacki100

Description

Describe the workflow you want to enable

Inspired by #26391, I think support for missing values could also be provided for the ExtraTrees regressor and classifier.

Describe your proposed solution

I think the foundational work is already provided by @thomasjpfan in #26391. Apart from tests and documentation, enabling NaN handling should only require modifying sklearn/tree/_classes.py:
For ExtraTreeRegressor, add the method:

    def _more_tags(self):
        # XXX: nan is only supported for dense arrays, but we set this for the
        # common tests to pass, specifically: check_estimators_nan_inf
        allow_nan = self.criterion in {
            "squared_error",
            "friedman_mse",
            "poisson",
        }
        return {"allow_nan": allow_nan}

For ExtraTreeClassifier, add the method:

    def _more_tags(self):
        # XXX: nan is only supported for dense arrays, but we set this for the
        # common tests to pass, specifically: check_estimators_nan_inf
        allow_nan = self.criterion in {
            "gini",
            "log_loss",
            "entropy",
        }
        return {"multilabel": True, "allow_nan": allow_nan}
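
To illustrate how the proposed tag depends on the criterion, here is a minimal standalone sketch. It does not import scikit-learn: `TagSketch`, `RegressorSketch`, `ClassifierSketch` and `allows_nan` are hypothetical names invented for this example, and only the criterion sets are taken from the snippets above. It shows that a criterion outside the set (for example "absolute_error") would leave `allow_nan` set to False.

```python
class TagSketch:
    """Hypothetical stand-in for an estimator; this is not scikit-learn code."""

    # Criteria for which the proposal would advertise NaN support.
    NAN_CRITERIA = frozenset()

    def __init__(self, criterion):
        self.criterion = criterion

    def _more_tags(self):
        # Same membership test as in the proposed methods.
        return {"allow_nan": self.criterion in self.NAN_CRITERIA}


class RegressorSketch(TagSketch):
    # Set copied from the proposed ExtraTreeRegressor method.
    NAN_CRITERIA = frozenset({"squared_error", "friedman_mse", "poisson"})


class ClassifierSketch(TagSketch):
    # Set copied from the proposed ExtraTreeClassifier method.
    NAN_CRITERIA = frozenset({"gini", "log_loss", "entropy"})

    def _more_tags(self):
        tags = super()._more_tags()
        tags["multilabel"] = True  # kept, as in the proposed classifier method
        return tags


def allows_nan(estimator):
    """Return the allow_nan tag of a sketch estimator."""
    return estimator._more_tags()["allow_nan"]


if __name__ == "__main__":
    for criterion in ("squared_error", "absolute_error"):
        print(criterion, allows_nan(RegressorSketch(criterion)))
```

Whether the real estimators handle NaN correctly at fit and predict time is a separate question that this sketch does not address; it only mirrors the tag logic.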

I've run the code locally, and it appears to be functioning as expected. However, I must emphasize that my testing was not exhaustive, and I might have overlooked some obvious aspects.

Describe alternatives you've considered, if relevant

No response

Additional context

No response
