Add `show_data=False` option to `DecisionBoundaryDisplay.from_estimator` · Issue #33094 · scikit-learn/scikit-learn · GitHub
Add show_data=False option to DecisionBoundaryDisplay.from_estimator #33094

@ogrisel

Describe the workflow you want to enable

While reviewing @AnneBeyer's #33015 PR, I thought it would be nice to be able to display the decision boundaries of an estimator along with the data points using consistent color maps (at least for classifiers, outlier detectors and regressors), instead of having to manually draw a scatter plot on the same ax instance.

Describe your proposed solution

Add a new show_data boolean kwarg to the from_estimator method, which already accepts the X values as an argument. The from_estimator method would then call estimator.predict(X) to find the target values used to color the dots of the scatter plot.

The from_estimator method could also take an optional y argument with the true class labels. When provided, the scatter plot enabled by show_data=True would use the provided y values instead of estimator.predict(X).

The __init__ of the display would then also accept the X and y values.

Ideally, the legend would make it explicit whether the scatter dots represent predictions or true labels, and could be overridden by a data_label kwarg passed to from_estimator.

Maybe show_data=True could even be the default.

Describe alternatives you've considered, if relevant

While it's possible to draw the scatter plot manually, it's quite verbose, and it's not obvious whether the colors used for the two plots are expected to be consistent.

I think people often want to overlay the training set with the decision boundary when generating such plots for educational purposes.

Additional context

@AnneBeyer if you think this is a good feature to add, would you be interested in contributing it (as a follow-up to #33015 and related PRs)?

Also cc @lucyleeow @glemaitre in case you already discussed this in the past.
