FEA Add DecisionBoundaryDisplay#16061
Conversation
|
I appreciate the benefit to reduce code duplication in our examples, but is
this useful outside of teaching?
|
|
I had the same discussion with @amueller a few months ago regarding including this plot function. I think it is useful only in teaching, but we do a bunch of teaching in our examples. |
|
and it will make life much easier for those teaching with sklearn (though that's admittedly a slippery slope). It certainly declutters the examples. |
|
supersedes #5070 |
There was a problem hiding this comment.
+1 for this.
Not just useful for teaching but also for people learning / experimenting on their own, which represent a huge part of our users I believe
A few comments but looks good.
@thomasjpfan There seem however to be a significant difference between the plots in the examples eg
https://scikit-learn.org/stable/auto_examples/classification/plot_classifier_comparison.html
versus
https://88180-843222-gh.circle-artifacts.com/0/doc/auto_examples/classification/plot_classifier_comparison.html
There was a problem hiding this comment.
Codewise, I think this completely fine. Just some questions regarding the API, the location of the function, and what we expect when we are using it.
| estimator : estimator instance | ||
| Trained estimator. | ||
|
|
||
| X : array-like of shape (n_samples, 2) |
There was a problem hiding this comment.
Since we are passing X, maybe we could have an option to make the scatter plot at the same time and to set the x_min/x_max/y_min/y_max limits. Basically, I think that most of the time, we plotting both the boundary and the scatter plot.
There was a problem hiding this comment.
I included this at the start, but I felt like it inflated the API. As in, we may need a scater_kwargs in the plot_decisions_boundary.
In its current form, plotting the scatter plot isn't too bad:
fig, ax = plt.subplots()
plot_decision_boundary(est, X, y, ax=ax)
ax.scatter(X[:, 0], X[:, 1], ...)It's common to want to adjust the alpha when there are too many points (or subsample).
There was a problem hiding this comment.
I think that it would be fine to have scatter_kwargs in fact, I would like to have to write the minimum lines of code, especially xlim, ylim, xlabel, ylabel, alpha, etc., for the largest number of our examples. Then, when one wants to fine-tune it will come back as efficient to write some matplotlib.
| Axes object to plot on. If `None`, a new figure and axes is | ||
| created. | ||
|
|
||
| **kwargs : dict |
There was a problem hiding this comment.
I am also thinking that we should have a way to decorate xlabel and ylabel if we are passing a dataframe using the column names.
| @pytest.fixture(scope="module") | ||
| def data(): | ||
| X, y = make_classification(n_informative=1, n_redundant=1, | ||
| n_clusters_per_class=1, n_features=2) |
There was a problem hiding this comment.
| n_clusters_per_class=1, n_features=2) | |
| n_clusters_per_class=1, n_features=2 | |
| random_state=42) |
sklearn/setup.py
Outdated
| config.add_subpackage('impute/tests') | ||
| config.add_subpackage('inspection') | ||
| config.add_subpackage('inspection/tests') | ||
| config.add_subpackage('inspection/_plot') |
There was a problem hiding this comment.
I don't think this is necessary because I added a setup.py in the inspection module.
|
CI failing |
I'm giving this PR another pass. I see there are still some comments left to address. |
|
I updated this PR addressing most of the remaining concerns. The remaining comment is to support |
4cc6c51 to
fdbd493
Compare
There was a problem hiding this comment.
LGTM The PR still looks good.
There was a problem hiding this comment.
LGTM.
The resolution of the grid seems a little bit lower than the current one in several examples, which makes the boundary a little bit more dented but it's a really subtle difference.
I agree with what was said previously, adding the possibility to pass the pos_label can be done in a separate PR.
Does anyone have second thoughts before we merge ?
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com> Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org> Co-authored-by: Loïc Estève <loic.esteve@ymail.com>
This PR adds
BoundaryDecisionDisplayto the inspection module.What does this implement/fix? Explain your changes.
Many examples were touched because they can take advantage of the new function.