FEA Add DecisionBoundaryDisplay by thomasjpfan · Pull Request #16061 · scikit-learn/scikit-learn · GitHub

FEA Add DecisionBoundaryDisplay #16061


Merged

79 commits merged on Mar 29, 2022

Commits
c5b9aa0
WIP
thomasjpfan Dec 2, 2019
a82955a
Merge remote-tracking branch 'upstream/master' into plot_decision_bound
thomasjpfan Dec 13, 2019
61a5959
Merge remote-tracking branch 'upstream/master' into plot_decision_bound
thomasjpfan Dec 16, 2019
bf0dc07
Merge remote-tracking branch 'upstream/master' into plot_decision_bound
thomasjpfan Jan 3, 2020
6dd153b
ENH Completely adds decision boundary
thomasjpfan Jan 6, 2020
ed1cae0
TST Adds more tests
thomasjpfan Jan 6, 2020
c7ee84e
STY Linting
thomasjpfan Jan 6, 2020
c3c9c89
Merge remote-tracking branch 'upstream/master' into plot_decision_bound
thomasjpfan Jan 7, 2020
c52eebd
Merge remote-tracking branch 'upstream/master' into plot_decision_bound
thomasjpfan Jan 8, 2020
a8d4e11
BUG Fix
thomasjpfan Jan 8, 2020
5bd2085
Merge remote-tracking branch 'upstream/master' into plot_decision_bound
thomasjpfan Jan 23, 2020
3c633d6
CLN Update response_method order
thomasjpfan Jan 24, 2020
54dbd55
Merge remote-tracking branch 'upstream/master' into plot_decision_bound
thomasjpfan Jan 24, 2020
ae93c3f
CLN Adds links to external libraries
thomasjpfan Jan 24, 2020
67149a4
CLN Adds reference to quad*
thomasjpfan Jan 24, 2020
a6020c9
Merge remote-tracking branch 'upstream/master' into plot_decision_bound
thomasjpfan Jan 27, 2020
2fc3384
Merge remote-tracking branch 'upstream/master' into plot_decision_bound
thomasjpfan Jan 27, 2020
f84108c
Merge remote-tracking branch 'upstream/master' into plot_decision_bound
thomasjpfan Feb 10, 2020
b525638
CLN Address comments
thomasjpfan Feb 20, 2020
4b913fe
Merge remote-tracking branch 'upstream/master' into plot_decision_bound
thomasjpfan Feb 20, 2020
69fbabf
CLN Address comments
thomasjpfan Feb 20, 2020
a0addef
BUG Fix
thomasjpfan Feb 20, 2020
b5781ec
Merge remote-tracking branch 'upstream/master' into plot_decision_bound
thomasjpfan Feb 21, 2020
030808d
Merge remote-tracking branch 'upstream/master' into plot_decision_bound
thomasjpfan Mar 10, 2020
dd72cd0
Merge remote-tracking branch 'upstream/master' into plot_decision_bound
thomasjpfan Apr 22, 2020
731bc23
CLN Move to utils
thomasjpfan Apr 22, 2020
e1e3df2
BUG Fix
thomasjpfan Apr 22, 2020
afb8594
ENH Move to utils
thomasjpfan Apr 22, 2020
4d4ffe7
BLD Fixes build error
thomasjpfan Apr 22, 2020
5710473
FIX Bug
thomasjpfan Apr 22, 2020
1bfb6cf
CLN Move back to inspection
thomasjpfan Apr 23, 2020
43bef48
Merge remote-tracking branch 'upstream/master' into plot_decision_bound
thomasjpfan Jul 9, 2020
eb044be
DOC Adds whats new
thomasjpfan Jul 9, 2020
0bb66ff
merge master
glemaitre Aug 18, 2020
2738106
Merge remote-tracking branch 'upstream/master' into plot_decision_bound
thomasjpfan Aug 22, 2020
1e3426f
Merge remote-tracking branch 'origin/main' into pr/thomasjpfan/16061
glemaitre Aug 6, 2021
d504414
API move to new plotting API
glemaitre Aug 6, 2021
7fd2181
iter
glemaitre Aug 6, 2021
a69315a
rename file and avoid warning
glemaitre Aug 6, 2021
8235a74
FEA allow to set x-/y-label
glemaitre Aug 6, 2021
5c0273f
DOC fix
glemaitre Aug 6, 2021
f779972
avoid to set axis outside display
glemaitre Aug 6, 2021
b3805d6
DOC add example and see also
glemaitre Aug 6, 2021
dd09fd3
iter
glemaitre Aug 6, 2021
a61ba85
Merge remote-tracking branch 'upstream/main' into plot_decision_bound
thomasjpfan Aug 29, 2021
c55ac8e
REV Revert examples for now
thomasjpfan Aug 29, 2021
a0a7938
Merge branch 'main' into plot_decision_bound
ogrisel Sep 3, 2021
766fc46
Merge remote-tracking branch 'upstream/main' into plot_decision_bound
thomasjpfan Sep 5, 2021
ead2423
Merge remote-tracking branch 'upstream/main' into plot_decision_bound
thomasjpfan Sep 5, 2021
88e62fe
ENH Better validation errors
thomasjpfan Sep 6, 2021
39c5e0f
ENH Remvoe
thomasjpfan Oct 23, 2021
1c1d311
Merge remote-tracking branch 'upstream/main' into plot_decision_bound
thomasjpfan Oct 24, 2021
e92e032
ENH Update avaliable methods
thomasjpfan Oct 24, 2021
756b581
Merge remote-tracking branch 'upstream/main' into plot_decision_bound
thomasjpfan Oct 26, 2021
b99b1ab
Merge remote-tracking branch 'upstream/main' into plot_decision_bound
thomasjpfan Nov 29, 2021
fa4ded7
DOC Move to 1.1
thomasjpfan Nov 29, 2021
d97231b
WIP
thomasjpfan Nov 29, 2021
3c1cb4d
FIX Support string as targets
thomasjpfan Nov 29, 2021
fa66c81
STY Fix black formatting
thomasjpfan Nov 29, 2021
4db21b6
Merge branch 'main' of https://github.com/scikit-learn/scikit-learn i…
lesteve Feb 23, 2022
833492c
Fix whats_new
lesteve Feb 23, 2022
9f9b1d1
DOC Fixes whats nwe
thomasjpfan Feb 23, 2022
3e8162c
ENH Improve auto behavior for multiclass problems
thomasjpfan Feb 23, 2022
60d67d9
FIX Removes unneeded code
thomasjpfan Feb 23, 2022
e2d984f
Merge remote-tracking branch 'upstream/main' into plot_decision_bound
thomasjpfan Feb 24, 2022
741ff08
FIX Adds classes to InductiveClusterer
thomasjpfan Feb 24, 2022
86f24e9
ENH Do not require classes
thomasjpfan Feb 24, 2022
9d00d73
DOC Adds TODO
thomasjpfan Feb 24, 2022
1f2d6e6
ENH Improve error message for unsupported estimators
thomasjpfan Feb 24, 2022
702e446
CLN Remove comment
thomasjpfan Feb 24, 2022
0f92314
TST Fixes test error
thomasjpfan Mar 7, 2022
fdbd493
Merge remote-tracking branch 'upstream/main' into plot_decision_bound
thomasjpfan Mar 7, 2022
d649a54
Merge remote-tracking branch 'upstream/main' into plot_decision_bound
thomasjpfan Mar 13, 2022
b864194
FIX Uses labelencoder
thomasjpfan Mar 13, 2022
2c0492e
Merge remote-tracking branch 'upstream/main' into plot_decision_bound
thomasjpfan Mar 15, 2022
aebbb6b
REV Remove unneeded whats new item
thomasjpfan Mar 15, 2022
666b1be
Update sklearn/inspection/_plot/decision_boundary.py
jeremiedbb Mar 25, 2022
67c8fc8
Update sklearn/inspection/_plot/decision_boundary.py
jeremiedbb Mar 25, 2022
67c633a
Merge branch 'main' into plot_decision_bound
jeremiedbb Mar 25, 2022
1 change: 1 addition & 0 deletions doc/modules/classes.rst
@@ -657,6 +657,7 @@ Plotting
:toctree: generated/
:template: class.rst

inspection.DecisionBoundaryDisplay
inspection.PartialDependenceDisplay

.. autosummary::
1 change: 1 addition & 0 deletions doc/visualizations.rst
@@ -96,6 +96,7 @@ Display Objects

calibration.CalibrationDisplay
inspection.PartialDependenceDisplay
inspection.DecisionBoundaryDisplay
metrics.ConfusionMatrixDisplay
metrics.DetCurveDisplay
metrics.PrecisionRecallDisplay
4 changes: 4 additions & 0 deletions doc/whats_new/v1.1.rst
@@ -534,6 +534,10 @@ Changelog
:mod:`sklearn.inspection`
.........................

- |Feature| Add a display to plot the decision boundary of a classifier
using the method :func:`inspection.DecisionBoundaryDisplay.from_estimator`.
:pr:`16061` by `Thomas Fan`_.

- |Enhancement| In
:meth:`~sklearn.inspection.PartialDependenceDisplay.from_estimator` and
:meth:`~sklearn.inspection.PartialDependenceDisplay.from_predictions`, allow
30 changes: 10 additions & 20 deletions examples/classification/plot_classifier_comparison.py
@@ -40,8 +40,7 @@
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

h = 0.02 # step size in the mesh
from sklearn.inspection import DecisionBoundaryDisplay

names = [
"Nearest Neighbors",
@@ -95,7 +94,6 @@

x_min, x_max = X[:, 0].min() - 0.5, X[:, 0].max() + 0.5
y_min, y_max = X[:, 1].min() - 0.5, X[:, 1].max() + 0.5
xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))

# just plot the dataset first
cm = plt.cm.RdBu
@@ -109,8 +107,8 @@
ax.scatter(
X_test[:, 0], X_test[:, 1], c=y_test, cmap=cm_bright, alpha=0.6, edgecolors="k"
)
ax.set_xlim(xx.min(), xx.max())
ax.set_ylim(yy.min(), yy.max())
ax.set_xlim(x_min, x_max)
ax.set_ylim(y_min, y_max)
ax.set_xticks(())
ax.set_yticks(())
i += 1
@@ -120,17 +118,9 @@
ax = plt.subplot(len(datasets), len(classifiers) + 1, i)
clf.fit(X_train, y_train)
score = clf.score(X_test, y_test)

# Plot the decision boundary. For that, we will assign a color to each
# point in the mesh [x_min, x_max]x[y_min, y_max].
if hasattr(clf, "decision_function"):
Z = clf.decision_function(np.c_[xx.ravel(), yy.ravel()])
else:
Z = clf.predict_proba(np.c_[xx.ravel(), yy.ravel()])[:, 1]

# Put the result into a color plot
Z = Z.reshape(xx.shape)
ax.contourf(xx, yy, Z, cmap=cm, alpha=0.8)
DecisionBoundaryDisplay.from_estimator(
clf, X, cmap=cm, alpha=0.8, ax=ax, eps=0.5
)

# Plot the training points
ax.scatter(
@@ -146,15 +136,15 @@
alpha=0.6,
)

ax.set_xlim(xx.min(), xx.max())
ax.set_ylim(yy.min(), yy.max())
ax.set_xlim(x_min, x_max)
ax.set_ylim(y_min, y_max)
ax.set_xticks(())
ax.set_yticks(())
if ds_cnt == 0:
ax.set_title(name)
ax.text(
xx.max() - 0.3,
yy.min() + 0.3,
x_max - 0.3,
y_min + 0.3,
("%.2f" % score).lstrip("0"),
size=15,
horizontalalignment="right",
15 changes: 5 additions & 10 deletions examples/cluster/plot_inductive_clustering.py
@@ -23,12 +23,12 @@
# Authors: Chirag Nagpal
# Christos Aridas

import numpy as np
import matplotlib.pyplot as plt
from sklearn.base import BaseEstimator, clone
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import DecisionBoundaryDisplay
from sklearn.utils.metaestimators import available_if
from sklearn.utils.validation import check_is_fitted

@@ -116,19 +116,14 @@ def plot_scatter(X, color, alpha=0.5):
probable_clusters = inductive_learner.predict(X_new)


plt.subplot(133)
ax = plt.subplot(133)
plot_scatter(X, cluster_labels)
plot_scatter(X_new, probable_clusters)

# Plotting decision regions
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.1), np.arange(y_min, y_max, 0.1))

Z = inductive_learner.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)

plt.contourf(xx, yy, Z, alpha=0.4)
DecisionBoundaryDisplay.from_estimator(
inductive_learner, X, response_method="predict", alpha=0.4, ax=ax
)
plt.title("Classify unknown instances")

plt.show()
24 changes: 13 additions & 11 deletions examples/ensemble/plot_adaboost_twoclass.py
@@ -27,6 +27,7 @@
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import make_gaussian_quantiles
from sklearn.inspection import DecisionBoundaryDisplay


# Construct dataset
@@ -53,16 +54,18 @@
plt.figure(figsize=(10, 5))

# Plot the decision boundaries
plt.subplot(121)
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(
np.arange(x_min, x_max, plot_step), np.arange(y_min, y_max, plot_step)
ax = plt.subplot(121)
disp = DecisionBoundaryDisplay.from_estimator(
bdt,
X,
cmap=plt.cm.Paired,
response_method="predict",
ax=ax,
xlabel="x",
ylabel="y",
)

Z = bdt.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)
cs = plt.contourf(xx, yy, Z, cmap=plt.cm.Paired)
x_min, x_max = disp.xx0.min(), disp.xx0.max()
y_min, y_max = disp.xx1.min(), disp.xx1.max()
plt.axis("tight")

# Plot the training points
@@ -80,8 +83,7 @@
plt.xlim(x_min, x_max)
plt.ylim(y_min, y_max)
plt.legend(loc="upper right")
plt.xlabel("x")
plt.ylabel("y")

plt.title("Decision Boundary")

# Plot the two-class decision scores
15 changes: 4 additions & 11 deletions examples/ensemble/plot_voting_decision_regions.py
@@ -25,14 +25,14 @@

from itertools import product

import numpy as np
import matplotlib.pyplot as plt

from sklearn import datasets
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.ensemble import VotingClassifier
from sklearn.inspection import DecisionBoundaryDisplay

# Loading some example data
iris = datasets.load_iris()
@@ -55,22 +55,15 @@
eclf.fit(X, y)

# Plotting decision regions
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.1), np.arange(y_min, y_max, 0.1))

f, axarr = plt.subplots(2, 2, sharex="col", sharey="row", figsize=(10, 8))

for idx, clf, tt in zip(
product([0, 1], [0, 1]),
[clf1, clf2, clf3, eclf],
["Decision Tree (depth=4)", "KNN (k=7)", "Kernel SVM", "Soft Voting"],
):

Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)

axarr[idx[0], idx[1]].contourf(xx, yy, Z, alpha=0.4)
DecisionBoundaryDisplay.from_estimator(
clf, X, alpha=0.4, ax=axarr[idx[0], idx[1]], response_method="predict"
)
axarr[idx[0], idx[1]].scatter(X[:, 0], X[:, 1], c=y, s=20, edgecolor="k")
axarr[idx[0], idx[1]].set_title(tt)

32 changes: 15 additions & 17 deletions examples/linear_model/plot_iris_logistic.py
@@ -15,10 +15,10 @@
# Modified for documentation by Jaques Grobler
# License: BSD 3 clause

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn import datasets
from sklearn.inspection import DecisionBoundaryDisplay

# import some data to play with
iris = datasets.load_iris()
@@ -29,26 +29,24 @@
logreg = LogisticRegression(C=1e5)
logreg.fit(X, Y)

# Plot the decision boundary. For that, we will assign a color to each
# point in the mesh [x_min, x_max]x[y_min, y_max].
x_min, x_max = X[:, 0].min() - 0.5, X[:, 0].max() + 0.5
y_min, y_max = X[:, 1].min() - 0.5, X[:, 1].max() + 0.5
h = 0.02 # step size in the mesh
xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
Z = logreg.predict(np.c_[xx.ravel(), yy.ravel()])

# Put the result into a color plot
Z = Z.reshape(xx.shape)
plt.figure(1, figsize=(4, 3))
plt.pcolormesh(xx, yy, Z, cmap=plt.cm.Paired)
_, ax = plt.subplots(figsize=(4, 3))
DecisionBoundaryDisplay.from_estimator(
logreg,
X,
cmap=plt.cm.Paired,
ax=ax,
response_method="predict",
plot_method="pcolormesh",
shading="auto",
xlabel="Sepal length",
ylabel="Sepal width",
eps=0.5,
)

# Plot also the training points
plt.scatter(X[:, 0], X[:, 1], c=Y, edgecolors="k", cmap=plt.cm.Paired)
plt.xlabel("Sepal length")
plt.ylabel("Sepal width")

plt.xlim(xx.min(), xx.max())
plt.ylim(yy.min(), yy.max())

plt.xticks(())
plt.yticks(())

18 changes: 5 additions & 13 deletions examples/linear_model/plot_logistic_multinomial.py
@@ -16,6 +16,7 @@
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.inspection import DecisionBoundaryDisplay

# make 3-class dataset for classification
centers = [[-5, 0], [0, 1.5], [5, -1]]
@@ -31,19 +32,10 @@
# print the training scores
print("training score : %.3f (%s)" % (clf.score(X, y), multi_class))

# create a mesh to plot in
h = 0.02 # step size in the mesh
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))

# Plot the decision boundary. For that, we will assign a color to each
# point in the mesh [x_min, x_max]x[y_min, y_max].
Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])
# Put the result into a color plot
Z = Z.reshape(xx.shape)
plt.figure()
plt.contourf(xx, yy, Z, cmap=plt.cm.Paired)
_, ax = plt.subplots()
DecisionBoundaryDisplay.from_estimator(
clf, X, response_method="predict", cmap=plt.cm.Paired, ax=ax
)
plt.title("Decision surface of LogisticRegression (%s)" % multi_class)
plt.axis("tight")

25 changes: 11 additions & 14 deletions examples/linear_model/plot_sgd_iris.py
@@ -13,6 +13,7 @@
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.linear_model import SGDClassifier
from sklearn.inspection import DecisionBoundaryDisplay

# import some data to play with
iris = datasets.load_iris()
@@ -35,21 +36,17 @@
std = X.std(axis=0)
X = (X - mean) / std

h = 0.02 # step size in the mesh

clf = SGDClassifier(alpha=0.001, max_iter=100).fit(X, y)

# create a mesh to plot in
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))

# Plot the decision boundary. For that, we will assign a color to each
# point in the mesh [x_min, x_max]x[y_min, y_max].
Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])
# Put the result into a color plot
Z = Z.reshape(xx.shape)
cs = plt.contourf(xx, yy, Z, cmap=plt.cm.Paired)
ax = plt.gca()
DecisionBoundaryDisplay.from_estimator(
clf,
X,
cmap=plt.cm.Paired,
ax=ax,
response_method="predict",
xlabel=iris.feature_names[0],
ylabel=iris.feature_names[1],
)
plt.axis("tight")

# Plot also the training points