-
-
Notifications
You must be signed in to change notification settings - Fork 26k
[MRG + 1] Fixes issue #4304: SpectralClustering should be explicit about include_self #4313
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MRG + 1] Fixes issue #4304: SpectralClustering should be explicit about include_self #4313
Conversation
We need to check if this changed any behavior, lets say in the examples. |
Can you guide me on how to check the changed behavior and how to document the change? |
We could add that SpectralClustering explicitly passes True for include_self in doc/modules/clustering.rst or neighbors.rst? And/or in the sklearn/neighbors/graph.py? |
@@ -423,7 +423,7 @@ def fit(self, X, y=None): | |||
"set ``affinity=precomputed``.") | |||
|
|||
if self.affinity == 'nearest_neighbors': | |||
connectivity = kneighbors_graph(X, n_neighbors=self.n_neighbors) | |||
connectivity = kneighbors_graph(X, n_neighbors=self.n_neighbors, True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these should be explicitly passed by name
@jnothman passed it explicitly by name. |
It seems there are currently not many examples using spectral clustering. Maybe check if http://scikit-learn.org/dev/auto_examples/cluster/plot_cluster_comparison.html changed in your branch, and also http://scikit-learn.org/dev/auto_examples/cluster/plot_lena_segmentation.html |
@amueller - Checked plot_cluster_comparison and plot_lena_segmentation on master and feature branches, the DeprecationWarning was removed in the feature branch where include_self=True, for plot_cluster_comparison. plot_lena_segmentation didn't give any warnings in both cases. But I'm getting a UserWarning in plot_cluster_comparison. UserWarning: Graph is not fully connected, spectral embedding may not work as expected. In master branch: /home/vortex/.local/lib/python3.4/site-packages/sklearn/neighbors/graph.py:36: DeprecationWarning: The behavior of 'kneighbors_graph' when mode='connectivity' will change in version 0.18. Presently, the nearest neighbor of each sample is the sample itself. Beginning in version 0.18, the default behavior will be to exclude each sample from being its own nearest neighbor. To maintain the current behavior, set include_self=True. "behavior, set include_self=True.", DeprecationWarning) /home/vortex/.local/lib/python3.4/site-packages/sklearn/neighbors/graph.py:36: DeprecationWarning: The behavior of 'kneighbors_graph' when mode='connectivity' will change in version 0.18. Presently, the nearest neighbor of each sample is the sample itself. Beginning in version 0.18, the default behavior will be to exclude each sample from being its own nearest neighbor. To maintain the current behavior, set include_self=True. "behavior, set include_self=True.", DeprecationWarning) /home/vortex/.local/lib/python3.4/site-packages/sklearn/manifold/spectral_embedding_.py:215: UserWarning: Graph is not fully connected, spectral embedding may not work as expected. warnings.warn("Graph is not fully connected, spectral embedding" /home/vortex/.local/lib/python3.4/site-packages/sklearn/cluster/hierarchical.py:205: UserWarning: the number of connected components of the connectivity matrix is 2 > 1. Completing it to avoid stopping the tree early. connectivity, n_components = _fix_connectivity(X, connectivity) /home/vortex/.local/lib/python3.4/site-packages/sklearn/cluster/hierarchical.py:440: UserWarning: the number of connected components of the connectivity matrix is 2 > 1. Completing it to avoid stopping the tree early. connectivity, n_components = _fix_connectivity(X, connectivity) /home/vortex/.local/lib/python3.4/site-packages/sklearn/neighbors/graph.py:36: DeprecationWarning: The behavior of 'kneighbors_graph' when mode='connectivity' will change in version 0.18. Presently, the nearest neighbor of each sample is the sample itself. Beginning in version 0.18, the default behavior will be to exclude each sample from being its own nearest neighbor. To maintain the current behavior, set include_self=True. "behavior, set include_self=True.", DeprecationWarning) /home/vortex/.local/lib/python3.4/site-packages/sklearn/manifold/spectral_embedding_.py:215: UserWarning: Graph is not fully connected, spectral embedding may not work as expected. warnings.warn("Graph is not fully connected, spectral embedding" /home/vortex/.local/lib/python3.4/site-packages/sklearn/cluster/hierarchical.py:205: UserWarning: the number of connected components of the connectivity matrix is 3 > 1. Completing it to avoid stopping the tree early. connectivity, n_components = _fix_connectivity(X, connectivity) /home/vortex/.local/lib/python3.4/site-packages/sklearn/cluster/hierarchical.py:440: UserWarning: the number of connected components of the connectivity matrix is 3 > 1. Completing it to avoid stopping the tree early. connectivity, n_components = _fix_connectivity(X, connectivity) /home/vortex/.local/lib/python3.4/site-packages/sklearn/neighbors/graph.py:36: DeprecationWarning: The behavior of 'kneighbors_graph' when mode='connectivity' will change in version 0.18. Presently, the nearest neighbor of each sample is the sample itself. Beginning in version 0.18, the default behavior will be to exclude each sample from being its own nearest neighbor. To maintain the current behavior, set include_self=True. "behavior, set include_self=True.", DeprecationWarning) In feature branch: /home/vortex/.local/lib/python3.4/site-packages/sklearn/manifold/spectral_embedding_.py:213: UserWarning: Graph is not fully connected, spectral embedding may not work as expected. warnings.warn("Graph is not fully connected, spectral embedding" /home/vortex/.local/lib/python3.4/site-packages/sklearn/cluster/hierarchical.py:200: UserWarning: the number of connected components of the connectivity matrix is 2 > 1. Completing it to avoid stopping the tree early. n_components=n_components) /home/vortex/.local/lib/python3.4/site-packages/sklearn/cluster/hierarchical.py:429: UserWarning: the number of connected components of the connectivity matrix is 2 > 1. Completing it to avoid stopping the tree early. n_components=n_components) /home/vortex/.local/lib/python3.4/site-packages/sklearn/manifold/spectral_embedding_.py:213: UserWarning: Graph is not fully connected, spectral embedding may not work as expected. warnings.warn("Graph is not fully connected, spectral embedding" /home/vortex/.local/lib/python3.4/site-packages/sklearn/cluster/hierarchical.py:200: UserWarning: the number of connected components of the connectivity matrix is 3 > 1. Completing it to avoid stopping the tree early. n_components=n_components) /home/vortex/.local/lib/python3.4/site-packages/sklearn/cluster/hierarchical.py:429: UserWarning: the number of connected components of the connectivity matrix is 3 > 1. Completing it to avoid stopping the tree early. n_components=n_components) |
Oh right, so we didn't actually change behavior. Then 👍 for merge. |
+1 as well. |
[MRG + 1] Fixes issue #4304: SpectralClustering should be explicit about include_self
#4304