8000 [MRG + 1] Fixes issue #4304: SpectralClustering should be explicit about include_self by vinayak-mehta · Pull Request #4313 · scikit-learn/scikit-learn · GitHub
[go: up one dir, main page]

Skip to content

[MRG + 1] Fixes issue #4304: SpectralClustering should be explicit about include_self #4313

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Mar 2, 2015

Conversation

vinayak-mehta
Copy link
Contributor

@amueller
Copy link
Member
amueller commented Mar 2, 2015

We need to check if this changed any behavior, lets say in the examples.
Also, we might need to document the change somehow.

@vinayak-mehta
Copy link
Contributor Author

Can you guide me on how to check the changed behavior and how to document the change?

@vinayak-mehta
Copy link
Contributor Author

We could add that SpectralClustering explicitly passes True for include_self in doc/modules/clustering.rst or neighbors.rst? And/or in the sklearn/neighbors/graph.py?

@@ -423,7 +423,7 @@ def fit(self, X, y=None):
"set ``affinity=precomputed``.")

if self.affinity == 'nearest_neighbors':
connectivity = kneighbors_graph(X, n_neighbors=self.n_neighbors)
connectivity = kneighbors_graph(X, n_neighbors=self.n_neighbors, True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these should be explicitly passed by name

@vinayak-mehta
Copy link
Contributor Author

@jnothman passed it explicitly by name.

@amueller
Copy link
Member
amueller commented Mar 2, 2015

It seems there are currently not many examples using spectral clustering. Maybe check if http://scikit-learn.org/dev/auto_examples/cluster/plot_cluster_comparison.html changed in your branch, and also http://scikit-learn.org/dev/auto_examples/cluster/plot_lena_segmentation.html

@vinayak-mehta
Copy link
Contributor Author

@amueller - Checked plot_cluster_comparison and plot_lena_segmentation on master and feature branches, the DeprecationWarning was removed in the feature branch where include_self=True, for plot_cluster_comparison. plot_lena_segmentation didn't give any warnings in both cases. But I'm getting a UserWarning in plot_cluster_comparison.

UserWarning: Graph is not fully connected, spectral embedding may not work as expected.

In master branch:

/home/vortex/.local/lib/python3.4/site-packages/sklearn/neighbors/graph.py:36: DeprecationWarning: The behavior of 'kneighbors_graph' when mode='connectivity' will change in version 0.18. Presently, the nearest neighbor of each sample is the sample itself. Beginning in version 0.18, the default behavior will be to exclude each sample from being its own nearest neighbor. To maintain the current behavior, set include_self=True.
  "behavior, set include_self=True.", DeprecationWarning)
/home/vortex/.local/lib/python3.4/site-packages/sklearn/neighbors/graph.py:36: DeprecationWarning: The behavior of 'kneighbors_graph' when mode='connectivity' will change in version 0.18. Presently, the nearest neighbor of each sample is the sample itself. Beginning in version 0.18, the default behavior will be to exclude each sample from being its own nearest neighbor. To maintain the current behavior, set include_self=True.
  "behavior, set include_self=True.", DeprecationWarning)
/home/vortex/.local/lib/python3.4/site-packages/sklearn/manifold/spectral_embedding_.py:215: UserWarning: Graph is not fully connected, spectral embedding may not work as expected.
  warnings.warn("Graph is not fully connected, spectral embedding"
/home/vortex/.local/lib/python3.4/site-packages/sklearn/cluster/hierarchical.py:205: UserWarning: the number of connected components of the connectivity matrix is 2 > 1. Completing it to avoid stopping the tree early.
  connectivity, n_components = _fix_connectivity(X, connectivity)
/home/vortex/.local/lib/python3.4/site-packages/sklearn/cluster/hierarchical.py:440: UserWarning: the number of connected components of the connectivity matrix is 2 > 1. Completing it to avoid stopping the tree early.
  connectivity, n_components = _fix_connectivity(X, connectivity)
/home/vortex/.local/lib/python3.4/site-packages/sklearn/neighbors/graph.py:36: DeprecationWarning: The behavior of 'kneighbors_graph' when mode='connectivity' will change in version 0.18. Presently, the nearest neighbor of each sample is the sample itself. Beginning in version 0.18, the default behavior will be to exclude each sample from being its own nearest neighbor. To maintain the current behavior, set include_self=True.
  "behavior, set include_self=True.", DeprecationWarning)
/home/vortex/.local/lib/python3.4/site-packages/sklearn/manifold/spectral_embedding_.py:215: UserWarning: Graph is not fully connected, spectral embedding may not work as expected.
  warnings.warn("Graph is not fully connected, spectral embedding"
/home/vortex/.local/lib/python3.4/site-packages/sklearn/cluster/hierarchical.py:205: UserWarning: the number of connected components of the connectivity matrix is 3 > 1. Completing it to avoid stopping the tree early.
  connectivity, n_components = _fix_connectivity(X, connectivity)
/home/vortex/.local/lib/python3.4/site-packages/sklearn/cluster/hierarchical.py:440: UserWarning: the number of connected components of the connectivity matrix is 3 > 1. Completing it to avoid stopping the tree early.
  connectivity, n_components = _fix_connectivity(X, connectivity)
/home/vortex/.local/lib/python3.4/site-packages/sklearn/neighbors/graph.py:36: DeprecationWarning: The behavior of 'kneighbors_graph' when mode='connectivity' will change in version 0.18. Presently, the nearest neighbor of each sample is the sample itself. Beginning in version 0.18, the default behavior will be to exclude each sample from being its own nearest neighbor. To maintain the current behavior, set include_self=True.
  "behavior, set include_self=True.", DeprecationWarning)

In feature branch:

/home/vortex/.local/lib/python3.4/site-packages/sklearn/manifold/spectral_embedding_.py:213: UserWarning: Graph is not fully connected, spectral embedding may not work as expected.
  warnings.warn("Graph is not fully connected, spectral embedding"
/home/vortex/.local/lib/python3.4/site-packages/sklearn/cluster/hierarchical.py:200: UserWarning: the number of connected components of the connectivity matrix is 2 > 1. Completing it to avoid stopping the tree early.
  n_components=n_components)
/home/vortex/.local/lib/python3.4/site-packages/sklearn/cluster/hierarchical.py:429: UserWarning: the number of connected components of the connectivity matrix is 2 > 1. Completing it to avoid stopping the tree early.
  n_components=n_components)
/home/vortex/.local/lib/python3.4/site-packages/sklearn/manifold/spectral_embedding_.py:213: UserWarning: Graph is not fully connected, spectral embedding may not work as expected.
  warnings.warn("Graph is not fully connected, spectral embedding"
/home/vortex/.local/lib/python3.4/site-packages/sklearn/cluster/hierarchical.py:200: UserWarning: the number of connected components of the connectivity matrix is 3 > 1. Completing it to avoid stopping the tree early.
  n_components=n_components)
/home/vortex/.local/lib/python3.4/site-packages/sklearn/cluster/hierarchical.py:429: UserWarning: the number of connected components of the connectivity matrix is 3 > 1. Completing it to avoid stopping the tree early.
  n_components=n_components)

@amueller
Copy link
Member
amueller commented Mar 2, 2015

Oh right, so we didn't actually change behavior. Then 👍 for merge.

@amueller amueller changed the title Fixes issue #4304: SpectralClustering should be explicit about include_self [MRG + 1] Fixes issue #4304: SpectralClustering should be explicit about include_self Mar 2, 2015
@amueller amueller added this to the 0.16 milestone Mar 2, 2015
@ogrisel
Copy link
Member
ogrisel commented Mar 2, 2015

+1 as well.

ogrisel added a commit that referenced this pull request Mar 2, 2015
[MRG + 1] Fixes issue #4304: SpectralClustering should be explicit about include_self
@ogrisel ogrisel merged commit 9e20fb6 into scikit-learn:master Mar 2, 2015
@vinayak-mehta vinayak-mehta deleted the spectral_clustering branch September 11, 2015 06:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants
0