Description
http://scikit-learn.org/stable/modules/generated/sklearn.cluster.spectral_clustering.html generates cluster labels using one of two methods, selected by assign_labels = 'kmeans' or 'discretize', from the embedding computed by diffusion_map in scikit-learn/sklearn/manifold/spectral_embedding_.py
There is a simple new algorithm, called clusterQR, described in https://github.com/asdamle/QR-spectral-clustering and reported to give 100% correct results in https://doi.org/10.1109/HPEC.2017.8091045 or https://arxiv.org/abs/1708.07481. clusterQR costs about the same as, or less than, 'kmeans' and 'discretize', and may be expected to outperform both when the number of clusters is not small.
I suggest adding clusterQR to the scikit-learn code base. The function itself is <10 lines, plus a few changes to the documentation and to the spectral clustering function that calls it, so the extra maintenance effort is tiny. It may become the new default instead of kmeans, since it produces better quality partitions at a similar memory footprint and compute time.
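To make the proposal concrete, here is a minimal sketch of what such a function could look like. It assumes a real-valued n-by-k embedding matrix (one row per sample, one column per eigenvector). The function name cluster_qr and the toy data below are illustrative assumptions, not part of scikit-learn at the time of writing; the sketch follows the pivoted-QR idea from the clusterQR repository but is not a copy of the authors' code.

```python
import numpy as np
from scipy.linalg import qr, svd


def cluster_qr(vectors):
    """Assign cluster labels from an (n_samples, k) spectral embedding.

    Hypothetical helper: the name and signature are assumptions.
    """
    k = vectors.shape[1]
    # Pivoted QR on the transposed embedding selects k representative samples.
    _, _, piv = qr(vectors.T, pivoting=True)
    # Orthogonal (polar) factor of the k-by-k block of selected samples.
    u, _, vh = svd(vectors[piv[:k], :].T)
    # Rotate the embedding and take the dominant coordinate as the label.
    scores = np.abs(vectors @ (u @ vh))
    return scores.argmax(axis=1)


# Toy check: an ideal embedding of 3 clusters of 10 samples each,
# hidden behind an arbitrary rotation.
rng = np.random.default_rng(0)
truth = np.repeat(np.arange(3), 10)
indicators = np.eye(3)[truth] / np.sqrt(10)
rotation, _ = np.linalg.qr(rng.standard_normal((3, 3)))
labels = cluster_qr(indicators @ rotation)
```

The returned labels match the true partition only up to a permutation of cluster ids, as with 'kmeans' and 'discretize'. In spectral_clustering this would presumably be exposed as a third assign_labels option, which is a design suggestion rather than existing behavior.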
Steps/Code to Reproduce
N/A
Expected Results
clusterQR available
Actual Results
clusterQR not available
Versions
the most recent