new feature: add clusterQR method to 'kmeans' and 'discretize' in spectral clustering · Issue #12164 · scikit-learn/scikit-learn · GitHub
new feature: add clusterQR method to 'kmeans' and 'discretize' in spectral clustering  #12164
Closed
@lobpcg

Description

spectral_clustering (http://scikit-learn.org/stable/modules/generated/sklearn.cluster.spectral_clustering.html) generates cluster labels from the embedding computed as diffusion_map in scikit-learn/sklearn/manifold/spectral_embedding_.py, using one of two methods selected by assign_labels = 'kmeans' or 'discretize'.
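
For reference, a minimal sketch of how the two existing options are selected. The affinity matrix is a made-up toy example (two tightly connected blocks with weak links between them), not data from the references below.

```python
import numpy as np
from sklearn.cluster import spectral_clustering

# Toy affinity matrix (illustrative values): two blocks of 10 points,
# strongly connected inside each block and weakly connected across blocks.
n_per_block = 10
affinity = np.full((2 * n_per_block, 2 * n_per_block), 0.01)
affinity[:n_per_block, :n_per_block] = 1.0
affinity[n_per_block:, n_per_block:] = 1.0

# Run the same embedding through both label-assignment strategies.
labels_by_method = {}
for method in ("kmeans", "discretize"):
    labels_by_method[method] = spectral_clustering(
        affinity, n_clusters=2, assign_labels=method, random_state=0
    )
    print(method, labels_by_method[method])
```

Both calls share the same spectral embedding; only the final step that turns embedding rows into labels differs.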

There is a nice, simple new algorithm called clusterQR, described in https://github.com/asdamle/QR-spectral-clustering, which gives 100% correct results in the experiments reported in https://doi.org/10.1109/HPEC.2017.8091045 or https://arxiv.org/abs/1708.07481. clusterQR costs about the same as or less than 'kmeans' and 'discretize', and may be expected to outperform both when the number of clusters is not small.

I suggest adding clusterQR to the scikit-learn code base. The function itself is under 10 lines, plus a few changes to the documentation and to the spectral clustering function that calls it, so the extra maintenance effort is tiny. It may become the new default instead of kmeans, since it produces better-quality partitions at a similar memory footprint and compute time.
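
To give an idea of the size of the change, here is a rough Python sketch of the clusterQR step, written from my reading of the reference MATLAB code in the repository linked above. The function name cluster_qr, the variable names, and the synthetic embedding are illustrative assumptions, not existing scikit-learn API, and the sketch assumes a real-valued embedding with one column per cluster.

```python
import numpy as np
from scipy.linalg import qr, svd


def cluster_qr(vectors):
    """Assign labels from an (n_samples, k) spectral embedding (sketch)."""
    k = vectors.shape[1]
    # Column-pivoted QR of the transposed embedding: the first k pivots
    # select k representative samples.
    _, _, piv = qr(vectors.T, pivoting=True)
    # Orthogonal polar factor of the selected k x k block, via its SVD.
    u, _, vh = svd(vectors[piv[:k], :].T)
    rotation = u @ vh
    # Each sample is assigned to the rotated coordinate of largest magnitude.
    return np.abs(vectors @ rotation).argmax(axis=1)


# Idealized embedding (made up for illustration): scaled cluster indicator
# vectors, mixed by a random orthogonal matrix.
rng = np.random.default_rng(0)
sizes = [5, 8, 12]
true_labels = np.repeat(np.arange(len(sizes)), sizes)
indicators = np.eye(len(sizes))[true_labels] / np.sqrt(sizes)
q, _ = np.linalg.qr(rng.standard_normal((len(sizes), len(sizes))))
embedding = indicators @ q

labels = cluster_qr(embedding)
print(labels)
```

On this idealized input the recovered labels match the true partition up to a renaming of the clusters. There is no iteration and no random initialization, which is where the cost and robustness argument above comes from.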

Steps/Code to Reproduce

N/A

Expected Results

clusterQR available

Actual Results

clusterQR not available

Versions

the most recent
