@@ -683,41 +683,43 @@ class AgglomerativeClustering(ClusterMixin, BaseEstimator):

     Parameters
     ----------
-    n_clusters : int or None, optional (default=2)
+    n_clusters : int or None, default=2
         The number of clusters to find. It must be ``None`` if
         ``distance_threshold`` is not ``None``.

-    affinity : string or callable, default: "euclidean"
+    affinity : str or callable, default='euclidean'
         Metric used to compute the linkage. Can be "euclidean", "l1", "l2",
         "manhattan", "cosine", or "precomputed".
         If linkage is "ward", only "euclidean" is accepted.
         If "precomputed", a distance matrix (instead of a similarity matrix)
         is needed as input for the fit method.

-    memory : None, str or object with the joblib.Memory interface, optional
+    memory : str or object with the joblib.Memory interface, default=None
         Used to cache the output of the computation of the tree.
         By default, no caching is done. If a string is given, it is the
         path to the caching directory.

-    connectivity : array-like or callable, optional
+    connectivity : array-like or callable, default=None
         Connectivity matrix. Defines for each sample the neighboring
         samples following a given structure of the data.
         This can be a connectivity matrix itself or a callable that transforms
         the data into a connectivity matrix, such as derived from
         kneighbors_graph. Default is None, i.e, the
         hierarchical clustering algorithm is unstructured.

-    compute_full_tree : bool or 'auto' (optional)
-        Stop early the construction of the tree at n_clusters. This is
-        useful to decrease computation time if the number of clusters is
-        not small compared to the number of samples. This option is
-        useful only when specifying a connectivity matrix. Note also that
-        when varying the number of clusters and using caching, it may
-        be advantageous to compute the full tree. It must be ``True`` if
-        ``distance_threshold`` is not ``None``.
-
-    linkage : {"ward", "complete", "average", "single"}, optional \
-            (default="ward")
+    compute_full_tree : 'auto' or bool, default='auto'
+        Stop early the construction of the tree at n_clusters. This is useful
+        to decrease computation time if the number of clusters is not small
+        compared to the number of samples. This option is useful only when
+        specifying a connectivity matrix. Note also that when varying the
+        number of clusters and using caching, it may be advantageous to compute
+        the full tree. It must be ``True`` if ``distance_threshold`` is not
+        ``None``. By default `compute_full_tree` is "auto", which is equivalent
+        to `True` when `distance_threshold` is not `None` or that `n_clusters`
+        is inferior to 100 or `0.02 * n_samples`. Otherwise, "auto" is
+        equivalent to `False`.
+
+    linkage : {"ward", "complete", "average", "single"}, default="ward"
         Which linkage criterion to use. The linkage criterion determines which
         distance to use between sets of observation. The algorithm will merge
         the pairs of cluster that minimize this criterion.
@@ -730,7 +732,7 @@ class AgglomerativeClustering(ClusterMixin, BaseEstimator):
         - single uses the minimum of the distances between all observations
           of the two sets.

-    distance_threshold : float, optional (default=None)
+    distance_threshold : float, default=None
         The linkage distance threshold above which, clusters will not be
         merged. If not ``None``, ``n_clusters`` must be ``None`` and
         ``compute_full_tree`` must be ``True``.
@@ -744,7 +746,7 @@ class AgglomerativeClustering(ClusterMixin, BaseEstimator):
         ``distance_threshold=None``, it will be equal to the given
         ``n_clusters``.

-    labels_ : array [n_samples]
+    labels_ : ndarray of shape (n_samples)
         cluster labels for each point

     n_leaves_ : int
@@ -753,7 +755,7 @@ class AgglomerativeClustering(ClusterMixin, BaseEstimator):
     n_connected_components_ : int
         The estimated number of connected components in the graph.

-    children_ : array-like, shape (n_samples-1, 2)
+    children_ : array-like of shape (n_samples-1, 2)
         The children of each non-leaf node. Values less than `n_samples`
         correspond to leaves of the tree which are the original samples.
         A node `i` greater than or equal to `n_samples` is a non-leaf
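
Reviewer note on `children_`: the indexing convention in the last hunk can be illustrated with a small traversal. This is only a sketch. It assumes the scikit-learn convention that a node `i >= n_samples` has its two children stored in `children_[i - n_samples]` (that part of the docstring lies outside the hunk shown). The helper name `leaves_under` and the hand-written `children` list are hypothetical and stand in for a fitted estimator's `children_` attribute.

```python
def leaves_under(node, children, n_samples):
    """Return the sorted original-sample indices found beneath `node`.

    Assumption: a node i >= n_samples is a merge whose two children are
    stored at children[i - n_samples]; values < n_samples are leaves.
    """
    stack = [node]
    leaves = []
    while stack:
        current = stack.pop()
        if current < n_samples:
            # Values below n_samples are original samples (leaves).
            leaves.append(current)
        else:
            # Non-leaf node: look up the pair that was merged to create it.
            left, right = children[current - n_samples]
            stack.extend((left, right))
    return sorted(leaves)


# Illustrative merge tree for 4 samples (made-up data, not library output):
# node 4 merges samples 0 and 1, node 5 merges 2 and 3, node 6 merges 4 and 5.
children = [(0, 1), (2, 3), (4, 5)]
n_samples = 4

print(leaves_under(4, children, n_samples))  # [0, 1]
print(leaves_under(6, children, n_samples))  # [0, 1, 2, 3]
```

An explicit stack is used instead of recursion so that deep, chain-like trees do not hit Python's recursion limit.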
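
Reviewer note on `compute_full_tree`: the new "auto" wording ("inferior to 100 or `0.02 * n_samples`") can be read in more than one way, so a small sketch of one reading may help. It assumes the phrase means "below the larger of 100 and `0.02 * n_samples`", which is an interpretation and not something the diff itself states. The helper `resolve_compute_full_tree` is hypothetical and is not part of the scikit-learn API.

```python
def resolve_compute_full_tree(compute_full_tree, n_clusters,
                              distance_threshold, n_samples):
    """Hypothetical helper: resolve 'auto' to a bool per the docstring text."""
    if compute_full_tree != "auto":
        # An explicit True/False is taken as given.
        return bool(compute_full_tree)
    if distance_threshold is not None:
        # The docstring requires the full tree when a threshold is set.
        return True
    # Assumption: "inferior to 100 or 0.02 * n_samples" is read as
    # "smaller than the maximum of the two values".
    return n_clusters < max(100, 0.02 * n_samples)


# Few clusters relative to the data: the full tree is built.
print(resolve_compute_full_tree("auto", 2, None, 1000))
# Many clusters on a small data set: early stopping applies.
print(resolve_compute_full_tree("auto", 150, None, 1000))
# A distance threshold always forces the full tree.
print(resolve_compute_full_tree("auto", None, 1.5, 50))
```

If the intended reading is the one sketched here, stating "the maximum between 100 and `0.02 * n_samples`" in the docstring would remove the ambiguity.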