DOC fix docstring of AgglomerativeClustering based on sklearn guideli… · ritchieng/scikit-learn@893a4d4

Commit 893a4d4

vachanda authored and glemaitre committed
DOC fix docstring of AgglomerativeClustering based on sklearn guideline (scikit-learn#15764)
1 parent d5c6c96 commit 893a4d4

1 file changed: +20 -18 lines changed

sklearn/cluster/_hierarchical.py

Lines changed: 20 additions & 18 deletions
@@ -683,41 +683,43 @@ class AgglomerativeClustering(ClusterMixin, BaseEstimator):

     Parameters
     ----------
-    n_clusters : int or None, optional (default=2)
+    n_clusters : int or None, default=2
         The number of clusters to find. It must be ``None`` if
         ``distance_threshold`` is not ``None``.

-    affinity : string or callable, default: "euclidean"
+    affinity : str or callable, default='euclidean'
         Metric used to compute the linkage. Can be "euclidean", "l1", "l2",
         "manhattan", "cosine", or "precomputed".
         If linkage is "ward", only "euclidean" is accepted.
         If "precomputed", a distance matrix (instead of a similarity matrix)
         is needed as input for the fit method.

-    memory : None, str or object with the joblib.Memory interface, optional
+    memory : str or object with the joblib.Memory interface, default=None
         Used to cache the output of the computation of the tree.
         By default, no caching is done. If a string is given, it is the
         path to the caching directory.

-    connectivity : array-like or callable, optional
+    connectivity : array-like or callable, default=None
         Connectivity matrix. Defines for each sample the neighboring
         samples following a given structure of the data.
         This can be a connectivity matrix itself or a callable that transforms
         the data into a connectivity matrix, such as derived from
         kneighbors_graph. Default is None, i.e, the
         hierarchical clustering algorithm is unstructured.

-    compute_full_tree : bool or 'auto' (optional)
-        Stop early the construction of the tree at n_clusters. This is
-        useful to decrease computation time if the number of clusters is
-        not small compared to the number of samples. This option is
-        useful only when specifying a connectivity matrix. Note also that
-        when varying the number of clusters and using caching, it may
-        be advantageous to compute the full tree. It must be ``True`` if
-        ``distance_threshold`` is not ``None``.
-
-    linkage : {"ward", "complete", "average", "single"}, optional \
-            (default="ward")
+    compute_full_tree : 'auto' or bool, default='auto'
+        Stop early the construction of the tree at n_clusters. This is useful
+        to decrease computation time if the number of clusters is not small
+        compared to the number of samples. This option is useful only when
+        specifying a connectivity matrix. Note also that when varying the
+        number of clusters and using caching, it may be advantageous to compute
+        the full tree. It must be ``True`` if ``distance_threshold`` is not
+        ``None``. By default `compute_full_tree` is "auto", which is equivalent
+        to `True` when `distance_threshold` is not `None` or that `n_clusters`
+        is inferior to 100 or `0.02 * n_samples`. Otherwise, "auto" is
+        equivalent to `False`.
+
+    linkage : {"ward", "complete", "average", "single"}, default="ward"
         Which linkage criterion to use. The linkage criterion determines which
         distance to use between sets of observation. The algorithm will merge
         the pairs of cluster that minimize this criterion.
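
Review note: the "auto" rule spelled out in the new `compute_full_tree` text can be sketched in plain Python. This is only an illustration of the docstring, not the library's code. `resolve_compute_full_tree` is a hypothetical helper, and reading "inferior to 100 or `0.02 * n_samples`" as "less than the larger of the two values" is an assumption.

```python
def resolve_compute_full_tree(compute_full_tree, n_clusters, n_samples,
                              distance_threshold=None):
    """Hypothetical helper: resolve 'auto' to True/False per the docstring."""
    # An explicit True/False is used as given.
    if compute_full_tree != "auto":
        return bool(compute_full_tree)
    # A distance threshold always requires the full tree.
    if distance_threshold is not None:
        return True
    # Assumption: "100 or 0.02 * n_samples" means the larger of the two.
    return n_clusters < max(100, 0.02 * n_samples)


if __name__ == "__main__":
    for n_clusters, n_samples in [(2, 1000), (150, 1000), (150, 10000)]:
        full = resolve_compute_full_tree("auto", n_clusters, n_samples)
        print(n_clusters, n_samples, full)
```

Under that reading, early stopping (`False`) only kicks in when the requested number of clusters is large both in absolute terms and relative to the sample count.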
@@ -730,7 +732,7 @@ class AgglomerativeClustering(ClusterMixin, BaseEstimator):
         - single uses the minimum of the distances between all observations
           of the two sets.

-    distance_threshold : float, optional (default=None)
+    distance_threshold : float, default=None
         The linkage distance threshold above which, clusters will not be
         merged. If not ``None``, ``n_clusters`` must be ``None`` and
         ``compute_full_tree`` must be ``True``.
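
Review note: the constraints stated for `distance_threshold` can be written as a small check. This is a sketch under assumptions, not scikit-learn's validation code. `check_threshold_params` is a hypothetical name, and accepting `"auto"` alongside `True` follows the `compute_full_tree` docstring, which says "auto" is equivalent to `True` when a threshold is set.

```python
def check_threshold_params(n_clusters=2, distance_threshold=None,
                           compute_full_tree="auto"):
    """Hypothetical validator for the documented parameter constraints."""
    if distance_threshold is None:
        # The docstring states no constraint for this case.
        return
    if n_clusters is not None:
        raise ValueError(
            "n_clusters must be None when distance_threshold is not None")
    # Assumption: 'auto' is acceptable because it resolves to True here.
    if compute_full_tree not in ("auto", True):
        raise ValueError(
            "compute_full_tree must be True when distance_threshold "
            "is not None")


if __name__ == "__main__":
    check_threshold_params(n_clusters=None, distance_threshold=1.5)
    try:
        check_threshold_params(n_clusters=2, distance_threshold=1.5)
    except ValueError as exc:
        print("rejected:", exc)
```

With the estimator itself, the matching call would be `AgglomerativeClustering(n_clusters=None, distance_threshold=1.5)`, leaving `compute_full_tree` at its default.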
@@ -744,7 +746,7 @@ class AgglomerativeClustering(ClusterMixin, BaseEstimator):
         ``distance_threshold=None``, it will be equal to the given
         ``n_clusters``.

-    labels_ : array [n_samples]
+    labels_ : ndarray of shape (n_samples)
         cluster labels for each point

     n_leaves_ : int
@@ -753,7 +755,7 @@ class AgglomerativeClustering(ClusterMixin, BaseEstimator):
     n_connected_components_ : int
         The estimated number of connected components in the graph.

-    children_ : array-like, shape (n_samples-1, 2)
+    children_ : array-like of shape (n_samples-1, 2)
         The children of each non-leaf node. Values less than `n_samples`
         correspond to leaves of the tree which are the original samples.
         A node `i` greater than or equal to `n_samples` is a non-leaf
