Parameter | Description | Default |
---|---|---|
affinity | Metric used to compute the linkage. Can be "euclidean", "l1", "l2", "manhattan", "cosine", or "precomputed". If linkage is "ward", only "euclidean" is accepted | default: "euclidean" |
compute_full_tree | Stop construction of the tree early at n_clusters. This is useful to decrease computation time if the number of clusters is not small compared to the number of features. This option is useful only when specifying a connectivity matrix. Note also that when varying the number of clusters and using caching, it may be advantageous to compute the full tree | default: "auto" |
connectivity | Connectivity matrix. Defines for each feature the neighboring features following a given structure of the data. This can be a connectivity matrix itself or a callable that transforms the data into a connectivity matrix, such as one derived from kneighbors_graph. Default is None, i.e., the hierarchical clustering algorithm is unstructured | default: null |
linkage | Which linkage criterion to use: one of "ward", "complete", or "average" (optional; the library default is "ward"). The linkage criterion determines which distance to use between sets of features. The algorithm will merge the pairs of clusters that minimize this criterion: "ward" minimizes the variance of the clusters being merged; "average" uses the average of the distances of each feature of the two sets; "complete" (maximum linkage) uses the maximum distance between all features of the two sets | default: "average" |
memory | Used to cache the output of the computation of the tree. By default, no caching is done. If a string is given, it is the path to the caching directory | default: null |
n_clusters | The number of clusters to find | default: 2 |
pooling_func | This combines the values of agglomerated features into a single value, and should accept an array of shape [M, N] and the keyword argument `axis=1`, and reduce it to an array of size [M]. | default: {"oml-python:serialized_object": "function", "value": "numpy.mean"} |
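
The parameters above can be tried out with a short sketch. It assumes the table documents scikit-learn's `FeatureAgglomeration` (the table itself does not name the estimator) and uses made-up random data. `affinity` is left at its default rather than passed explicitly, since newer scikit-learn releases rename it to `metric`.

```python
import numpy as np
from sklearn.cluster import FeatureAgglomeration

# Hypothetical toy data: 20 samples, 6 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 6))

# Mirror the defaults listed in the table (assumed to map onto
# FeatureAgglomeration's constructor arguments).
agglo = FeatureAgglomeration(
    n_clusters=2,              # number of feature clusters to find
    linkage="average",         # average distance between the two feature sets
    connectivity=None,         # unstructured hierarchical clustering
    compute_full_tree="auto",
    memory=None,               # no caching of the tree
    pooling_func=np.mean,      # reduces each group of features to one value
)

# Each sample keeps one pooled value per feature cluster.
X_reduced = agglo.fit_transform(X)
print(X_reduced.shape)   # (20, 2)
print(agglo.labels_)     # cluster index assigned to each of the 6 features
```

With `pooling_func=np.mean`, every output column is the mean of the input features assigned to that cluster, so the sample count is unchanged and only the feature dimension shrinks.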