tomodrgn.analysis.cluster_gmm

cluster_gmm(z: ndarray, n_components: int, random_state: int | RandomState | None = None, on_data: bool = True, **kwargs: Any) → tuple[ndarray, ndarray]

Cluster latent embeddings using a K-component (K = n_components), full-covariance Gaussian mixture model.

Parameters:
  • z – array of latent embeddings, shape (nptcls, zdim)

  • n_components – number of components to use in GMM, passed to sklearn.mixture.GaussianMixture

  • random_state – random state for reproducible runs, passed to sklearn.mixture.GaussianMixture

  • on_data – if True, replace each cluster center with the nearest embedding in z, so that every returned center is an actual data point

  • kwargs – additional keyword arguments passed to sklearn.mixture.GaussianMixture
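
The on_data adjustment can be pictured as a nearest-neighbor lookup from each fitted center into z. The following is a minimal sketch, not tomodrgn's implementation: it assumes Euclidean distance, and `snap_to_data` is a hypothetical helper name introduced only for illustration.

```python
import numpy as np


def snap_to_data(z: np.ndarray, centers: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Hypothetical helper: replace each center with the closest row of z.

    Assumes Euclidean distance; the library may use a different routine.
    """
    # Pairwise squared distances, shape (n_centers, nptcls)
    d2 = ((centers[:, None, :] - z[None, :, :]) ** 2).sum(axis=-1)
    # Index of the nearest embedding for each center
    idx = d2.argmin(axis=1)
    return z[idx], idx


# Toy embeddings with nptcls=3, zdim=2
z = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]])
# Off-data centers, as a mixture model's means typically are
centers = np.array([[0.4, 0.4], [3.0, 3.5]])

snapped, idx = snap_to_data(z, centers)
print(idx)      # indices of the rows of z chosen as centers
print(snapped)  # the centers after snapping onto the data
```

Snapping is useful when each center must correspond to a real particle, for example to look up that particle's metadata by index.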

Returns:

array of cluster labels, shape (nptcls,); array of cluster centers, shape (n_components, zdim)
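
To make the return shapes concrete, here is a rough sketch of the documented behavior written directly against scikit-learn rather than by calling tomodrgn. It assumes synthetic embeddings and a Euclidean nearest-point step for the on_data case; variable names such as `centers_on_data` are illustrative only. Per the signature above, the corresponding library call would be `cluster_gmm(z, n_components=2, random_state=0)`.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Two well-separated synthetic blobs standing in for latent embeddings:
# nptcls=200, zdim=8
z = np.vstack([
    rng.normal(-5.0, 0.5, size=(100, 8)),
    rng.normal(5.0, 0.5, size=(100, 8)),
])

# Full-covariance GMM with n_components=2, seeded for reproducibility
gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
labels = gmm.fit_predict(z)   # shape (nptcls,)
centers = gmm.means_          # shape (n_components, zdim)

# Assumed analogue of on_data=True: move each center to its nearest embedding
d2 = ((centers[:, None, :] - z[None, :, :]) ** 2).sum(axis=-1)
centers_on_data = z[d2.argmin(axis=1)]

print(labels.shape, centers.shape, centers_on_data.shape)
```

The cluster label numbering is arbitrary, so downstream code should not assume that label 0 refers to any particular blob.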