tomodrgn.analysis

Functions for analysis of particle metadata: index, pose, ctf, latent embedding, label, tomogram spatial context, etc.

Functions

cluster_gmm

Cluster latent embeddings using a K-component full covariance Gaussian mixture model.

cluster_kmeans

Cluster latent embeddings using k-means clustering.

combine_ind

Combine multiple index selections by either intersection or union.
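To illustrate what combining selections by intersection or union means, here is a minimal dependency-free sketch. The helper name `combine_selections` and its `kind` parameter are hypothetical and are not the signature of `combine_ind`; consult the function's own documentation for its actual arguments.

```python
def combine_selections(selections, kind="intersection"):
    """Combine several iterables of particle indices into one sorted list.

    Hypothetical helper for illustration only; not the tomodrgn API.
    """
    if kind not in ("intersection", "union"):
        raise ValueError("kind must be 'intersection' or 'union'")
    sets = [set(s) for s in selections]
    if not sets:
        return []
    if kind == "intersection":
        combined = set.intersection(*sets)  # indices present in every selection
    else:
        combined = set.union(*sets)  # indices present in at least one selection
    return sorted(combined)


sel_a = [0, 2, 4, 6]
sel_b = [2, 3, 4, 5]
both = combine_selections([sel_a, sel_b], kind="intersection")
either = combine_selections([sel_a, sel_b], kind="union")
```

Intersection keeps only particles chosen by every selection, while union keeps particles chosen by any of them.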

convert_angstroms_to_voxels

Rescale dataframe coordinates from angstroms to unitless voxels corresponding to reconstructed tomograms.
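The rescaling described above amounts to dividing each coordinate by the voxel size of the reconstructed tomogram. The sketch below assumes an isotropic voxel size given in angstroms per voxel; the function name and the `angstroms_per_voxel` parameter are hypothetical, and plain tuples stand in for dataframe columns.

```python
def angstroms_to_voxels(coords_angstrom, angstroms_per_voxel):
    """Rescale (x, y, z) coordinates from angstroms to unitless voxels.

    Hypothetical helper for illustration only; not the tomodrgn API.
    """
    if angstroms_per_voxel <= 0:
        raise ValueError("angstroms_per_voxel must be positive")
    return [
        tuple(value / angstroms_per_voxel for value in xyz)
        for xyz in coords_angstrom
    ]


coords = [(100.0, 250.0, 50.0), (0.0, 10.0, 20.0)]
voxels = angstroms_to_voxels(coords, angstroms_per_voxel=10.0)
```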

convert_original_indices

Convert selected indices relative to a filtered particle stack into indices relative to the unfiltered particle stack.
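The idea behind this conversion can be sketched as a lookup: if the filtered stack was built by keeping a known list of original indices, then position `i` in the filtered stack corresponds to the `i`-th kept original index. The names below are hypothetical and do not reflect the signature of `convert_original_indices`.

```python
def to_original_indices(selected_filtered, kept_original):
    """Map indices into a filtered stack back to indices into the unfiltered stack.

    selected_filtered: positions within the filtered stack.
    kept_original: original indices that survived filtering, in filtered-stack order.
    Hypothetical helper for illustration only; not the tomodrgn API.
    """
    return [kept_original[i] for i in selected_filtered]


kept_original = [1, 4, 5, 9]   # particles 1, 4, 5, 9 of the unfiltered stack were kept
selected_filtered = [0, 2, 3]  # selection made against the filtered stack
selected_original = to_original_indices(selected_filtered, kept_original)
```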

get_colors_chimerax

Sample num_colors colors from the ChimeraX color scheme as RGBA tuples normalized to [0, 1].

get_colors_matplotlib

Sample num_colors colors from the specified color map as RGBA tuples.

get_ind_for_cluster

Get the indices of particles belonging to the selected clusters.
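As a minimal sketch of this kind of selection, the snippet below returns the positions of all particles whose cluster label is in a chosen set. The function name and arguments are hypothetical and are not the signature of `get_ind_for_cluster`.

```python
def indices_for_clusters(labels, selected_clusters):
    """Return positions of particles whose label is one of selected_clusters.

    Hypothetical helper for illustration only; not the tomodrgn API.
    """
    wanted = set(selected_clusters)
    return [i for i, label in enumerate(labels) if label in wanted]


labels = [0, 1, 2, 1, 0, 2]  # one cluster label per particle
selected = indices_for_clusters(labels, selected_clusters=[1, 2])
```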

get_nearest_point

Find the closest point in data to query.
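A nearest-point lookup can be sketched as follows. This example assumes Euclidean distance, which the summary above does not specify, and uses a hypothetical helper name rather than the actual signature of `get_nearest_point`.

```python
import math


def nearest_point(data, query):
    """Return (index, point) of the entry in data closest to query.

    Assumes Euclidean distance; hypothetical helper, not the tomodrgn API.
    """
    if not data:
        raise ValueError("data must contain at least one point")
    best_index = min(range(len(data)), key=lambda i: math.dist(data[i], query))
    return best_index, data[best_index]


data = [(0.0, 0.0), (1.0, 1.0), (3.0, 4.0)]
index, point = nearest_point(data, query=(0.9, 1.2))
```

Returning the index alongside the point makes it easy to look up the matching particle in other per-particle arrays.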

get_pc_traj

Sample latent embeddings along the specified principal component dim at coordinates in PC-space specified by sampling_points.

ipy_plot_interactive

Create and display an interactive plotly scatter plot and associated ipywidgets custom widgets, allowing exploration of numeric columns of a pandas dataframe.

ipy_tomo_ptcl_viewer

An interactive tomogram and particle viewer using plotly and ipywidgets.

load_dataframe

Merge known types of numpy arrays into a single pandas dataframe for downstream analysis.

parse_all_losses

Parse MSE, KLD, and total loss at each epoch from run.log output.

parse_loss

Parse total loss at each epoch from run.log output.

plot_by_cluster

Plot all points x,y with colors per class labels, with optional annotations for each cluster center and corresponding label.

plot_by_cluster_subplot

Plot all points x,y with colors per class labels on individual subplots for each of labels_sel.

plot_euler

Plot the distribution of Euler angles as a hexbin of theta and phi, and a histogram of psi.

plot_label_count_distribution

Plot the distribution of class labels per tomogram or micrograph as a heatmap.

plot_losses

Plot the total loss, reconstruction loss, and KL divergence per epoch.

plot_projections

Plot a stack of grayscale images.

plot_three_column_correlation

Plot two reference vectors (e.g. l-UMAP1 and l-UMAP2) for potential correlation with a third query vector (e.g. CoordinateX, DefocusU, etc.).

plot_translations

Plot the distribution of x-axis shifts versus y-axis shifts (units: px).

recursive_load_dataframe

Create merged dataframe containing:

run_pca

Run principal component analysis on the latent embeddings.

run_tsne

Run t-SNE dimensionality reduction on latent embeddings.

run_umap

Run UMAP dimensionality reduction on latent embeddings.

scatter_annotate

Create a scatter plot with optional annotations for each cluster center and corresponding label.

scatter_annotate_hex

Create a hexbin plot with optional annotations for each cluster center and corresponding label.

scatter_color

Create a scatter plot colored by auto-mapped values of c according to specified cmap, and plot a corresponding colorbar.