tomodrgn analyze#

Purpose#

Run standard analyses of a model trained with train_vae: dimensionality reduction and clustering of the latent space, followed by generation of volumes from the latent clusters and latent PCA via the tomoDRGN decoder for further analysis.

Sample usage#

The examples below are adapted from tomodrgn/testing/commandtest*.py and rely on outputs of other commands in commandtest.py to execute successfully.

# Warp v1 style inputs
tomodrgn \
    analyze \
    output/vae_both_sim_zdim8_dosetiltweightmask_batchsize8 \
    --ksample 20

# WarpTools style inputs
tomodrgn \
    analyze \
    output/vae_warptools_70S_zdim8_dosetiltweightmask_batchsize8 \
    --ksample 20

Arguments#

usage: analyze [-h] [--epoch EPOCH] [--device DEVICE] [-o OUTDIR] [--skip-vol]
               [--skip-umap] [--plot-format {png,svgz}] [--pc PC]
               [--pc-ondata] [--ksample KSAMPLE] [--downsample DOWNSAMPLE]
               [--lowpass LOWPASS] [--flip] [--invert]
               workdir

Positional Arguments#

workdir: Directory with tomoDRGN results

Named Arguments#

--epoch

Epoch number N to analyze (0-based indexing, corresponding to z.N.pkl and weights.N.pkl). Supplying 'latest' auto-detects the latest completed epoch of training.

Default: 'latest'
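
To make the epoch numbering concrete, the sketch below scans a workdir for the highest N that has both weights.N.pkl and z.N.pkl. This is only an illustration of the file naming described above: the function name latest_epoch is hypothetical, the rule that an epoch counts as completed when both files exist is an assumption, and tomoDRGN's own detection logic may differ.

```shell
# Hypothetical helper, not part of tomoDRGN: print the highest epoch N
# for which both weights.N.pkl and z.N.pkl exist in the given workdir.
latest_epoch() {
    workdir=$1
    latest=-1
    for f in "$workdir"/weights.*.pkl; do
        [ -e "$f" ] || continue              # glob matched nothing
        n=${f##*/weights.}                   # drop path and "weights." prefix
        n=${n%.pkl}                          # drop ".pkl" suffix
        case $n in
            ''|*[!0-9]*) continue ;;         # ignore non-numeric names
        esac
        [ -e "$workdir/z.$n.pkl" ] || continue   # assumed completeness check
        if [ "$n" -gt "$latest" ]; then
            latest=$n
        fi
    done
    if [ "$latest" -ge 0 ]; then
        echo "$latest"
    else
        return 1                             # no usable epoch found
    fi
}

# Example: resolve the epoch explicitly before calling analyze.
# epoch=$(latest_epoch output/vae_both_sim_zdim8_dosetiltweightmask_batchsize8)
# tomodrgn analyze output/vae_both_sim_zdim8_dosetiltweightmask_batchsize8 --epoch "$epoch"
```

Resolving the number yourself can be useful when you want to record exactly which epoch was analyzed, rather than relying on whatever 'latest' resolves to at run time.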

Core arguments#

--device

Optionally specify CUDA device

-o, --outdir

Output directory for analysis results (default: [workdir]/analyze.[epoch])

--skip-vol

Skip generation of volumes

Default: False

--skip-umap

Skip running UMAP

Default: False

--plot-format

Possible choices: png, svgz

File format with which to save plots

Default: 'png'

Arguments for latent space analysis#

--pc

Number of principal component traversals to generate

Default: 2

--pc-ondata

Find closest on-data latent point to each PC percentile

Default: False

--ksample

Number of k-means samples to generate

Default: 20

Arguments for volume generation#

--downsample

Downsample volumes to this box size (pixels)

--lowpass

Lowpass filter to this resolution in Å

--flip

Flip handedness of output volumes

Default: False

--invert

Invert contrast of output volumes

Default: False
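
The flags documented above can be combined in a single call. The sketch below wraps one such combination in a hypothetical preview_analyze function that only prints the command, so it can be inspected before running. All values (3 PC traversals, a 64 pixel box, a 20 Å lowpass) are illustrative assumptions rather than recommendations, and the example assumes the model was trained at a box size larger than 64 pixels.

```shell
# Hypothetical dry-run helper: prints a tomodrgn analyze command that
# combines latent space and volume generation flags. Values are examples only.
preview_analyze() {
    workdir=$1
    echo tomodrgn analyze "$workdir" \
        --epoch latest \
        --pc 3 \
        --ksample 20 \
        --skip-umap \
        --downsample 64 \
        --lowpass 20 \
        --flip
}

# Print the command for the Warp v1 style example workdir.
preview_analyze output/vae_both_sim_zdim8_dosetiltweightmask_batchsize8
```

Once the printed command looks right, it can be copied into the terminal or the echo removed so that the function runs tomodrgn directly.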

Common next steps#

Interactively explore correlations among, and the spatial context of, star file parameters, latent embeddings, and volume space dimensionality reduction in the generated Jupyter notebooks
Identify one (or more) sets of particle indices whose particles share a common feature (e.g. in latent space)
Filter the input star file by particle indices with tomodrgn filter_star
Generate an array of numeric labels describing a latent space property for each particle to color volumes in tomogram mapbacks with tomodrgn subtomo2chimerax
