tomodrgn analyze_volumes
Purpose
Run standard volume-space analyses of a train_vae model: dimensionality reduction and clustering of a volume ensemble.
Sample usage
The examples below are adapted from tomodrgn/testing/commandtest*.py, and rely on other outputs from commandtest.py to execute successfully.
# Warp v1 style inputs
tomodrgn \
analyze_volumes \
--voldir output/vae_both_sim_zdim2/eval_vol_allz \
--config output/vae_both_sim_zdim2/config.pkl \
--outdir output/vae_both_sim_zdim2/eval_vol_allz_analyze_volumes_mask_soft \
--ksample 20 \
--mask soft
# WarpTools style inputs
tomodrgn \
analyze_volumes \
--voldir output/vae_warptools_70S_zdim2/eval_vol_allz \
--config output/vae_warptools_70S_zdim2/config.pkl \
--outdir output/vae_warptools_70S_zdim2/eval_vol_allz_analyze_volumes_mask_soft \
--ksample 20 \
--mask soft
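When the same analysis has to be repeated over several training runs, it can help to assemble the command in a script. The sketch below is a hypothetical helper (`build_analyze_volumes_cmd` is not part of tomoDRGN) that emits only flags shown in the examples above and leaves out any optional flag that is not set.

```python
def build_analyze_volumes_cmd(voldir, config, outdir=None, ksample=None, mask=None):
    """Assemble an argument list for `tomodrgn analyze_volumes`.

    Hypothetical helper: it only emits flags that appear in the sample
    usage, and skips optional ones that are left as None.
    """
    cmd = ["tomodrgn", "analyze_volumes",
           "--voldir", str(voldir),
           "--config", str(config)]
    if outdir is not None:
        cmd += ["--outdir", str(outdir)]
    if ksample is not None:
        cmd += ["--ksample", str(ksample)]
    if mask is not None:
        cmd += ["--mask", str(mask)]
    return cmd


run_dir = "output/vae_both_sim_zdim2"
cmd = build_analyze_volumes_cmd(
    voldir=f"{run_dir}/eval_vol_allz",
    config=f"{run_dir}/config.pkl",
    outdir=f"{run_dir}/eval_vol_allz_analyze_volumes_mask_soft",
    ksample=20,
    mask="soft",
)
# To launch it: import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems if a path contains spaces.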
Arguments
usage: analyze_volumes [-h] --voldir VOLDIR --config CONFIG [--outdir OUTDIR]
[--num-pcs NUM_PCS] [--ksample KSAMPLE]
[--plot-format {png,svgz}] [--mask-path MASK_PATH]
[--mask {none,sphere,tight,soft}] [--thresh THRESH]
[--dilate DILATE] [--dist DIST]
Core arguments
- --voldir
path to directory containing volumes to analyze
- --config
path to train_vae config file
- --outdir
path to directory to save outputs. Default is the same directory and basename as voldir, with analyze_volumes appended
- --num-pcs
keep this many PCs when saving PCA and running UMAP
Default:
128
- --ksample
Number of kmeans samples to generate (clustering voxel-PCA space). Note that this is only recommended if all particles in the dataset have had volumes generated in --voldir, to avoid confusion over whether a given k-means result originated from latent space clustering or volume space clustering.
- --plot-format
Possible choices: png, svgz
File format with which to save plots
Default:
'png'
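As a rough illustration of what --ksample describes, the sketch below runs k-means on a toy stand-in for voxel-PCA coordinates and then picks the data point nearest each centroid as that cluster's representative. This is an assumption about the general approach, not tomoDRGN's implementation: the function names, the deterministic initialisation, and the toy coordinates are all hypothetical.

```python
import math


def kmeans(points, k, n_iter=20):
    """Plain Lloyd's algorithm.

    Centroids start at the first k points, an arbitrary but deterministic
    choice made only to keep this sketch reproducible.
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: label each point with its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(axis) / len(members) for axis in zip(*members)]
    return centroids, labels


def representatives(points, centroids):
    """Index of the real data point closest to each centroid."""
    return [min(range(len(points)), key=lambda i: math.dist(points[i], c))
            for c in centroids]


# Toy stand-in for eight volumes projected onto two principal components.
pcs = [(0.0, 0.0), (0.3, 0.0), (0.0, 0.3), (0.1, 0.1),
       (5.0, 5.0), (5.4, 5.0), (5.0, 5.4), (5.1, 5.1)]
centroids, labels = kmeans(pcs, k=2)
reps = representatives(pcs, centroids)
```

Choosing the nearest real data point, rather than the centroid itself, means every representative corresponds to a volume that actually exists in the ensemble.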
Mask generation arguments
- --mask-path
Supply a custom real space mask instead of having tomoDRGN calculate a mask.
- --mask
Possible choices: none, sphere, tight, soft
Type of real space mask to generate for each volume when calculating voxel-PCA. Note that tight and soft masks are calculated uniquely per-volume.
- --thresh
Isosurface percentile at which to threshold volume; default is to use 99th percentile. Only relevant for tight and soft masks.
- --dilate
Number of voxels to dilate thresholded isosurface outwards from mask boundary; default is to use 1/30th of box size (px). Only relevant for soft mask.
- --dist
Number of voxels over which to apply a soft cosine falling edge from dilated mask boundary; default is to use 1/30th of box size (px). Only relevant for soft mask.
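To make the roles of --dilate and --dist concrete, the sketch below builds a one-dimensional soft mask from an already-thresholded binary core. It assumes a standard raised-cosine falloff; the exact edge profile tomoDRGN applies is not stated here and may differ. `soft_weight` and `soft_mask_1d` are hypothetical names, and a real mask would be computed in 3D.

```python
import math


def soft_weight(d, dilate, dist):
    """Mask value at distance d (in voxels) from the thresholded core.

    1 inside the dilated region, 0 beyond the soft edge, and a raised
    cosine in between (an assumed profile, chosen for illustration).
    """
    if d <= dilate:
        return 1.0
    if d >= dilate + dist:
        return 0.0
    return 0.5 * (1.0 + math.cos(math.pi * (d - dilate) / dist))


def soft_mask_1d(core, dilate, dist):
    """Soft mask for a 1D binary core; the core must contain at least one 1."""
    inside = [i for i, v in enumerate(core) if v]
    return [soft_weight(min(abs(i - j) for j in inside), dilate, dist)
            for i in range(len(core))]


# Two thresholded voxels in the middle of a 14-voxel line.
core = [0] * 6 + [1, 1] + [0] * 6
mask = soft_mask_1d(core, dilate=2, dist=3)
```

In this toy case the mask stays at 1 for two voxels on either side of the core, then falls through 0.75 and 0.25 before reaching 0. A larger --dist would spread that falloff over more voxels.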
Common next steps
- Interactively explore correlations between, and the spatial context of, star file parameters, latent embeddings, and volume space dimensionality reduction in the tomodrgn analyze Jupyter notebooks
- Identify one (or more) sets of particle indices whose particles share a common feature (e.g. in volume space)
- Filter the input star file by particle indices with tomodrgn filter_star
- Generate an array of numeric labels describing a volume space property for each particle to color volumes in tomogram mapbacks with tomodrgn subtomo2chimerax
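Selecting particle indices that share a feature can be sketched as follows, assuming a list of per-particle cluster labels in the same order as the particles in the star file. The labels and the helper name are hypothetical, and how the resulting indices or labels are passed to tomodrgn filter_star or tomodrgn subtomo2chimerax is not covered here; consult each command's own help.

```python
def indices_with_label(labels, wanted):
    """Return particle indices (ascending) whose label is in `wanted`."""
    wanted = set(wanted)
    return [i for i, lab in enumerate(labels) if lab in wanted]


# Hypothetical volume-space cluster labels, one per particle.
labels = [0, 2, 1, 2, 0, 2, 1]

# Particles assigned to cluster 2.
keep = indices_with_label(labels, wanted={2})
```

The `labels` list itself is an example of an array of numeric labels describing a volume space property for each particle.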