tomodrgn analyze#
Purpose#
Run standard analyses of a train_vae model: dimensionality reduction and clustering of the latent space, and generation of volumes from latent clustering and latent PCA via the tomoDRGN decoder for further analysis.
Sample usage#
The examples below are adapted from tomodrgn/testing/commandtest*.py and rely on other outputs from commandtest.py to execute successfully.
# Warp v1 style inputs
tomodrgn \
analyze \
output/vae_both_sim_zdim8_dosetiltweightmask_batchsize8 \
--ksample 20
# WarpTools style inputs
tomodrgn \
analyze \
output/vae_warptools_70S_zdim8_dosetiltweightmask_batchsize8 \
--ksample 20
Arguments#
usage: analyze [-h] [--epoch EPOCH] [--device DEVICE] [-o OUTDIR] [--skip-vol]
[--skip-umap] [--plot-format {png,svgz}] [--pc PC]
[--pc-ondata] [--ksample KSAMPLE] [--downsample DOWNSAMPLE]
[--lowpass LOWPASS] [--flip] [--invert]
workdir
Positional Arguments#
- workdir
Directory with tomoDRGN results
Named Arguments#
- --epoch
Epoch number N to analyze (0-based indexing, corresponding to z.N.pkl and weights.N.pkl). Supplying 'latest' auto-detects the latest completed epoch of training.
Default:
'latest'
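To see which epoch number a given training directory would offer, the z.N.pkl and weights.N.pkl naming described above can be inspected directly. The function below is a minimal sketch and not tomoDRGN's own detection logic: the name latest_epoch is hypothetical, and it assumes an epoch counts as completed when both files exist for the same N.

```shell
# Hypothetical helper: print the highest N for which both weights.N.pkl and
# z.N.pkl exist in the given directory (prints an empty line if none do).
latest_epoch() {
    dir=$1
    best=""
    for f in "$dir"/weights.*.pkl; do
        [ -e "$f" ] || continue              # glob matched nothing
        n=${f##*/weights.}                   # strip the path and "weights." prefix
        n=${n%.pkl}                          # strip the ".pkl" suffix
        case $n in
            ''|*[!0-9]*) continue ;;         # keep purely numeric epochs only
        esac
        [ -e "$dir/z.$n.pkl" ] || continue   # require the matching latent file
        if [ -z "$best" ] || [ "$n" -gt "$best" ]; then
            best=$n
        fi
    done
    printf '%s\n' "$best"
}
```

Calling latest_epoch with a training directory prints the number that could then be passed explicitly to --epoch, which is useful when the analysis should be pinned to a fixed epoch rather than to whatever is latest at run time.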
Core arguments#
- --device
Optionally specify CUDA device
- -o, --outdir
Output directory for analysis results (default: [workdir]/analyze.[epoch])
- --skip-vol
Skip generation of volumes
Default:
False
- --skip-umap
Skip running UMAP
Default:
False
- --plot-format
Possible choices: png, svgz
File format with which to save plots
Default:
'png'
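As an illustration of how the core arguments combine, the sketch below runs a quick latent-only pass: it skips volume generation, saves plots as svgz, and writes to an explicitly named directory that mirrors the documented default of [workdir]/analyze.[epoch]. The training directory and epoch number are hypothetical placeholders, and the command is only executed when tomodrgn is installed and the directory exists.

```shell
workdir="output/my_train_vae_run"      # hypothetical training directory
epoch=29                               # hypothetical epoch number (0-based)
outdir="${workdir}/analyze.${epoch}"   # mirrors the documented default location

echo "results will be written to: ${outdir}"

# Run only if tomodrgn is on PATH and the training directory is present.
if command -v tomodrgn >/dev/null 2>&1 && [ -d "$workdir" ]; then
    tomodrgn analyze "$workdir" \
        --epoch "$epoch" \
        -o "$outdir" \
        --skip-vol \
        --plot-format svgz
fi
```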
Arguments for latent space analysis#
- --pc
Number of principal component traversals to generate
Default:
2
- --pc-ondata
Find closest on-data latent point to each PC percentile
Default:
False
- --ksample
Number of kmeans samples to generate
Default:
20
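One way to judge how many k-means samples a dataset needs is to repeat the analysis with several --ksample values, each written to its own output directory. The loop below is a sketch under stated assumptions: the training directory, epoch number, the values 20, 50, and 100, the choice of --pc 4, and the directory naming scheme are all illustrative rather than recommended settings.

```shell
workdir="output/my_train_vae_run"   # hypothetical training directory
epoch=29                            # hypothetical epoch number (0-based)
outdirs=""

for k in 20 50 100; do
    # Hypothetical naming scheme that keeps each run's results separate.
    outdir="${workdir}/analyze.${epoch}.ksample${k}"
    outdirs="${outdirs}${outdirs:+ }${outdir}"
    echo "ksample=${k} -> ${outdir}"

    # Run only if tomodrgn is on PATH and the training directory is present.
    if command -v tomodrgn >/dev/null 2>&1 && [ -d "$workdir" ]; then
        tomodrgn analyze "$workdir" \
            --epoch "$epoch" \
            -o "$outdir" \
            --pc 4 \
            --pc-ondata \
            --ksample "$k"
    fi
done
```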
Arguments for volume generation#
- --downsample
Downsample volumes to this box size (pixels)
- --lowpass
Lowpass filter to this resolution in Å
- --flip
Flip handedness of output volumes
Default:
False
- --invert
Invert contrast of output volumes
Default:
False
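When choosing --downsample and --lowpass together, it can help to check what resolution the downsampled volumes can represent at all. The arithmetic below is a sketch that assumes downsampling preserves the field of view (as Fourier cropping does), so the pixel size grows by the ratio of the original to the new box size; this page does not state how tomoDRGN downsamples, and the pixel size and box sizes used are hypothetical.

```shell
# All values are hypothetical; substitute those of your own dataset.
apix=3.0        # pixel size of the training particles, in A/px
box=128         # box size used during training, in px
downsample=64   # value intended for --downsample, in px

# Same field of view in fewer pixels: pixel size scales by box/downsample.
new_apix=$(awk -v a="$apix" -v b="$box" -v d="$downsample" \
    'BEGIN { printf "%.2f", a * b / d }')

# Nyquist limit: the finest representable period spans two pixels.
nyquist=$(awk -v p="$new_apix" 'BEGIN { printf "%.2f", 2 * p }')

echo "pixel size after downsampling: ${new_apix} A/px"
echo "finest representable resolution: ${nyquist} A"
```

With these example numbers, a --lowpass value finer than the printed Nyquist limit could not be represented in the downsampled volumes, so a coarser value would be the sensible choice.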
Common next steps#
- Interactively explore the correlations between, and the spatial context of, star file parameters, latent embeddings, and volume-space dimensionality reduction in the generated Jupyter notebooks
- Identify one (or more) sets of particle indices whose particles share a common feature (e.g. in latent space)
- Filter the input star file by particle indices with tomodrgn filter_star
- Generate an array of numeric labels describing a latent space property for each particle, to color volumes in tomogram mapbacks with tomodrgn subtomo2chimerax