tomodrgn convergence_vae#
Purpose#
Evaluate convergence of a train_vae model by stability of latent space and reconstructed volumes.
Sample usage#
The examples below are adapted from tomodrgn/testing/commandtest*.py, and rely on other outputs from commandtest.py to execute successfully.
# Warp v1 style inputs
tomodrgn \
    convergence_vae \
    output/vae_both_sim_zdim8_dosetiltweightmask_batchsize8 \
    --final-maxima 2 \
    --ground-truth data/10076_class*_32.mrc
# WarpTools style inputs
tomodrgn \
    convergence_vae \
    output/vae_warptools_70S_zdim8_dosetiltweightmask_batchsize8 \
    --final-maxima 2 \
    --ground-truth data/warptools_test_box-32_angpix-12_reconstruct.mrc
Arguments#
usage: convergence_vae [-h] [--epoch EPOCH] [-o OUTDIR]
                       [--epoch-interval EPOCH_INTERVAL]
                       [--plot-format {png,svgz}] [--subset SUBSET]
                       [--random-seed RANDOM_SEED]
                       [--random-state RANDOM_STATE] [--skip-umap]
                       [--n-bins N_BINS] [--smooth SMOOTH]
                       [--smooth-width SMOOTH_WIDTH]
                       [--pruned-maxima PRUNED_MAXIMA] [--radius RADIUS]
                       [--final-maxima FINAL_MAXIMA] [--downsample DOWNSAMPLE]
                       [--lowpass LOWPASS] [--flip] [--invert] [--skip-volgen]
                       [--ground-truth GROUND_TRUTH [GROUND_TRUTH ...]]
                       [--mask {none,sphere,tight,soft}] [--thresh THRESH]
                       [--dilate DILATE] [--dist DIST]
                       workdir
Positional Arguments#
- workdir
- Directory with tomoDRGN results 
Named Arguments#
- --epoch
- Latest epoch number N to analyze for convergence (0-based indexing, corresponding to weights.N.pkl), or “latest” for the last detected epoch. Default: 'latest'
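As an illustration of how “latest” could map onto the weights.N.pkl naming described above, here is a minimal sketch. The helper names `find_latest_epoch` and `resolve_epoch` are hypothetical and are not part of tomoDRGN; the only thing taken from the documentation is the weights.N.pkl file naming and the 0-based epoch numbering.

```python
import re
from pathlib import Path


def find_latest_epoch(workdir):
    """Return the largest N for which a file named weights.N.pkl exists in workdir."""
    pattern = re.compile(r"^weights\.(\d+)\.pkl$")
    epochs = []
    for path in Path(workdir).iterdir():
        match = pattern.match(path.name)
        if match:
            epochs.append(int(match.group(1)))
    if not epochs:
        raise FileNotFoundError(f"no weights.N.pkl files found in {workdir}")
    # Compare numerically so that epoch 10 sorts after epoch 9.
    return max(epochs)


def resolve_epoch(workdir, epoch="latest"):
    """Translate an --epoch style value into a concrete 0-based epoch number."""
    if epoch == "latest":
        return find_latest_epoch(workdir)
    return int(epoch)
```

Comparing the parsed integers rather than the file names avoids the lexicographic trap where weights.9.pkl would sort after weights.10.pkl.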
Core arguments#
- -o, --outdir
- Output directory for convergence analysis results (default: [workdir]/convergence.[epoch]) 
- --epoch-interval
- Interval of epochs between calculating most convergence heuristics. Default: 5
- --plot-format
- File format with which to save plots. Possible choices: png, svgz. Default: 'png'
UMAP calculation arguments#
- --subset
- Max number of particles to be used for UMAP calculations. ‘None’ means use all particles. Default: 50000
- --random-seed
- Manually specify the seed used for selection of subset particles and other numpy operations. Default: 61995
- --random-state
- Random state seed used by UMAP for reproducibility at slight cost of performance (None means slightly faster but non-reproducible). Default: 42
- --skip-umap
- Skip UMAP embedding. Requires that UMAP be precomputed for downstream calculations. Useful for tweaking volume generation settings. Default: False
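The interaction between --subset and --random-seed can be pictured with a short sketch of seeded subsampling. This assumes numpy's `Generator` API; tomoDRGN may seed numpy differently internally, so the indices produced here are not expected to match the tool's own selection. The function name `select_subset` is hypothetical.

```python
import numpy as np


def select_subset(n_particles, subset=50000, random_seed=61995):
    """Pick at most `subset` particle indices reproducibly.

    subset=None (or a subset at least as large as the dataset) keeps every particle.
    """
    if subset is None or subset >= n_particles:
        return np.arange(n_particles)
    rng = np.random.default_rng(random_seed)
    # Sample without replacement so no particle is embedded twice.
    chosen = rng.choice(n_particles, size=subset, replace=False)
    return np.sort(chosen)
```

Fixing the seed means that repeated runs embed the same particles, which keeps UMAP plots comparable across runs.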
Sketching UMAP via local maxima arguments#
- --n-bins
- The number of bins along UMAP1 and UMAP2. Default: 30
- --smooth
- Smooth the 2D histogram before identifying local maxima. Default: True
- --smooth-width
- Width of the Gaussian kernel for smoothing the 2D histogram, expressed as a multiple of one bin’s width. Default: 1.0
- --pruned-maxima
- Prune poorly-separated maxima until this many maxima remain. Default: 12
- --radius
- Distance at which two maxima are considered poorly-separated and are candidates for pruning (Euclidean distance in bin-space). Default: 5.0
- --final-maxima
- Select this many local maxima, sorted by highest bin count after pruning, for which to generate volumes. Default: 10
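Read together, these arguments describe a pipeline: bin the 2D UMAP embedding, optionally smooth the histogram, find local maxima, prune maxima that sit too close together, then keep the most populated ones. The sketch below is one plausible numpy-only reading of that pipeline and is not tomoDRGN's implementation. In particular, the 8-neighbour definition of a local maximum, the "drop the weaker of the closest pair" pruning rule, and every function name are assumptions made for illustration.

```python
import numpy as np


def gaussian_smooth(hist, width):
    """Separable Gaussian blur; `width` is the kernel sigma in units of bins."""
    radius = int(np.ceil(3 * width))
    # Keep the kernel no longer than the histogram so mode="same" preserves shape.
    radius = max(1, min(radius, (min(hist.shape) - 1) // 2))
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (offsets / width) ** 2)
    kernel /= kernel.sum()
    out = np.apply_along_axis(np.convolve, 0, hist, kernel, mode="same")
    return np.apply_along_axis(np.convolve, 1, out, kernel, mode="same")


def local_maxima(grid):
    """Return (row, col) indices of non-empty bins that are >= all 8 neighbours."""
    nx, ny = grid.shape
    padded = np.pad(grid, 1, mode="constant", constant_values=-np.inf)
    is_max = grid > 0
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if dx == 0 and dy == 0:
                continue
            neighbour = padded[1 + dx:1 + dx + nx, 1 + dy:1 + dy + ny]
            is_max &= grid >= neighbour
    return np.argwhere(is_max)


def prune_maxima(coords, heights, pruned_maxima, radius):
    """Assumed rule: repeatedly drop the weaker of the closest pair within `radius`."""
    keep = list(range(len(coords)))
    while len(keep) > pruned_maxima:
        pts = coords[keep].astype(float)
        dists = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
        np.fill_diagonal(dists, np.inf)
        i, j = np.unravel_index(np.argmin(dists), dists.shape)
        if dists[i, j] > radius:
            break  # every remaining pair is well separated
        weaker = i if heights[keep[i]] < heights[keep[j]] else j
        del keep[weaker]
    return keep


def sketch_maxima(embedding, n_bins=30, smooth=True, smooth_width=1.0,
                  pruned_maxima=12, radius=5.0, final_maxima=10):
    """Return bin coordinates of up to `final_maxima` peaks, most populated first."""
    hist, _, _ = np.histogram2d(embedding[:, 0], embedding[:, 1], bins=n_bins)
    grid = gaussian_smooth(hist, smooth_width) if smooth else hist
    coords = local_maxima(grid)
    heights = grid[coords[:, 0], coords[:, 1]]
    keep = prune_maxima(coords, heights, pruned_maxima, radius)
    coords, heights = coords[keep], heights[keep]
    order = np.argsort(heights)[::-1][:final_maxima]
    return coords[order]
```

In this reading, --pruned-maxima and --radius control how aggressively nearby peaks are merged, while --final-maxima only caps how many of the surviving peaks go on to volume generation.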
Volume generation arguments#
- --downsample
- Downsample volumes to this box size (pixels) 
- --lowpass
- Lowpass filter to this resolution in Å 
- --flip
- Flip handedness of output volumes. Default: False
- --invert
- Invert contrast of output volumes. Default: False
- --skip-volgen
- Skip volume generation. Requires that volumes already exist for downstream CC + FSC calculations. Default: False
- --ground-truth
- Relative path containing wildcards to ground_truth_vols*.mrc for map-map CC calculations
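For readers unfamiliar with map-map CC, the following sketch computes a real-space correlation coefficient between two equally sized volumes. It assumes the plain Pearson definition over all voxels; the documentation does not state which CC variant or masking tomoDRGN applies, and the function name `map_map_cc` is hypothetical.

```python
import numpy as np


def map_map_cc(vol_a, vol_b):
    """Pearson correlation between two volumes of identical shape."""
    a = np.asarray(vol_a, dtype=float).ravel()
    b = np.asarray(vol_b, dtype=float).ravel()
    if a.shape != b.shape:
        raise ValueError("volumes must have the same box size")
    # Subtract the mean so the score ignores constant density offsets.
    a = a - a.mean()
    b = b - b.mean()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
```

Under this definition, a volume compared with its contrast-inverted copy scores -1, which is one reason a ground-truth comparison is sensitive to whether --invert is set consistently.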
Mask generation arguments#
- --mask
- Type of mask to generate for each volume when calculating volume-based metrics. Possible choices: none, sphere, tight, soft. Default: 'soft'
- --thresh
- Isosurface percentile at which to threshold volume; default is to use 99th percentile. 
- --dilate
- Number of voxels to dilate thresholded isosurface outwards from mask boundary; default is to use 1/30th of box size (px). 
- --dist
- Number of voxels over which to apply a soft cosine falling edge from dilated mask boundary; default is to use 1/30th of box size (px) 
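To make the stated defaults concrete, here is a small sketch of how they might be derived from the box size, plus one common form of a cosine falling edge. Rounding 1/30th of the box size to the nearest voxel and the half-cosine profile are both assumptions; the documentation gives only "1/30th of box size" and "soft cosine falling edge". Both function names are hypothetical.

```python
import math


def default_mask_params(box_size):
    """Defaults described in the help text; rounding to whole voxels is assumed."""
    n_voxels = max(1, round(box_size / 30))
    return {"thresh_percentile": 99, "dilate": n_voxels, "dist": n_voxels}


def cosine_edge(distance, dist):
    """Mask weight at `distance` voxels outside the dilated mask boundary.

    The weight is 1 at the boundary and falls to 0 at `dist` voxels along a half cosine.
    """
    if distance <= 0:
        return 1.0
    if distance >= dist:
        return 0.0
    return 0.5 * (1.0 + math.cos(math.pi * distance / dist))
```

Under these assumptions, a 128-pixel box would get a 4-voxel dilation and a 4-voxel soft edge, with the mask weight passing through 0.5 two voxels beyond the dilated boundary.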
Common next steps#
- Extend model training with tomodrgn train_vae [...] --load latest if not yet converged
- Analyze model at a particular epoch in latent space with tomodrgn analyze
- Analyze model at a particular epoch in volume space with tomodrgn analyze_volumes