tomodrgn eval_images#

Purpose#

Embed particle images into latent space using a pretrained train_vae model (i.e., only the encoder modules are evaluated).

Sample usage#

The examples below are adapted from tomodrgn/testing/commandtest*.py and rely on other outputs from commandtest.py to execute successfully.

# Warp v1 style inputs
tomodrgn \
    eval_images data/10076_classE_32_sim.star \
    --source-software cryosrpnt \
    --weights output/vae_classE_sim_zdim8/weights.pkl \
    -c output/vae_classE_sim_zdim8/config.pkl \
    --out-z output/vae_classE_sim_zdim8/eval_images/z_all.pkl

# WarpTools style inputs
tomodrgn \
    eval_images \
    data/warptools_test_4-tomos_10-ptcls_box-32_angpix-12_optimisation_set.star \
    --weights output/vae_warptools_70S_zdim8/weights.pkl \
    -c output/vae_warptools_70S_zdim8/config.pkl \
    --out-z output/vae_warptools_70S_zdim8/eval_images/z_all.pkl

Arguments#

usage: eval_images [-h] [-w WEIGHTS] -c CONFIG --out-z PKL
                   [--log-interval LOG_INTERVAL] [-b BATCH_SIZE] [-v]
                   [--no-amp]
                   [--source-software {auto,warp,cryosrpnt,nextpyp,cistem,warptools,relion}]
                   [--ind-ptcls PKL] [--ind-imgs IND_IMGS]
                   [--sort-ptcl-imgs {unsorted,dose_ascending,random}]
                   [--use-first-ntilts USE_FIRST_NTILTS]
                   [--use-first-nptcls USE_FIRST_NPTCLS] [--datadir DATADIR]
                   [--lazy] [--uninvert-data] [--num-workers NUM_WORKERS]
                   [--prefetch-factor PREFETCH_FACTOR] [--pin-memory]
                   particles

Positional Arguments#

particles

Input particles (.mrcs, .star, or .txt)

Core arguments#

-w, --weights

Model weights

-c, --config

config.pkl file from train_vae

--out-z

Output pickle for z
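
The help text above does not describe the internal layout of the --out-z pickle. As a minimal sketch, assuming only that the file holds one or more objects pickled back-to-back (for example, latent means followed by log-variances), the hypothetical helper below reads every object it finds. The function name load_all_pickles is illustrative and is not part of tomodrgn.

```python
import pickle


def load_all_pickles(path):
    """Return a list of every object pickled consecutively in `path`.

    Assumption: the file contains one or more pickled objects written
    one after another; the actual tomodrgn layout is not documented here.
    """
    objects = []
    with open(path, "rb") as f:
        while True:
            try:
                objects.append(pickle.load(f))
            except EOFError:
                # No more pickled objects in the file.
                break
    return objects
```

Calling load_all_pickles("output/vae_classE_sim_zdim8/eval_images/z_all.pkl") and inspecting the type and shape of each returned object is a cautious way to confirm what was written before passing it to downstream analysis. Unpickling NumPy arrays requires NumPy to be installed in the same environment.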

--log-interval

Logging interval in N_IMGS

Default: 1000

-b, --batch-size

Minibatch size

Default: 64

-v, --verbose

Increases verbosity

Default: False

--no-amp

Disable use of automatic mixed precision

Default: False

Override configuration values – star file#

--source-software

Possible choices: auto, warp, cryosrpnt, nextpyp, cistem, warptools, relion

Manually set the software used to extract particles. Default is to auto-detect.

Default: 'auto'

--ind-ptcls

Filter the star file by particles (unique rlnGroupName values), using a pickled NumPy array as indices

--ind-imgs

Filter the star file by particle images (star file rows), using a pickled NumPy array as indices
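
Both --ind-ptcls and --ind-imgs expect a pickled NumPy array of indices. The sketch below shows one way such a file might be produced. It assumes zero-based integer indices and a plain pickle.dump of the array; neither detail is stated in the help text, and the helper name save_index_pkl is hypothetical.

```python
import pickle

import numpy as np


def save_index_pkl(indices, path):
    """Pickle a sorted, de-duplicated integer index array to `path`.

    Assumptions (not confirmed by the help text): indices are zero-based
    and the file is a single pickled NumPy array.
    """
    ind = np.unique(np.asarray(indices, dtype=int))
    with open(path, "wb") as f:
        pickle.dump(ind, f)
    return ind
```

For example, save_index_pkl(range(0, 10, 2), "ind_ptcls_keep.pkl") would write the indices 0, 2, 4, 6, 8, and the resulting file could then be supplied as --ind-ptcls ind_ptcls_keep.pkl.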

--sort-ptcl-imgs

Possible choices: unsorted, dose_ascending, random

Sort the star file images on a per-particle basis by the specified criteria

Default: 'unsorted'

--use-first-ntilts

Keep the first use_first_ntilts images of each particle in the sorted star file. Default -1 means to use all. Will drop particles with fewer than this many tilt images.

Default: -1
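
The interaction between --sort-ptcl-imgs and --use-first-ntilts can be illustrated with a small stand-alone sketch. It is not tomodrgn's implementation: it represents each star file row as a hypothetical (particle_name, dose) tuple, sorts each particle's images by ascending dose, keeps the first ntilts of them, and drops particles with fewer than ntilts images, as the description above states.

```python
from collections import defaultdict


def keep_first_ntilts(rows, ntilts=-1):
    """Illustrate per-particle sorting and truncation.

    rows: iterable of (particle_name, dose) tuples, one per image
          (a hypothetical stand-in for star file rows).
    ntilts: number of images to keep per particle; -1 keeps all.
    Returns a dict mapping particle_name to its kept doses.
    """
    by_particle = defaultdict(list)
    for name, dose in rows:
        by_particle[name].append(dose)

    kept = {}
    for name, doses in by_particle.items():
        doses = sorted(doses)  # mimics the dose_ascending sort
        if ntilts == -1:
            kept[name] = doses
        elif len(doses) >= ntilts:
            kept[name] = doses[:ntilts]
        # Particles with fewer than ntilts images are dropped.
    return kept
```

With rows for a particle "p1" at doses 3.0, 1.0, and 2.0 and a particle "p2" at dose 1.0 only, keep_first_ntilts(rows, 2) keeps doses 1.0 and 2.0 for "p1" and drops "p2" entirely.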

--use-first-nptcls

Keep the first use_first_nptcls particles in the sorted star file. Default -1 means to use all.

Default: -1

Override configuration values – data handling#

--datadir

Path prefix to particle stack if loading relative paths from a .star or .cs file

--lazy

Load data lazily if the full dataset is too large to fit in memory

Default: False

--uninvert-data

Do not invert data sign

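Default: True (this appears to be the default of the underlying invert-data setting, so the data sign is inverted unless this flag is passed)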

Dataloader arguments#

--num-workers

Number of workers to use when batching particles for training. Has a moderate impact on epoch time.

Default: 0

--prefetch-factor

Number of particles to prefetch per worker. Has a moderate impact on epoch time.

--pin-memory

Whether to use pinned memory for the dataloader. Has a large impact on epoch time. Recommended.

Default: False

Common next steps#

  • Analyze model at a particular epoch in latent space with tomodrgn analyze

  • Analyze model at a particular epoch in volume space with tomodrgn analyze_volumes

  • Generate volumes for all particles at a particular epoch with tomodrgn eval_vol

  • Map back generated volumes (for all particles) to source tomograms to explore spatially contextualized heterogeneity with tomodrgn subtomo2chimerax