Half-synthetic cryoEM micrograph generator that projects pdbs>mrcs onto existing buffer cryoEM micrographs

alexjnoble/VirtualIce

VirtualIce: Half-synthetic CryoEM Micrograph Generator

VirtualIce is a feature-rich half-synthetic cryoEM micrograph generator that uses real buffer cryoEM micrographs, with junk and carbon masked out, as background. It projects PDB, EMDB, or local structures onto these micrographs and simulates realistic imaging conditions by adding noise and dose damage and by applying the CTF to particles. It outputs particle coordinates after masking out junk and, if requested, the particles themselves.

Release Notes

v2.0.0 - September 27, 2024

Features

  • Various bug fixes and version release.

v2.0.0beta - September 19, 2024

  • Multiple structures per micrograph can now be requested (structure sets).
  • Use the same --structures flag followed by either a single structure or multiple. Supports any number of structure sets, like this:
    • virtualice.py -s 1TIM [1PMA, 50882] [my_structure1.mrc, 3DRE, 6TIM]
  • The above command will make one set of micrographs with only PDB 1TIM, another set with PDB 1PMA and EMD-50882, and another set with a local file (my_structure1.mrc), PDB 3DRE, and PDB 6TIM.
  • Preferred orientation, particle distributions, and overlapping & aggregated particles are fully supported.
  • Filtering of edge, overlapping, and obscured particles is fully supported.
  • Coordinate files are saved independently in .star, .mod, and/or .coord files (one per structure in a structure set).
  • This update is significant because it allows for ground-truth datasets of heterogeneous proteins - e.g. continuous or discrete conformations, compositional heterogeneity, or completely different proteins.
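The bracket syntax for structure sets can be illustrated with a small parser. This is a minimal sketch only, not VirtualIce's own argument handling: the function name `parse_structure_sets` is hypothetical, and it assumes the shell splits the command on whitespace so that `[1PMA, 50882]` arrives as the tokens `[1PMA,` and `50882]`.

```python
import shlex


def parse_structure_sets(tokens):
    """Group tokens into structure sets (hypothetical helper, not VirtualIce's parser).

    A bare token becomes a single-structure set. Tokens between '[' and ']'
    (commas optional) are grouped into one multi-structure set.
    """
    sets, current = [], None
    for token in tokens:
        name = token.strip("[],")
        if token.startswith("["):
            if current is not None:
                raise ValueError("nested '[' in structure list")
            current = []
        if token.endswith("]") and current is None:
            raise ValueError("']' without a matching '['")
        if name:
            if current is None:
                sets.append([name])
            else:
                current.append(name)
        if token.endswith("]"):
            sets.append(current)
            current = None
    if current is not None:
        raise ValueError("unclosed '[' in structure list")
    return sets


# Tokens as a POSIX shell would pass them after the -s flag.
tokens = shlex.split("1TIM [1PMA, 50882] [my_structure1.mrc, 3DRE, 6TIM]")
print(parse_structure_sets(tokens))
```

Each inner list corresponds to one set of micrographs, matching the three sets described above.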

v1.0.1 - September 19, 2024

  • Last release of VirtualIce for single-structure micrographs. Contains minor printout updates compared to v1.0.0.

v1.0.0 - September 10, 2024

  • Generates half-synthetic cryoEM micrographs and particles from buffer images and PDB IDs, EMDB IDs, or local files.
  • Creates coordinate files (.star, .mod, .coord), not including particles obscured by junk/substrate or too close to the edge.
  • Adds Poisson noise and dose-dependent damage to simulated frames and Gaussian noise to particles.
  • Applies the Contrast Transfer Function (CTF) to simulate microscope optics.
  • Control over overlapping particles and particle aggregation.
  • Outputs micrographs in MRC, PNG, and JPEG formats, and optionally cropped particles as MRCs.
  • Multi-core and GPU processing.
  • Extensive customization options including particle distribution, ice thickness, microscope parameters, and downsampling.
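To make the CTF step concrete, here is a minimal NumPy sketch of a textbook contrast transfer function applied to a 2D image in Fourier space. It is not VirtualIce's implementation: the function names, the default parameters (300 kV, Cs 2.7 mm, 10% amplitude contrast), and the sign convention (positive defocus meaning underfocus) are all assumptions chosen for illustration.

```python
import numpy as np


def electron_wavelength_angstrom(voltage_kv):
    """Relativistic electron wavelength in Angstroms for a voltage in kV."""
    volts = voltage_kv * 1e3
    return 12.2643247 / np.sqrt(volts + 0.978466e-6 * volts**2)


def ctf_2d(shape, pixel_size, defocus_um, voltage_kv=300.0, cs_mm=2.7,
           amp_contrast=0.1):
    """Textbook CTF on the FFT frequency grid (illustrative defaults)."""
    wavelength = electron_wavelength_angstrom(voltage_kv)
    defocus = defocus_um * 1e4  # micrometers -> Angstroms
    cs = cs_mm * 1e7            # millimeters -> Angstroms
    ky = np.fft.fftfreq(shape[0], d=pixel_size)  # cycles per Angstrom
    kx = np.fft.fftfreq(shape[1], d=pixel_size)
    k2 = ky[:, None] ** 2 + kx[None, :] ** 2
    # Phase aberration from defocus and spherical aberration.
    chi = np.pi * wavelength * defocus * k2 - 0.5 * np.pi * cs * wavelength**3 * k2**2
    return -(np.sqrt(1.0 - amp_contrast**2) * np.sin(chi)
             + amp_contrast * np.cos(chi))


def apply_ctf(image, pixel_size, defocus_um, **kwargs):
    """Multiply the image's Fourier transform by the CTF and transform back."""
    ctf = ctf_2d(image.shape, pixel_size, defocus_um, **kwargs)
    return np.fft.ifft2(np.fft.fft2(image) * ctf).real


rng = np.random.default_rng(0)
projection = rng.random((64, 64))  # stand-in for a projected particle
filtered = apply_ctf(projection, pixel_size=1.0, defocus_um=1.5)
print(filtered.shape)
```

Under this convention the zero-frequency term is scaled by the negative amplitude contrast, so the mean of the filtered image flips sign and shrinks relative to the input.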

Requirements and Installation

VirtualIce requires Python 3, EMAN2, and IMOD, plus several Python dependencies that can be installed using pip:

pip install cupy gpustat mrcfile numpy opencv-python pandas scipy SimpleITK

To use VirtualIce, clone the GitHub repository, make virtualice.py executable (chmod +x virtualice.py), download the ice_images/ directory from EMPIAR-12287 into VirtualIce/ice_images/, and ensure virtualice.py is on your PATH.
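A quick way to confirm the installation is to check that each dependency can be found. The sketch below is an illustrative helper, not part of VirtualIce. It maps the pip package names above to their import names (opencv-python is imported as cv2) and looks for the IMOD viewer 3dmod on the PATH. Any EMAN2 executables you want to verify can be passed to `missing_executables` in the same way.

```python
import importlib.util
import shutil

# pip package name -> Python import name
PIP_TO_IMPORT = {
    "cupy": "cupy",
    "gpustat": "gpustat",
    "mrcfile": "mrcfile",
    "numpy": "numpy",
    "opencv-python": "cv2",
    "pandas": "pandas",
    "scipy": "scipy",
    "SimpleITK": "SimpleITK",
}


def missing_python_packages(mapping=PIP_TO_IMPORT):
    """Return pip names whose import module cannot be located."""
    return [pip_name for pip_name, module in mapping.items()
            if importlib.util.find_spec(module) is None]


def missing_executables(names=("3dmod",)):
    """Return command names that are not found on the PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    print("Missing Python packages:", missing_python_packages())
    print("Missing executables:", missing_executables())
```

Both functions only locate the packages and commands. They do not import or run them, so the check is safe on machines without a GPU.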

Examples

Example 1 (top-left: PDB 1TIM, 53 kDa, top-right: [1DAT, 442 kDa, 7ZP8, 1385 kDa, and 2HCO, 65 kDa], bottom-left: 1RUZ, 165 kDa, bottom-right: 1PMA, 686 kDa):

Example 1

Example 2 (T20S Proteasome; PDB 1PMA, 686 kDa):

Example 2

Example 3 (Triose Phosphate Isomerase; PDB 1TIM, 53 kDa):

Example 3

Example 4 (TRPV5; EMD 0594, 306 kDa):

Example 4

User Guide

The script can be run from the command line and takes a number of arguments.

Basic example usage:

virtualice.py -s 1TIM -n 10

Generates -n 10 random micrographs of PDB -s 1TIM.

Arguments:

  • -s, --structures: Specify PDB ID(s), EMDB ID(s), local files, and/or 'r' for random PDB/EMDB structures.
  • -n, --num_images: Number of micrographs to generate.

Basic example usage:

virtualice.py -s [1TIM, 11638] 1PMA -n 50

Generates -n 50 random micrographs for the structure set consisting of -s PDB 1TIM and EMDB-11638 (multi-structure micrographs), and -n 50 random micrographs of -s PDB 1PMA (single-structure micrographs).

Advanced example usage:

virtualice.py -s 1TIM r my_structure.mrc 11638 -n 3 -I -P -J -Q 90 -b 4 -D n -ps 2

Generates -n 3 random micrographs of PDB -s 1TIM, a random EMDB/PDB structure, a local structure called my_structure.mrc, and EMD-11638. Outputs an -I IMOD .mod coordinate file, -P png, and -J jpeg (quality -Q 90) for each micrograph, and bins -b all images by 4. Uses a -D non-random distribution of particles and -ps parallelizes micrograph generation across 2 CPUs.

Arguments:

  • -I, --imod_coordinate_file: Also output one IMOD .mod coordinate file per micrograph.
  • -P, --png: Output in PNG format.
  • -J, --jpeg: Output in JPEG format.
  • -Q, --jpeg-quality: JPEG image quality.
  • -b, --binning: Bin micrographs by downsampling.
  • -D, --distribution: Distribution type for generating particle locations.
  • -ps, --parallellize_structures: Parallel processes for micrograph generation across the structures requested.
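As an illustration of what binning by downsampling means, the sketch below averages non-overlapping blocks of pixels with NumPy. It is an assumption-laden example, not VirtualIce's binning code, which may use a different method. The function name `bin_image` is hypothetical.

```python
import numpy as np


def bin_image(image, factor):
    """Downsample a 2D image by averaging factor x factor pixel blocks.

    Rows and columns that do not fill a complete block are cropped.
    """
    height, width = image.shape
    height -= height % factor
    width -= width % factor
    cropped = image[:height, :width]
    blocks = cropped.reshape(height // factor, factor, width // factor, factor)
    return blocks.mean(axis=(1, 3))


image = np.arange(16, dtype=float).reshape(4, 4)
print(bin_image(image, 2))
# [[ 2.5  4.5]
#  [10.5 12.5]]
```

With block averaging like this, the effective pixel size grows by the binning factor, and particle coordinates must be divided by the same factor to stay aligned with the binned image.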

Advanced example usage:

virtualice.py -s 1PMA -n 5 -om preferred -pw 0.9 -pa [*,90,0] [90 180 *] -aa l h r -ne --use_cpu -V 2 -3

Generates -n 5 random micrographs of PDB -s 1PMA (proteasome) with -om preferred orientation for -pw 90% (0.9) of particles. The preferred orientations are chosen at random from the -pa sets [*,90,0] (free to rotate about the first Z axis, then rotate 90 degrees about Y, with no rotation about the final Z) and [90 180 *] (rotate 90 degrees about the first Z axis, then 180 degrees about Y, then free to rotate about the resulting Z). The -aa aggregation amount is chosen randomly (r) from the low (l) and high (h) values for each of the 5 micrographs. With -ne, edge particles are not included. With --use_cpu, only CPUs are used (no GPUs). Terminal -V verbosity is set to 2 (verbose). The resulting micrographs are opened with -3 3dmod after generation.

Arguments:

  • -om, --orientation_mode: Orientation mode for projections.
  • -pw, --preferred_weight: Weight of the preferred orientations in the range [0, 1].
  • -pa, --preferred_angles: List of sets of three Euler angles (in degrees) for preferred orientations.
  • -aa, --aggregation_amount: Amount of particle aggregation.
  • -ne, --no_edge_particles: Prevent particles from being placed up to the edge of the micrograph.
  • --use_cpu: Use CPU for processing instead of GPU.
  • -V, --verbosity: Set verbosity level.
  • -3, --view_in_3dmod: View generated micrographs in 3dmod at the end of the run.
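The interplay between the preferred weight and the wildcard angles can be sketched as follows. This is a hypothetical illustration, not VirtualIce's sampling code. It assumes ZYZ Euler angles in degrees, that `*` is replaced by a uniformly random angle, and that non-preferred particles get a uniformly random orientation.

```python
import math
import random


def random_euler(rng):
    """Uniformly random orientation as ZYZ Euler angles in degrees."""
    phi = rng.uniform(0.0, 360.0)
    theta = math.degrees(math.acos(rng.uniform(-1.0, 1.0)))
    psi = rng.uniform(0.0, 360.0)
    return (phi, theta, psi)


def sample_orientation(preferred_sets, weight, rng):
    """Pick a preferred angle set with probability `weight`, else go random.

    Each preferred set holds three entries. "*" means the angle is free.
    """
    if preferred_sets and rng.random() < weight:
        chosen = rng.choice(preferred_sets)
        return tuple(rng.uniform(0.0, 360.0) if angle == "*" else float(angle)
                     for angle in chosen)
    return random_euler(rng)


rng = random.Random(0)
preferred = [("*", 90, 0), (90, 180, "*")]
for _ in range(3):
    print(sample_orientation(preferred, 0.9, rng))
```

With a weight of 0.9, roughly nine in ten sampled orientations come from one of the two preferred sets, and the rest are spread over all orientations.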

Additional arguments exist for fine-tuning the generation process including ice thickness, junk filtering, and CTF parameters.

Ethical Use Agreement

VirtualIce is under the MIT License, allowing broad usage freedom, but it should be used responsibly and ethically via these guidelines:

Intended Use

VirtualIce is designed for educational and research purposes, specifically to aid in the development, testing, and validation of cryoEM image analysis algorithms by generating half-synthetic cryoEM micrographs and particles.

Ethical Considerations

  • Transparency: Any data generated using VirtualIce should be clearly marked as synthetic when published or shared to distinguish it from real experimental data.
  • No Misrepresentation: Users should not present synthetic data generated by VirtualIce as real data from physical experiments in any publication or presentation; the synthetic origin of the data should always be stated explicitly.
  • Research Integrity: Users are encouraged to uphold the highest standards of scientific integrity in their work, ensuring that the use of synthetic data does not mislead, deceive, or otherwise harm the scientific community or the public.

Issues and Support

If you encounter any problems or have any questions about the script, please submit an issue.

Contributions

Contributions are welcome! Please open a Pull Request or Issue.

Reference

For more details about the VirtualIce algorithm and its applications, see and reference the associated manuscript: https://doi.org/10.1101/2024.09.28.615520

Author

This script was written by Alex J. Noble with assistance from OpenAI's GPT, Anthropic's Claude, and Google's Gemini models, 2023-2024 at SEMC.

License

This project is licensed under the MIT License - see the LICENSE file for details. The Ethical Use Agreement is compatible with the MIT License, providing guidelines and recommendations for responsible use without legally restricting software use.
