VirtualIce is a feature-rich half-synthetic cryoEM micrograph generator that uses buffer cryoEM micrographs, with junk and carbon masked out, as real background. It projects PDB, EMDB, or local structures onto these buffer micrographs and simulates realistic imaging conditions by adding noise, applying dose damage, and applying the CTF to particles. It outputs particle coordinates after masking out junk and, if requested, the cropped particles themselves.
- Various bug fixes and version release.
v2.0.0beta - September 19, 2024
- Multiple structures per micrograph can now be requested (structure sets).
- Use the same --structures flag followed by either a single structure or multiple. Supports any number of structure sets, like this:
- virtualice.py -s 1TIM [1PMA, 50882] [my_structure1.mrc, 3DRE, 6TIM]
- The above command will make one set of micrographs with only PDB 1TIM, another set with PDB 1PMA and EMD-50882, and another set with a local file (my_structure1.mrc), PDB 3DRE, and PDB 6TIM.
- Preferred orientation, particle distributions, and overlapping & aggregated particles are fully supported.
- Filtering of edge, overlapping, and obscured particles is fully supported.
- Coordinate files are saved independently in .star, .mod, and/or .coord files (one per structure in a structure set).
- This update is significant because it allows for ground-truth datasets of heterogeneous proteins - e.g. continuous or discrete conformations, compositional heterogeneity, or completely different proteins.
v1.0.1 - September 19, 2024
- Last release of VirtualIce for single-structure micrographs. Contains minor printout updates compared to v1.0.0.
v1.0.0 - September 10, 2024
- Generates half-synthetic cryoEM micrographs and particles from buffer images and PDB IDs, EMDB IDs, or local files.
- Creates coordinate files (.star, .mod, .coord), not including particles obscured by junk/substrate or too close to the edge.
- Adds Poisson noise and dose-dependent damage to simulated frames and Gaussian noise to particles.
- Applies the Contrast Transfer Function (CTF) to simulate microscope optics.
- Control over overlapping particles and particle aggregation.
- Outputs micrographs in MRC, PNG, and JPEG formats, and optionally cropped particles as MRCs.
- Multi-core and GPU processing.
- Extensive customization options including particle distribution, ice thickness, microscope parameters, and downsampling.
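The CTF step listed above can be illustrated with a minimal one-dimensional sketch. This is not VirtualIce's implementation: the function names, the default values (300 kV, 2.7 mm spherical aberration, 7% amplitude contrast), and the sign convention (positive defocus meaning underfocus) are assumptions chosen for illustration.

```python
import math

def electron_wavelength_angstrom(voltage_kv):
    """Relativistic electron wavelength in angstroms for a voltage in kV."""
    volts = voltage_kv * 1e3
    return 12.2643 / math.sqrt(volts + 0.97845e-6 * volts ** 2)

def ctf_1d(k, defocus_um=2.0, voltage_kv=300.0, cs_mm=2.7, amp_contrast=0.07):
    """CTF value at spatial frequency k (1/angstrom).

    Assumed convention: positive defocus is underfocus.
    """
    lam = electron_wavelength_angstrom(voltage_kv)
    defocus = defocus_um * 1e4  # micrometers -> angstroms
    cs = cs_mm * 1e7            # millimeters -> angstroms
    # Phase shift from defocus and spherical aberration.
    chi = math.pi * lam * defocus * k ** 2 - 0.5 * math.pi * cs * lam ** 3 * k ** 4
    phase_term = math.sqrt(1.0 - amp_contrast ** 2) * math.sin(chi)
    amplitude_term = amp_contrast * math.cos(chi)
    return -(phase_term + amplitude_term)

if __name__ == "__main__":
    # Print the CTF at a few spatial frequencies up to 0.25 1/angstrom.
    for i in range(6):
        k = 0.05 * i
        print(f"k = {k:.2f} 1/A  CTF = {ctf_1d(k):+.3f}")
```

In two dimensions, the same function would be evaluated on the radial frequency of each Fourier pixel and multiplied into the Fourier transform of the projected particle.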
VirtualIce requires Python 3, EMAN2, IMOD, and several dependencies, which can be installed using pip:
pip install cupy gpustat mrcfile numpy opencv-python pandas scipy SimpleITK
To use VirtualIce, clone the GitHub repository, make virtualice.py executable (chmod +x virtualice.py), download the ice_images/ directory from EMPIAR-12287 to the VirtualIce/ice_images/ directory, and ensure virtualice.py is in your environment for use.
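Before a first run, it can help to confirm that the dependencies are visible from the current environment. The following is a minimal sketch, not part of VirtualIce: the helper names are hypothetical, and e2proc3d.py is used only as a probe for an EMAN2 installation (an assumption about which EMAN2 program to look for).

```python
import importlib.util
import shutil

# Import names for the pip packages listed above (opencv-python imports as cv2).
PYTHON_MODULES = ["cupy", "gpustat", "mrcfile", "numpy", "cv2",
                  "pandas", "scipy", "SimpleITK"]
# 3dmod ships with IMOD; e2proc3d.py is an EMAN2 program used here as a probe.
EXECUTABLES = ["3dmod", "e2proc3d.py", "virtualice.py"]

def missing_modules(names):
    """Return the module names that cannot be found for import."""
    return [n for n in names if importlib.util.find_spec(n) is None]

def missing_executables(names):
    """Return the program names that are not found on PATH."""
    return [n for n in names if shutil.which(n) is None]

if __name__ == "__main__":
    print("Missing Python modules:", missing_modules(PYTHON_MODULES) or "none")
    print("Missing executables:", missing_executables(EXECUTABLES) or "none")
```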
Example 1 (top-left: PDB 1TIM, 53 kDa, top-right: [1DAT, 442 kDa, 7ZP8, 1385 kDa, and 2HCO, 65 kDa], bottom-left: 1RUZ, 165 kDa, bottom-right: 1PMA, 686 kDa):
The script can be run from the command line and takes a number of arguments.
virtualice.py -s 1TIM -n 10
Generates 10 (-n) random micrographs of PDB 1TIM (-s).
Arguments:
- -s, --structures: Specify PDB ID(s), EMDB ID(s), local files, and/or 'r' for random PDB/EMDB structures.
- -n, --num_images: Number of micrographs to generate.
virtualice.py -s [1TIM, 11638] 1PMA -n 50
Generates 50 (-n) random micrographs for the structure set consisting of PDB 1TIM and EMD-11638 (multi-structure micrographs), and 50 random micrographs of PDB 1PMA (single-structure micrographs).
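The bracket syntax above groups structures into structure sets, while a bare entry forms a set of its own. The sketch below shows one way such command-line tokens could be grouped. It is an illustration only: parse_structure_sets is a hypothetical helper, not the parser used in virtualice.py.

```python
import re

def parse_structure_sets(tokens):
    """Group command-line tokens into structure sets.

    Bracketed groups such as "[1TIM, 11638]" become one multi-structure set;
    each bare token becomes a single-structure set.
    """
    text = " ".join(tokens)
    sets = []
    # Match either a whole bracketed group or one bare token.
    for bracketed, bare in re.findall(r"\[([^\]]*)\]|([^\s,\[\]]+)", text):
        if bare:
            sets.append([bare])
        else:
            members = [m for m in re.split(r"[,\s]+", bracketed) if m]
            if members:
                sets.append(members)
    return sets

if __name__ == "__main__":
    # Tokens as a shell would typically split "-s [1TIM, 11638] 1PMA".
    print(parse_structure_sets(["[1TIM,", "11638]", "1PMA"]))
```

Joining the tokens before matching means the grouping does not depend on how the shell split the brackets and commas.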
virtualice.py -s 1TIM r my_structure.mrc 11638 -n 3 -I -P -J -Q 90 -b 4 -D n -ps 2
Generates 3 (-n) random micrographs of PDB 1TIM, a random EMDB/PDB structure, a local structure called my_structure.mrc, and EMD-11638 (-s). Outputs an IMOD .mod coordinate file (-I), a PNG (-P), and a JPEG (-J) at quality 90 (-Q) for each micrograph, and bins all images by 4 (-b). Uses a non-random distribution of particles (-D n) and parallelizes micrograph generation across 2 CPUs (-ps).
Arguments:
- -I, --imod_coordinate_file: Also output one IMOD .mod coordinate file per micrograph.
- -P, --png: Output in PNG format.
- -J, --jpeg: Output in JPEG format.
- -Q, --jpeg-quality: JPEG image quality.
- -b, --binning: Bin micrographs by downsampling.
- -D, --distribution: Distribution type for generating particle locations.
- -ps, --parallellize_structures: Parallel processes for micrograph generation across the structures requested.
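The -b option above bins micrographs by downsampling. One common way to bin is block averaging, sketched below on a plain 2D list. Whether VirtualIce bins this way is not stated here, so treat this as an illustration: bin_image is a hypothetical helper, and a real pipeline would typically operate on NumPy arrays instead.

```python
def bin_image(image, factor):
    """Downsample a 2D list of pixel values by averaging factor x factor blocks."""
    # Trim trailing rows/columns so both dimensions divide evenly by the factor.
    rows = len(image) - len(image) % factor
    cols = len(image[0]) - len(image[0]) % factor
    binned = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            block = [image[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            out_row.append(sum(block) / len(block))
        binned.append(out_row)
    return binned

if __name__ == "__main__":
    image = [[1, 2, 3, 4],
             [5, 6, 7, 8],
             [9, 10, 11, 12],
             [13, 14, 15, 16]]
    print(bin_image(image, 2))  # [[3.5, 5.5], [11.5, 13.5]]
```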
virtualice.py -s 1PMA -n 5 -om preferred -pw 0.9 -pa [*,90,0] [90 180 *] -aa l h r -ne --use_cpu -V 2 -3
Generates 5 (-n) random micrographs of PDB 1PMA (proteasome) (-s) with preferred orientation (-om preferred) for 90% of particles (-pw 0.9). The preferred orientations (-pa) are selected randomly from [*,90,0] (free to rotate about the first Z axis, then rotate 90 degrees about Y, with no rotation about the resulting Z) and [90 180 *] (rotate 90 degrees about the first Z axis, then rotate 180 degrees about Y, then free to rotate about the resulting Z). The aggregation amount (-aa) is chosen randomly (r) from low (l) and high (h) values for each of the 5 micrographs. Edge particles are not included (-ne). Only CPUs are used, not GPUs (--use_cpu). Terminal verbosity is set to 2, which is verbose (-V). The resulting micrographs are opened with 3dmod after generation (-3).
Arguments:
- -om, --orientation_mode: Orientation mode for projections.
- -pw, --preferred_weight: Weight of the preferred orientations in the range [0, 1].
- -pa, --preferred_angles: List of sets of three Euler angles (in degrees) for preferred orientations.
- -aa, --aggregation_amount: Amount of particle aggregation.
- -ne, --no_edge_particles: Prevent particles from being placed up to the edge of the micrograph.
- --use_cpu: Use CPU for processing instead of GPU.
- -V, --verbosity: Set verbosity level.
- -3, --view_in_3dmod: View generated micrographs in 3dmod at the end of the run.
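The interaction between the preferred weight and the preferred angle sets described above can be sketched as follows. This is a minimal illustration under stated assumptions, not VirtualIce's code: the function names are hypothetical, angles are treated as ZYZ Euler angles in degrees, and a wildcard is filled with the corresponding angle of a uniformly random orientation.

```python
import math
import random

def random_orientation(rng=random):
    """Uniformly random orientation as ZYZ Euler angles in degrees."""
    phi = rng.uniform(0.0, 360.0)
    # Sampling cos(theta) uniformly avoids oversampling the poles.
    theta = math.degrees(math.acos(rng.uniform(-1.0, 1.0)))
    psi = rng.uniform(0.0, 360.0)
    return (phi, theta, psi)

def sample_orientation(preferred_angles, preferred_weight, rng=random):
    """Pick a preferred orientation with probability preferred_weight.

    Each preferred set holds three entries; "*" marks a freely rotating angle.
    Otherwise (or if no sets are given) return a fully random orientation.
    """
    if preferred_angles and rng.random() < preferred_weight:
        template = rng.choice(preferred_angles)
        fallback = random_orientation(rng)
        return tuple(fallback[i] if a == "*" else float(a)
                     for i, a in enumerate(template))
    return random_orientation(rng)

if __name__ == "__main__":
    rng = random.Random(0)  # seeded for repeatability
    preferred = [("*", 90, 0), (90, 180, "*")]
    for _ in range(5):
        print(sample_orientation(preferred, 0.9, rng))
```

With a weight of 0.9, roughly nine in ten particles would take one of the two preferred sets, and the rest would be oriented at random.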
Additional arguments exist for fine-tuning the generation process including ice thickness, junk filtering, and CTF parameters.
VirtualIce is released under the MIT License, which allows broad freedom of use, but it should be used responsibly and ethically in accordance with these guidelines:
VirtualIce is designed for educational and research purposes, specifically to aid in the development, testing, and validation of cryoEM image analysis algorithms by generating half-synthetic cryoEM micrographs and particles.
- Transparency: Any data generated using VirtualIce should be clearly marked as synthetic when published or shared to distinguish it from real experimental data.
- No Misrepresentation: Users should not present synthetic data generated by VirtualIce as real data from physical experiments in any publications or presentations unless explicitly stated.
- Research Integrity: Users are encouraged to uphold the highest standards of scientific integrity in their work, ensuring that the use of synthetic data does not mislead, deceive, or otherwise harm the scientific community or the public.
If you encounter any problems or have any questions about the script, please submit an Issue.
Contributions are welcome! Please open a Pull Request or Issue.
For more details about the VirtualIce algorithm and its applications, see and reference the associated manuscript: https://doi.org/10.1101/2024.09.28.615520
This script was written by Alex J. Noble with assistance from OpenAI's GPT, Anthropic's Claude, and Google's Gemini models, 2023-2024 at SEMC.
This project is licensed under the MIT License - see the LICENSE file for details. The Ethical Use Agreement is compatible with the MIT License, providing guidelines and recommendations for responsible use without legally restricting software use.