The protein holography package implements efficient rotationally-equivariant encoding of protein structure and minimal rotationally-equivariant processing of protein microenvironements via H-CNN.
Create the protein holography conda environment by running
conda env create -f env.yml
to install the necessary dependencies. Then run
pip install .
to install the protein_holography
package.
If you're going to make edits to the protein_holography
package, run
pip install -e .
so you can test your changes.
The pdb preprocessing module filters pdbs by criteria such as imaging type (e.g., X-ray crystallography, cryo-EM, etc.), resolution, date of deposition, or any other metadata deposited with the structure.
The coordinate module features all preprocessing of pdb files and ultimately results in the holograms that are used in the H-CNN. Specifically, pdb files are processed in three steps.
First, pdb files are read into pyrosetta where hydrogen atom positions are inferred, partial charges are assigned, and solvent-accessible surface area (SASA) is calculated on a per atom basis.
Second, neighborhoods of a fixed radius are extracted from each structure.
Third, each neighborhood is projected into Fourier space via the 3D zernike polynomials.
The hnn class is a fully fourier neural network coded in tensorflow and operates on fully complex inputs.