HYDRA

This repository is the official implementation of the paper "A Hybrid Diffusion Model for Stable, Affinity-Driven, Receptor-Aware Peptide Generation".

Requirements

HYDRA was developed with PyTorch 1.13.1 on Python 3.8.18; these remain the recommended versions for reproducing the results. A virtual environment manager such as Conda is recommended.

Install dependencies:

conda env create -f environment.yml

Activate the environment:

conda activate HYDRA

Data

The model was trained on the 2020-03-18 release of PepBDB, which can be downloaded with:

wget http://huanglab.phys.hust.edu.cn/pepbdb/db/download/pepbdb-20200318.tgz

Scripts to clean and preprocess the dataset for use with HYDRA are provided in the utils/datasets/ directory and can be run as follows:

python3 utils/datasets/clean_pepbdb.py --source /path/to/pepbdb --dest /path/to/pepbdb --n_atom_thr 200
python3 utils/datasets/process_pepbdb.py --source /path/to/pepbdb --dest /path/to/pepbdb_natoms200_pocket10 --radius 10
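
The --radius flag is not described further in this README. As an illustration only, the sketch below shows one common reading of a pocket radius: keep the receptor atoms that lie within a cutoff distance of any peptide atom. The function name select_pocket_atoms, the coordinate format, and the assumption that the radius is a distance in ångströms between atom centers are hypothetical and are not taken from process_pepbdb.py.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def select_pocket_atoms(
    receptor: Sequence[Coord],
    peptide: Sequence[Coord],
    radius: float = 10.0,
) -> List[int]:
    """Return indices of receptor atoms within `radius` of any peptide atom.

    Hypothetical helper: the actual preprocessing script may define the
    pocket differently (for example per residue instead of per atom).
    """
    pocket = []
    for index, receptor_atom in enumerate(receptor):
        # Keep the atom as soon as one peptide atom is close enough.
        if any(math.dist(receptor_atom, peptide_atom) <= radius
               for peptide_atom in peptide):
            pocket.append(index)
    return pocket


if __name__ == "__main__":
    receptor = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
    peptide = [(1.0, 0.0, 0.0)]
    # The third receptor atom is 29 units from the peptide and is dropped.
    print(select_pocket_atoms(receptor, peptide, radius=10.0))
```

This brute-force loop is quadratic in the number of atoms, which is fine for illustration; a spatial index would be the usual choice for large structures.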

Pre-trained Model Checkpoint

You can download the model weights used in the paper here for inference.

Configuration

Configuration management for HYDRA is done through multiple .yml files located in the configs/ directory.

train.yml specifies the parameters for training the diffusion model, along with the path to the processed dataset.
sample.yml specifies the parameters for sampling, including the number of peptides to sample for each target receptor and the model weights to load.
reconstruct.yml specifies the reconstruction algorithm as well as its hyperparameters.

Before training or evaluation, edit these files so that the dataset and checkpoint paths point to the correct locations on your system.
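
A quick way to catch a stale path before a long run is to scan a parsed configuration for path-like values that do not exist. The sketch below assumes the .yml file has already been loaded into a nested dictionary (for example with a YAML parser); the helper name find_missing_paths and the example keys are hypothetical, since the actual key names in configs/ are not listed here.

```python
import os
from typing import Any, Dict, List


def find_missing_paths(config: Dict[str, Any], prefix: str = "") -> List[str]:
    """Return dotted keys whose string values look like paths but do not exist."""
    missing = []
    for key, value in config.items():
        dotted = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Recurse into nested sections of the configuration.
            missing.extend(find_missing_paths(value, dotted))
        elif isinstance(value, str) and ("/" in value or os.sep in value):
            # Treat any string containing a path separator as a path.
            if not os.path.exists(value):
                missing.append(dotted)
    return missing


if __name__ == "__main__":
    # Hypothetical keys: the real names in configs/train.yml may differ.
    example = {
        "dataset": {"path": "/path/to/pepbdb_natoms200_pocket10"},
        "train": {"batch_size": 8},
    }
    print(find_missing_paths(example))
```

Any key reported by this check is a value to update before starting training or sampling.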

Training

To train HYDRA, run:

python3 scripts/train.py configs/train.yml

Optionally, pass --num_gpus N to train on multiple GPUs using the Distributed Data Parallel strategy.
The parameter --ckpt can be used to resume training from an existing checkpoint.
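
When launching training from another script (for example a job submission wrapper), the command line can be assembled programmatically. This is a minimal sketch that uses only the script path and the two flags mentioned above; the helper names build_train_command and launch, and the placeholder checkpoint path, are assumptions for illustration.

```python
import subprocess
from typing import List, Optional


def build_train_command(
    config: str = "configs/train.yml",
    num_gpus: Optional[int] = None,
    ckpt: Optional[str] = None,
) -> List[str]:
    """Assemble the training command, adding optional flags only when set."""
    cmd = ["python3", "scripts/train.py", config]
    if num_gpus is not None:
        cmd += ["--num_gpus", str(num_gpus)]
    if ckpt is not None:
        cmd += ["--ckpt", ckpt]
    return cmd


def launch(cmd: List[str]) -> None:
    """Run the command from the repository root and raise if it fails."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder checkpoint path; replace it with a real file before launching.
    command = build_train_command(num_gpus=2, ckpt="/path/to/checkpoint")
    print(" ".join(command))
```

Passing the command as a list rather than a single string avoids shell quoting problems when paths contain spaces.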

Evaluation

Evaluating HYDRA must be done in two stages:

1. Sampling residues based on the target receptor.
2. Reconstructing generated residues into peptides.

1.1 Sampling for all receptors in the test set

python3 scripts/sample-testset.py configs/train.yml configs/sample.yml --out_dir ./outputs

1.2 Sampling for a receptor from PDB

python3 scripts/sample-pdb.py data/pfemp1/PF3D71150400_MEDIUM.pdb configs/train.yml configs/sample.yml --out_dir ./outputs

2. Reconstructing sampled residues into peptides

python3 scripts/reconstruct.py ./outputs configs/reconstruct.yml
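
Because reconstruction reads the directory that sampling writes to, the two stages can be chained so that both use the same output directory and the second runs only if the first succeeds. The sketch below reuses the test set commands shown above; the helper names build_eval_commands and run_pipeline are hypothetical.

```python
import subprocess
from typing import List


def build_eval_commands(
    out_dir: str = "./outputs",
    train_cfg: str = "configs/train.yml",
    sample_cfg: str = "configs/sample.yml",
    reconstruct_cfg: str = "configs/reconstruct.yml",
) -> List[List[str]]:
    """Return the sampling and reconstruction commands, in execution order."""
    sample = [
        "python3", "scripts/sample-testset.py",
        train_cfg, sample_cfg, "--out_dir", out_dir,
    ]
    # Reconstruction takes the sampling output directory as its first argument.
    reconstruct = ["python3", "scripts/reconstruct.py", out_dir, reconstruct_cfg]
    return [sample, reconstruct]


def run_pipeline(commands: List[List[str]]) -> None:
    """Run each stage in order; check=True stops at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    for stage in build_eval_commands():
        print(" ".join(stage))
```

To sample for a single receptor instead, the first command would be replaced by the scripts/sample-pdb.py invocation from section 1.2.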

Contributing

All source code within this repository is licensed under the MIT License. See the LICENSE file for more details.

ComputationalBiologyLab-IIITH/HYDRA
