ImagePLB 🐘

Official PyTorch implementation of the paper "An Image-based Protein-Ligand Binding Representation Learning Framework via Multi-Level Flexible Dynamics Trajectory Pre-training".

News!

[2024/06/28] Repository installation completed.

Environments

1. GPU environment

CUDA 11.6

Ubuntu 18.04

2. Create conda environment

# create conda env
conda create -n ImagePLB python=3.9
conda activate ImagePLB
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple

# install environment
pip install rdkit
pip install torch==1.13.1+cu116 torchvision==0.14.1+cu116 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu116 -i https://pypi.tuna.tsinghua.edu.cn/simple

pip install biopython==1.79
pip install easydict
pip install tqdm
pip install timm==0.6.12
pip install tensorboard
pip install scikit-learn
pip install setuptools==59.5.0
pip install pandas
pip install torch-cluster torch-scatter torch-sparse torch-spline-conv -f https://pytorch-geometric.com/whl/torch-1.13.1%2Bcu116.html
pip install torch-geometric==1.6.0
pip install dgl-cu116
pip install ogb
pip install seaborn
conda install openbabel -c conda-forge
pip install einops
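
After installation, a quick import check can catch missing packages before a long run. The sketch below is not part of the repository. The module list is an assumption derived from the install commands above (import names differ from package names for some entries, for example `Bio` for biopython and `sklearn` for scikit-learn), so adjust it as needed.

```python
import importlib.util

# Import names assumed from the install commands above; adjust as needed.
REQUIRED_MODULES = [
    "rdkit", "torch", "torchvision", "Bio", "easydict", "tqdm", "timm",
    "tensorboard", "sklearn", "pandas", "torch_geometric", "dgl", "ogb",
    "seaborn", "einops",
]


def find_missing(modules=REQUIRED_MODULES):
    """Return the module names that cannot be located in this environment."""
    missing = []
    for name in modules:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing()
    print("missing modules:", ", ".join(missing) if missing else "none")
```

The check only locates modules and does not import them, so it says nothing about CUDA availability or version mismatches.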

Data preprocessing

We use PyMOL to generate multi-view ligand images from molecular conformations. The following PyMOL script renders one view of a ligand; you can run it from the PyMOL command line:


sdf_filepath=demo.sdf  # sdf file path of ligand
rotate_direction=x
rotate=0
save_img_path=demo_frame.png
load $sdf_filepath;bg_color white;set stick_ball,on;set stick_ball_ratio,3.5;set stick_radius,0.15;set sphere_scale,0.2;set valence,1;set valence_mode,0;set valence_size, 0.1;rotate $rotate_direction, $rotate;save $save_img_path;quit;

Note that we used 4 views by setting the following parameters:

rotate_direction=x; rotate=0
rotate_direction=x; rotate=180
rotate_direction=y; rotate=180
rotate_direction=z; rotate=180
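
As an illustration of how the four settings map onto the script, the sketch below builds one PyMOL command string per view. It only assembles strings and does not call PyMOL. The function name and the output naming scheme (`<stem>_<axis>_<angle>.png`) are hypothetical choices for this example, while the rendering settings are copied from the script above.

```python
# The four (axis, angle) pairs listed above.
VIEWS = [("x", 0), ("x", 180), ("y", 180), ("z", 180)]

# Rendering settings copied from the PyMOL script above.
STYLE = (
    "bg_color white;set stick_ball,on;set stick_ball_ratio,3.5;"
    "set stick_radius,0.15;set sphere_scale,0.2;set valence,1;"
    "set valence_mode,0;set valence_size, 0.1"
)


def build_pymol_commands(sdf_filepath, stem):
    """Return {view_name: command_string} for the four views.

    `stem` is a hypothetical output prefix; images are named
    '<stem>_<axis>_<angle>.png'.
    """
    commands = {}
    for axis, angle in VIEWS:
        view = f"{axis}_{angle}"
        commands[view] = (
            f"load {sdf_filepath};{STYLE};"
            f"rotate {axis}, {angle};save {stem}_{view}.png;quit;"
        )
    return commands


if __name__ == "__main__":
    for view, cmd in build_pymol_commands("demo.sdf", "demo_frame").items():
        print(view, "->", cmd)
```

The view names `x_0`, `x_180`, `y_180`, and `z_180` match the directory and file names used in the dataset layouts in this README.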

To save you time on data preprocessing, we also provide download links for all processed data.

Pre-Training ImagePLB

1. Pre-training Dataset

Name	Download link	Description
multi_view_trajectory_video.tar.gz	BaiduCloud	ligand trajectory with multi-view images.
pocket.tar.gz	OneDrive	pocket trajectory with 3D graphs.

To pre-train ImagePLB from scratch, download all data listed above and place them in datasets/pre-training/MISATO/processed/.

The directory is organized in the following format:

datasets/pre-training/MISATO/processed/
+---pocket
|   |   train.npz
|   |
+---multi_view_trajectory_video
|   +---1A0Q
|   |   +---x_0
|   |   |   mov0001.png
|   |   |   mov0002.png
|   |   |   ...
|   |   +---x_180
|   |   |   mov0001.png
|   |   |   mov0002.png
|   |   |   ...
|   |   +---y_180
|   |   |   mov0001.png
|   |   |   mov0002.png
|   |   |   ...
|   |   +---z_180
|   |   |   mov0001.png
|   |   |   mov0002.png
|   |   |   ...
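
Before starting a long pre-training run, it can help to confirm that the extracted archives match this layout. The following is a minimal sketch that is not part of the repository; the function names are hypothetical, and the rule that all four views of a complex contain the same number of frames is an assumption.

```python
from pathlib import Path

VIEW_DIRS = ("x_0", "x_180", "y_180", "z_180")


def check_trajectory_dir(complex_dir):
    """Return a list of problems found for one complex directory (e.g. 1A0Q)."""
    complex_dir = Path(complex_dir)
    problems = []
    counts = {}
    for view in VIEW_DIRS:
        view_dir = complex_dir / view
        if not view_dir.is_dir():
            problems.append(f"missing view directory: {view_dir}")
            continue
        counts[view] = len(list(view_dir.glob("mov*.png")))
    # Assumption: every view of a trajectory holds the same number of frames.
    if len(set(counts.values())) > 1:
        problems.append(f"frame counts differ across views: {counts}")
    return problems


def check_pretraining_root(root):
    """Return a list of problems found under the processed/ directory."""
    root = Path(root)
    problems = []
    if not (root / "pocket" / "train.npz").is_file():
        problems.append("missing pocket/train.npz")
    video_root = root / "multi_view_trajectory_video"
    if not video_root.is_dir():
        problems.append("missing multi_view_trajectory_video/")
        return problems
    for complex_dir in sorted(p for p in video_root.iterdir() if p.is_dir()):
        problems.extend(check_trajectory_dir(complex_dir))
    return problems


if __name__ == "__main__":
    for line in check_pretraining_root("datasets/pre-training/MISATO/processed"):
        print(line)
```

An empty result means the layout matches the tree above; it does not validate the contents of the image or `.npz` files.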

2. ❄️Direct access to pre-trained ImagePLB

The pre-trained ImagePLB (ImagePLB-P) is available from the following table.

Name	Download link	Description
ImagePLB-P.pth	OneDrive	You can download the ImagePLB-P and put it in the directory: `resumes/`.

3. 🔥Train your own ImagePLB-P from scratch

If you want to pre-train your own ImagePLB-P, see the command below.

Usage:

usage: pretrain_ImagePLB.py [-h] [--dataroot DATAROOT] [--workers WORKERS]
                            [--model_name MODEL_NAME]
                            [--max_len_pocket MAX_LEN_POCKET] [--center]
                            [--n_dim_graph N_DIM_GRAPH] [--lr LR]
                            [--momentum MOMENTUM]
                            [--weight-decay WEIGHT_DECAY] [--weighted_loss]
                            [--runseed RUNSEED] [--start_epoch START_EPOCH]
                            [--epochs EPOCHS] [--batch BATCH]
                            [--imageSize IMAGESIZE] [--resume RESUME]
                            [--n_ckpt_save N_CKPT_SAVE]
                            [--n_batch_step_optim N_BATCH_STEP_OPTIM]
                            [--lambda_next_mol LAMBDA_NEXT_MOL]
                            [--lambda_next_pocket LAMBDA_NEXT_POCKET]
                            [--lambda_next_complex LAMBDA_NEXT_COMPLEX]
                            [--log_dir LOG_DIR] [--tb_step_num TB_STEP_NUM]

Run the following command in the pretrain folder to pre-train ImagePLB:

CUDA_VISIBLE_DEVICES=0,1,2,3 python pretrain_ImagePLB.py \
	--workers 16 \
	--batch 128 \
	--epochs 30 \
	--lr 0.001 \
	--dataroot ../datasets/pre-training/MISATO/processed \
	--log_dir ./experiments/pretrain_ImagePLB \
	--weighted_loss

🔥Training ImagePLB on Downstream Tasks

All downstream task data is publicly accessible below:

Datasets	Links	Description
PDBBind	OneDrive	Including PDBBind-30, PDBBind-60, PDBBind-Scaffold.
LEP	OneDrive	Dataset of ligand efficacy prediction.

⚠️Please download the datasets provided above and organize the directory as follows:

datasets/fine-tuning/
+---pdbbind
|   +---ligand
|   |   +---1a4k
|   |   |   x_0.png
|   |   |   x_180.png
|   |   |   y_180.png
|   |   |   z_180.png
|   +---30
|   |   |   train.npz
|   |   |   valid.npz
|   |   |   test.npz
|   +---60
|   |   |   train.npz
|   |   |   valid.npz
|   |   |   test.npz
|   +---scaffold
|   |   |   train.npz
|   |   |   valid.npz
|   |   |   test.npz
+---lep
|   +---ligand
|   |   +---Lig2__6BQG__6BQH
|   |   |   x_0.png
|   |   |   x_180.png
|   |   |   y_180.png
|   |   |   z_180.png
|   +---protein
|   |   |   train.npz
|   |   |   val.npz
|   |   |   test.npz
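
The split file names differ slightly between the two tasks (`valid.npz` for PDBBind, `val.npz` for LEP), which is easy to miss. A small check such as the one below can list any split files that are absent. This is a sketch with a hypothetical function name, and the expected paths are taken only from the tree above.

```python
from pathlib import Path

# Expected split files, taken from the directory tree above.
# Note that PDBBind uses "valid.npz" while LEP uses "val.npz".
EXPECTED_SPLITS = {
    "pdbbind/30": ("train.npz", "valid.npz", "test.npz"),
    "pdbbind/60": ("train.npz", "valid.npz", "test.npz"),
    "pdbbind/scaffold": ("train.npz", "valid.npz", "test.npz"),
    "lep/protein": ("train.npz", "val.npz", "test.npz"),
}


def missing_split_files(root):
    """Return relative paths of expected split files that do not exist."""
    root = Path(root)
    missing = []
    for subdir, names in EXPECTED_SPLITS.items():
        for name in names:
            if not (root / subdir / name).is_file():
                missing.append(f"{subdir}/{name}")
    return missing


if __name__ == "__main__":
    for path in missing_split_files("datasets/fine-tuning"):
        print("missing:", path)
```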

Run the following command in the finetune folder for PDBBind:

python pdbbind.py \
	--batch 32 \
	--epochs 20 \
	--lr 0.0001 \
	--egnn_dropout 0.3 \
	--predictor_dropout 0.3 \
	--dataroot ../datasets/fine-tuning/pdbbind \
	--split_type scaffold \
	--resume ../resumes/ImagePLB-P.pth \
	--log_dir ./experiments/pdbbind/scaffold/rs0/ \
	--runseed 0 \
	--dist-url tcp://127.0.0.1:12312

Run the following command in the finetune folder for LEP:

python lep.py \
	--batch 32 \
	--epochs 100 \
	--lr 0.0001 \
	--dataroot ../datasets/fine-tuning/lep \
	--split_type protein \
	--egnn_dropout 0.5 \
	--predictor_dropout 0.5 \
	--resume ../resumes/ImagePLB-P.pth \
	--log_dir ./experiments/lep/rs0/ \
	--runseed 0 \
	--dist-url tcp://127.0.0.1:12345

💡Reproducing Our Results

We provide detailed training logs and the corresponding checkpoints. The logs show further training details, and the trained models can be used directly for structure-based virtual screening.

Name	Download link	Description
PDBBind-30	OneDrive	The training details of ImagePLB-P on PDBBind-30
PDBBind-60	OneDrive	The training details of ImagePLB-P on PDBBind-60
PDBBind-Scaffold	OneDrive	The training details of ImagePLB-P on PDBBind-Scaffold
LEP	OneDrive	The training details of ImagePLB-P on LEP

The files include training logs and checkpoints for training ImagePLB-P with three random seeds (0, 1, 2).
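
To repeat a run over the three seeds, one option is to generate the fine-tuning command once per seed. The sketch below does this for the PDBBind scaffold command shown earlier; it only prints the commands. The helper name is hypothetical, and the assumption that each seed writes to its own `rs<seed>` log directory is inferred from the `rs0` path in that command rather than confirmed by the provided logs.

```python
import shlex


def pdbbind_command(seed):
    """Build the argument list for one PDBBind (scaffold split) run.

    All flags and values are copied from the PDBBind command in this README;
    only --runseed and the rs<seed> log directory vary.
    """
    return [
        "python", "pdbbind.py",
        "--batch", "32",
        "--epochs", "20",
        "--lr", "0.0001",
        "--egnn_dropout", "0.3",
        "--predictor_dropout", "0.3",
        "--dataroot", "../datasets/fine-tuning/pdbbind",
        "--split_type", "scaffold",
        "--resume", "../resumes/ImagePLB-P.pth",
        "--log_dir", f"./experiments/pdbbind/scaffold/rs{seed}/",
        "--runseed", str(seed),
        "--dist-url", "tcp://127.0.0.1:12312",
    ]


if __name__ == "__main__":
    for seed in (0, 1, 2):
        # Print a shell-ready command line for each seed.
        print(shlex.join(pdbbind_command(seed)))
```

The printed lines can be pasted into a shell in the finetune folder, or the argument lists can be passed to `subprocess.run` to launch the runs one after another.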

Reference

If our paper or code is helpful to you, please consider starring our repository and citing our paper.
