-
Notifications
You must be signed in to change notification settings - Fork 0
gaoqiweng/RediscMol
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
Benchmark: chembl_29_ecfp4_0.333.csv/smi is the pretraining dataset. The kinase and GPCR datasets have their corresponding pretrain datasets. cdk contains 10 random 10% fine-tuning and the target sets of CDK. cdk_100 contains 100 random 1% fine-tuning and the target sets of CDK. Scripts: evaluation.py is used to calculate the metrics whose environment requires Pytorch and RDKit. usage: evaluation.py [-h] [--train_path TRAIN_PATH] [--goal_path GOAL_PATH] [--gen_path GEN_PATH] [--n_jobs N_JOBS] [--metrics_path METRICS_PATH] optional arguments: -h, --help show this help message and exit --train_path TRAIN_PATH Path to fine-tuning molecules csv --goal_path GOAL_PATH Path to target molecules csv --gen_path GEN_PATH Path to generated molecules csv --n_jobs N_JOBS Number of threads --metrics_path METRICS_PATH Path to output file with metrics Example: python evaluation.py --train_path ../Benchmark/kinase/cdk/cdk_1_train.csv --goal_path ../Benchmark/kinase/cdk/cdk_1_goal.csv --gen_path generated_molecules.csv --n_jobs 48 --metrics_path metrics.csv The molecules of the generated_molecules.csv have to be valid, unique and novel.
About
No description, website, or topics provided.
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published