ChromBPNet Pytorch

A PyTorch implementation of ChromBPNet.
Please refer to the original code and paper, "ChromBPNet: Bias factorized, base-resolution deep learning models of chromatin accessibility reveal cis-regulatory sequence syntax, transcription factor footprints and regulatory variants", by Anusri Pampari*, Anna Shcherbina*, and Anshul Kundaje (*authors contributed equally).
This repo also draws on bpnet-lite and uses tangermeme for interpretation; both are very useful repos by Jacob Schreiber.

Reproduce Official ChromBPNet performance

Pearson correlation on counts prediction of peaks

official chrombpnet (left) vs pytorch chrombpnet (right)

Attribution score

A genome browser session compares the profile predictions and attribution scores of the official ChromBPNet and the PyTorch implementation with n_filters = 512 and 128.

Installation

Install from PyPI

pip install chrombpnet-pytorch

Install from source

pip install git+https://github.com/jsxlei/chrombpnet-pytorch.git

QuickStart

Before training

Download the genome or use your own genome data
Download the ENCODE K562 ATAC data or use your own ATAC data

Bias-factorized ChromBPNet training

Refer to data_config to define your own dataset, or pass the file paths on the command line.

If your <data_path> contains peaks.bed, negatives.bed, unstranded.bw, and bias_scaled.h5:

chrombpnet train --data_dir <data_path>
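
Before launching training with --data_dir, it can help to confirm that the directory holds the four files named above. The following is a minimal sketch and not part of the chrombpnet package: the helper name `missing_inputs` is hypothetical, and it only checks that the files exist, not that their contents are valid.

```python
from pathlib import Path

# File names expected inside <data_path>, as listed in this README.
REQUIRED_FILES = ("peaks.bed", "negatives.bed", "unstranded.bw", "bias_scaled.h5")


def missing_inputs(data_dir):
    """Return the required file names that are absent from data_dir.

    Hypothetical helper for illustration; it is not part of chrombpnet.
    """
    root = Path(data_dir)
    # Keep the README's order so the report is easy to compare by eye.
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

Calling `missing_inputs("<data_path>")` returns an empty list when all four files are present; any names it returns would need to be supplied through the explicit flags instead.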

Otherwise, pass each file explicitly:

chrombpnet train --peaks <peak_file> --negatives <negative_file> --bigwig <unstrand.bw> --bias <bias_scaled.h5> --adjust_bias

Predict with pretrained model in .h5 format or .cpkt or .pt

chrombpnet predict --data_dir <data_path> --checkpoint chrombpnet_wo_bias.h5/best_model.cpkt/chrombpnet_wo_bias.pt -o <output_path>

Interpret by calculating attribution

chrombpnet interpret --data_dir <data_path> --checkpoint <model_cpkt/model_h5> -o <output_path>

Run full pipeline including training, predicting and interpreting

chrombpnet --data_dir <data_path> -o <output_path>

Finetune model

chrombpnet finetune --data_dir <data_path> --checkpoint <model>.h5/pt -o <output_path>

Variant scoring

snp_score \
 	-l $snps \
 	-g $ref_fasta \
 	-pg $ref_fasta_peaks \
 	-s $chrom_sizes \
 	-ps $chrom_sizes_peaks \
 	-m $model \
 	-p $peaks \
 	-o $out_prefix \
 	-t 2 \
 	-li \
 	-sc chrombpnet

Input Format

--bigwig
--peaks
--negatives

Output Format

The output directory will be populated as follows for the fold_0 chromosome splits:

fold_0\
	checkpoints\
		best_model.cpkt
		last.cpkt
		chrombpnet_nobias.pt (PyTorch model that predicts the bias-corrected accessibility profile)
	train.log
	predict.log
	evaluation\
		eval\
			all_regions.counts_pearsonr.png
			all_regions_jsd.profile_jsd.png  
			peaks.counts_pearsonr.png  
			peaks_jsd.profile_jsd.png  
			regions.csv
			metrics.json
	interpret\
		counts\
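
To inspect the evaluation results programmatically, the sketch below loads metrics.json from the layout shown above and flattens it for printing. It assumes the file holds a JSON object; this README does not document the key names, so the code makes no assumption about them. The helper names `load_metrics` and `flatten` are hypothetical and not part of chrombpnet.

```python
import json
from pathlib import Path


def load_metrics(output_dir, fold="fold_0"):
    """Load <output_dir>/<fold>/evaluation/eval/metrics.json as a dict.

    Hypothetical helper; the path follows the tree shown in this README.
    """
    path = Path(output_dir) / fold / "evaluation" / "eval" / "metrics.json"
    with path.open() as handle:
        return json.load(handle)


def flatten(metrics, prefix=""):
    """Flatten nested dicts into {'outer.inner': value} pairs."""
    flat = {}
    for key, value in metrics.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested sections, joining key names with dots.
            flat.update(flatten(value, prefix=name + "."))
        else:
            flat[name] = value
    return flat
```

For example, `for name, value in flatten(load_metrics("<output_path>")).items(): print(name, value)` lists every metric the file contains, whatever its keys turn out to be.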

How to Cite

If you're using ChromBPNet in your work, please cite as follows:

@article {Pampari2024.12.25.630221,
	author = {Pampari, Anusri and Shcherbina, Anna and Kvon, Evgeny and Kosicki, Michael and Nair, Surag and Kundu, Soumya and Kathiria, Arwa S. and Risca, Viviana I. and Kuningas, Kristiina and Alasoo, Kaur and Greenleaf, William James and Pennacchio, Len A. and Kundaje, Anshul},
	title = {ChromBPNet: bias factorized, base-resolution deep learning models of chromatin accessibility reveal cis-regulatory sequence syntax, transcription factor footprints and regulatory variants},
	elocation-id = {2024.12.25.630221},
	year = {2024},
	doi = {10.1101/2024.12.25.630221},
	publisher = {Cold Spring Harbor Laboratory},
	URL = {https://www.biorxiv.org/content/early/2024/12/25/2024.12.25.630221},
	eprint = {https://www.biorxiv.org/content/early/2024/12/25/2024.12.25.630221.full.pdf},
	journal = {bioRxiv}
}
