- PyTorch implementation of ChromBPNet
 - Please refer to the original code and paper, "ChromBPNet: Bias factorized, base-resolution deep learning models of chromatin accessibility reveal cis-regulatory sequence syntax, transcription factor footprints and regulatory variants", by Anusri Pampari*, Anna Shcherbina*, and Anshul Kundaje (*authors contributed equally).
 - This repo also draws on bpnet-lite and uses tangermeme for interpretation, two very useful repos by Jacob Schreiber.
 
Official ChromBPNet (left) vs. PyTorch ChromBPNet (right)
A genome browser view compares the profile predictions and attribution scores of the official ChromBPNet and the PyTorch implementation with n_filters = 512 and 128.
pip install chrombpnet-pytorch
pip install git+https://github.com/jsxlei/chrombpnet-pytorch.git
- Download the genome or use your own genome data
 - Download the ENCODE K562 ATAC data or use your own ATAC data
 
Please refer to data_config to define your own dataset, or pass the input files on the command line.
If your <data_path> contains peaks.bed, negatives.bed, unstranded.bw, and bias_scaled.h5:
chrombpnet train --data_dir <data_path>
Otherwise, pass each file explicitly:
chrombpnet train --peaks <peak_file> --negatives <negative_file> --bigwig <unstrand.bw> --bias <bias_scaled.h5> --adjust_bias
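Before choosing between the two training invocations above, it can help to confirm that the data directory really holds the four expected files. The following is a minimal sketch; `check_data_dir` is a hypothetical helper and not part of chrombpnet-pytorch, and the file names are taken from the description above.

```shell
# Hypothetical helper (not shipped with chrombpnet-pytorch): succeed only if
# the directory holds every file that `chrombpnet train --data_dir` is
# described as needing. Runs in a subshell so no variables leak out.
check_data_dir() (
    data_dir="$1"
    missing=0
    for f in peaks.bed negatives.bed unstranded.bw bias_scaled.h5; do
        if [ ! -f "$data_dir/$f" ]; then
            echo "missing: $data_dir/$f" >&2
            missing=1
        fi
    done
    # Exit status 0 when nothing is missing, 1 otherwise.
    [ "$missing" -eq 0 ]
)
```

If `check_data_dir <data_path>` succeeds, the short `--data_dir` form should apply; if it reports missing files, pass them explicitly with `--peaks`, `--negatives`, `--bigwig`, and `--bias`.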
chrombpnet predict --data_dir <data_path> --checkpoint chrombpnet_wo_bias.h5/best_model.cpkt/chrombpnet_wo_bias.pt -o <output_path>
chrombpnet interpret --data_dir <data_path> --checkpoint <model_cpkt/model_h5> -o <output_path>
chrombpnet --data_dir <data_path> -o <output_path>
chrombpnet finetune --data_dir <data_path> --checkpoint <model>.h5/pt -o <output_path>
snp_score \
 	-l $snps \
 	-g $ref_fasta \
 	-pg $ref_fasta_peaks \
 	-s $chrom_sizes \
 	-ps $chrom_sizes_peaks \
 	-m $model \
 	-p $peaks \
 	-o $out_prefix \
 	-t 2 \
 	-li \
 	-sc chrombpnet
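The `snp_score` call above relies on several shell variables that must be set beforehand. A small guard can catch an unset variable before the command runs with an empty argument. This is only a sketch: `require_vars` is a hypothetical helper, not part of the package, and the variable names are the ones used in the command above.

```shell
# Hypothetical helper: succeed only if every named shell variable is set to a
# non-empty value. Runs in a subshell, which still sees the caller's variables.
# Avoid checking variables called `name`, `value`, or `status`, which the
# helper uses internally.
require_vars() (
    status=0
    for name in "$@"; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "unset or empty: \$$name" >&2
            status=1
        fi
    done
    [ "$status" -eq 0 ]
)
```

For example, running `require_vars snps ref_fasta ref_fasta_peaks chrom_sizes chrom_sizes_peaks model peaks out_prefix` before `snp_score` reports any variable that still needs a value.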
--bigwig, --peaks, --negatives
The output directory will be populated as follows for the fold_0 chromosome split:
fold_0\
	checkpoints\
		best_model.cpkt
		last.cpkt
		chrombpnet_nobias.pt (PyTorch model that predicts the bias-corrected accessibility profile)
	train.log
	predict.log
	evaluation\
		eval\
			all_regions.counts_pearsonr.png
			all_regions_jsd.profile_jsd.png  
			peaks.counts_pearsonr.png  
			peaks_jsd.profile_jsd.png  
			regions.csv
			metrics.json
	interpret\
		counts\
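To see at a glance which stages have finished, the layout above can be checked with a short loop. This sketch assumes that additional folds, if any, follow the same `fold_N` naming and layout as `fold_0`, which the listing does not confirm; `report_folds` is a hypothetical helper, and the paths are written with forward slashes.

```shell
# Hypothetical helper: for each fold_* directory under the output path, report
# whether the best checkpoint and the evaluation metrics file exist.
report_folds() (
    out_dir="$1"
    for fold in "$out_dir"/fold_*; do
        # Skip the unexpanded pattern when no fold directory exists.
        [ -d "$fold" ] || continue
        ckpt=no
        metrics=no
        if [ -f "$fold/checkpoints/best_model.cpkt" ]; then ckpt=yes; fi
        if [ -f "$fold/evaluation/eval/metrics.json" ]; then metrics=yes; fi
        echo "$(basename "$fold") checkpoint=$ckpt metrics=$metrics"
    done
)
```

Calling `report_folds <output_path>` prints one line per fold, for example `fold_0 checkpoint=yes metrics=no` when training has finished but evaluation has not.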
If you're using ChromBPNet in your work, please cite as follows:
@article {Pampari2024.12.25.630221,
	author = {Pampari, Anusri and Shcherbina, Anna and Kvon, Evgeny and Kosicki, Michael and Nair, Surag and Kundu, Soumya and Kathiria, Arwa S. and Risca, Viviana I. and Kuningas, Kristiina and Alasoo, Kaur and Greenleaf, William James and Pennacchio, Len A. and Kundaje, Anshul},
	title = {ChromBPNet: bias factorized, base-resolution deep learning models of chromatin accessibility reveal cis-regulatory sequence syntax, transcription factor footprints and regulatory variants},
	elocation-id = {2024.12.25.630221},
	year = {2024},
	doi = {10.1101/2024.12.25.630221},
	publisher = {Cold Spring Harbor Laboratory},
	URL = {https://www.biorxiv.org/content/early/2024/12/25/2024.12.25.630221},
	eprint = {https://www.biorxiv.org/content/early/2024/12/25/2024.12.25.630221.full.pdf},
	journal = {bioRxiv}
}



