Gap and filter regions to be removed in fragment profile analysis
- gaps_filters_hg19.rdata and gaps_filters_hg38.rdata were produced by the script gaps_filters_hg.R; the regions include telomeres, centromeres, and ENCODE blacklist regions.
- AB_hg19.rdata and ab_hg38.rdata are HiC_AB_Compartments downloaded and lifted over from here
- Download reference genome data
You can download the reference genome, the pre-built BWA index, and annotated regions (e.g., blacklist) from ENCODE for hg38 and hg19 on the command line. The manifest file hg38/hg19.tsv will be generated accordingly. Currently, the ENCODE blacklist and the BWA index are mandatory entries in the manifest file, which you can also create yourself with existing data.
## eg: ./ hg38 /your/genome/data/path/hg38
$ ./assets/Reference/ [GENOME] [DEST_DIR]
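If you write the manifest by hand, a quick check that the mandatory entries are present can save a failed run. The sketch below assumes a hypothetical layout of one `key<TAB>path` pair per line, and the key names `blacklist` and `bwa_index` are placeholders rather than the pipeline's actual keys; adapt both to the manifest that the download script generates.

```shell
# Sketch only: assumes a hypothetical "key<TAB>path" manifest layout.
# Key names passed as arguments are placeholders, not the pipeline's real keys.
check_manifest() {
    manifest="$1"; shift
    missing=0
    for key in "$@"; do
        # awk exits 0 only if a row has this key in column 1 and a non-empty column 2
        if ! awk -F'\t' -v k="$key" \
            '$1 == k && $2 != "" { found = 1 } END { exit !found }' "$manifest"; then
            echo "missing mandatory entry: $key" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example (hypothetical keys):
# check_manifest /your/genome/data/path/hg38/hg38.tsv blacklist bwa_index
```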
- Build reference genome index
If your sequencing libraries come with spike-ins, you can build a new aligner index after combining the spike-in genome with the human genome. The new index information will be appended to the corresponding manifest file.
## eg: ./assets/Reference/ hg38 ./data/BAC_F19K16_F24B22.fa hg38_BAC_F19K16_F24B22 /your/genome/data/path/hg38
$ ./assets/Reference/ [GENOME] [SPIKEIN_FA] [INDEX_PREFIX] [DEST_DIR]
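The internals of the script are not shown here, but the general idea of combining genomes before indexing can be sketched as follows. This is an illustration under assumptions, not the script's actual implementation: the function name `combine_fasta` is hypothetical, and the duplicate-name check is an added safeguard because identical sequence names in both FASTA files would make alignments ambiguous.

```shell
# Sketch only: concatenate a host genome FASTA and a spike-in FASTA,
# refusing to proceed if any sequence name occurs in both files.
combine_fasta() {
    host_fa="$1"; spikein_fa="$2"; out_fa="$3"
    # Collect header names (text up to the first space) that appear more than once
    dup=$(cat "$host_fa" "$spikein_fa" | grep '^>' | cut -d' ' -f1 | sort | uniq -d)
    if [ -n "$dup" ]; then
        echo "duplicate sequence names: $dup" >&2
        return 1
    fi
    cat "$host_fa" "$spikein_fa" > "$out_fa"
}

# Example (file names are placeholders):
# combine_fasta hg38.fa BAC_F19K16_F24B22.fa hg38_BAC_F19K16_F24B22.fa
# bwa index -p hg38_BAC_F19K16_F24B22 hg38_BAC_F19K16_F24B22.fa
```

Both input files should end with a newline, otherwise the last sequence line of the first file runs into the first header of the second.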
Spike-in FASTA sequences for two BACs (F19K16 from Arabidopsis Chr1 and F24B22 from Arabidopsis Chr3) and synthetic DNAs are included.
SyntheticDNA_Arabidopsis_BACs.fa consists of the Arabidopsis BAC (F19K16_F24B22) and synthetic DNA sequences.
SyntheticDNA_Arabidopsis_BACs_seqNames.txt: names of the sequences
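A sequence-name list like this can be regenerated from any FASTA file. The sketch below is one way to do it and is not necessarily how the shipped file was produced; the function name `fasta_seq_names` is hypothetical, and it assumes the name is the first whitespace-delimited token of each header line.

```shell
# Sketch only: print the name of every sequence in a FASTA file,
# i.e. the first token of each header line with the leading ">" removed.
fasta_seq_names() {
    awk '/^>/ { sub(/^>/, "", $1); print $1 }' "$1"
}

# Example:
# fasta_seq_names SyntheticDNA_Arabidopsis_BACs.fa > seqNames.txt
```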
How to forge a BSgenome package for the spike-ins
- Spike-in genome
## Get the FASTA sequence and convert it to 2bit format with the UCSC tool faToTwoBit
$ faToTwoBit BAC_F19K16_F24B22.fa BAC_F19K16_F24B22.2bit
- Forge BSgenome package
# prepare the seed file according to the BSgenome instructions
# eg: BSgenome.Athaliana.BAC.F19K16.F24B22-seed
- Build package
$ R CMD build /path/to/pkgdir
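For orientation, a seed file is a plain-text list of `field: value` lines. The sketch below writes one from the shell. The field names follow the BSgenome forging instructions as far as I know them, but every value (title, version, provider, dates, object name) is a placeholder, and the helper `write_seed` is hypothetical; check the BSgenome documentation for the fields your Bioconductor release requires.

```shell
# Sketch only: write a minimal BSgenome seed file.
# All values are placeholders; field names should be checked against the
# BSgenome forging instructions for your Bioconductor release.
write_seed() {
    seqs_srcdir="$1"; twobit_name="$2"; seed_file="$3"
    cat > "$seed_file" <<EOF
Package: BSgenome.Athaliana.BAC.F19K16.F24B22
Title: Arabidopsis BAC spike-in sequences (placeholder title)
Description: Spike-in sequences for two Arabidopsis BACs, F19K16 and F24B22.
Version: 0.0.1
organism: Arabidopsis thaliana
common_name: Thale cress
genome: BAC_F19K16_F24B22
provider: BAC
release_date: NA
source_url: NA
organism_biocview: Arabidopsis_thaliana
BSgenomeObjname: Athaliana
circ_seqs: character(0)
seqs_srcdir: $seqs_srcdir
seqfile_name: $twobit_name
EOF
}

# Example (paths are placeholders):
# write_seed /path/to/2bit/dir spikein.2bit BSgenome.Athaliana.BAC.F19K16.F24B22-seed
# The package directory is then forged in R with forgeBSgenomeDataPkg(), which
# lives in BSgenome or BSgenomeForge depending on the Bioconductor release.
```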
Full list of commonly used UMI barcodes for cfMeDIP-seq
- NNT_barcodes.txt ## Barcodes for the pattern of NNT
- UMI_barcodes_OICR.txt ## Barcode list used by the OICR protocols
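To illustrate what an NNT pattern means, the sketch below enumerates every barcode made of two arbitrary bases followed by a fixed T. This reading of the pattern is an assumption, and the function name `nnt_barcodes` is hypothetical; the shipped NNT_barcodes.txt remains the authoritative list and may differ in order or content.

```shell
# Sketch only: enumerate all barcodes of the form N N T,
# assuming N is any of A, C, G, T (16 combinations in total).
nnt_barcodes() {
    for b1 in A C G T; do
        for b2 in A C G T; do
            printf '%s%sT\n' "$b1" "$b2"
        done
    done
}

# Example:
# nnt_barcodes > my_NNT_barcodes.txt
```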