Panmethyl maps methylation data from long reads to pangenomes.
It takes the following inputs:
--out
- Directory in which panmethyl will write the output files.
--bams
- CSV file listing the BAM files to be mapped to the pangenome.
The format of this CSV file is:
sample,path
name1,path/to/bam1
name2,path/to/bam2
These BAM files must be annotated with the appropriate methylation information.
For example, the location of modified bases must be encoded in the MM tag
and the likelihoods of methylation must be encoded in the ML tag.
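To illustrate how the MM and ML tags fit together, here is a minimal Python sketch that decodes them for a single read. It is not part of panmethyl: the function name `decode_mm_ml` is hypothetical, and it assumes a forward-strand read, one modification code per MM group (for example `C+m`), and ML values listed in the same order as the MM entries. Each number in an MM group is the count of unmodified occurrences of the base to skip before the next modified one.

```python
def decode_mm_ml(seq, mm, ml):
    """Return (read_position, modification_code, ml_value) triples.

    Illustrative only. Assumes a forward-strand read and a single
    modification code per MM group; the strand field and the 'N'
    wildcard base are not handled.
    """
    calls = []
    ml_index = 0
    for group in mm.rstrip(";").split(";"):
        if not group:
            continue
        fields = group.split(",")
        header = fields[0].rstrip("?.")  # drop the optional '?' or '.' flag
        base, code = header[0], header[2:]
        # Positions of the canonical base in the read, in read order.
        candidates = [i for i, b in enumerate(seq) if b == base]
        cursor = 0
        for skip in (int(x) for x in fields[1:]):
            cursor += skip  # skip this many occurrences of the base
            calls.append((candidates[cursor], code, ml[ml_index]))
            ml_index += 1
            cursor += 1  # move past the modified base itself
    return calls


# The Cs sit at positions 1, 4 and 5; "1,0" skips the first C and then
# marks the next two as modified.
print(decode_mm_ml("ACGTCCGA", "C+m,1,0;", [200, 13]))
```

For reads aligned to the reverse strand, the MM coordinates refer to the originally sequenced orientation, so the stored sequence would need to be reverse-complemented before applying this logic.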
--graph
- Pangenome in rGFA format. Currently, only graphs created with minigraph
are supported, but support could be extended to other graphs by replacing minigraph with another graph aligner.
Panmethyl outputs a .graphMethylaion plain text file for each entry in --bams.
This file is a CSV file listing the graph node, the position of the
modified base, its strand, the coverage on the modified base, and the average
methylation level, encoded on a scale from 0 to 255 (as in the ML tag).
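As a sketch of how such an output file might be consumed downstream, the Python snippet below reads the five columns described above and rescales the methylation level to a 0 to 1 fraction. The names `CpgRecord` and `read_graph_methylation` are hypothetical, and the snippet assumes comma-separated values, no header row, and the column order given in the description; check an actual output file before relying on any of these.

```python
import csv
import io
from typing import NamedTuple


class CpgRecord(NamedTuple):
    node: str
    position: int
    strand: str
    coverage: int
    methylation: float  # average level on the 0-255 scale


def read_graph_methylation(handle):
    """Parse rows of node,position,strand,coverage,methylation.

    Assumed layout (not confirmed by the panmethyl documentation):
    comma-separated, no header row, columns in the order listed above.
    """
    records = []
    for row in csv.reader(handle):
        if not row:
            continue  # tolerate blank lines
        node, position, strand, coverage, level = row
        records.append(
            CpgRecord(node, int(position), strand, int(coverage), float(level))
        )
    return records


# Made-up rows standing in for a real output file.
example = io.StringIO("s1,10,+,4,191.25\ns2,3,-,2,0\n")
for record in read_graph_methylation(example):
    # Simple rescaling of the 0-255 value to a fraction.
    print(record.node, record.position, record.methylation / 255)
```

Dividing by 255 is one straightforward rescaling; a different convention may be preferable depending on how the averaged ML values are meant to be interpreted.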
- Index the position of every CpG dinucleotide in the input graph (bin/index_cpg.py).
- Convert each BAM file to FASTQ (samtools).
- Annotate the reads in the FASTQ with methylation information from the BAM (tagtobed).
- Map the FASTQ to the pangenome with minigraph.
- Lift the methylation annotation from the reads to the graph (bin/lift_5mC.py).
- Compute the average methylation level of CpGs (bin/nodes_methylation.py).
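The first step above, CpG indexing, can be sketched in a few lines of Python. This is an illustration of the idea only, not the logic of bin/index_cpg.py: the function name `index_cpgs` is hypothetical, and it assumes that segment sequences are stored inline on the `S` lines of the rGFA file (tab-separated, with the name in the second field and the sequence in the third).

```python
def index_cpgs(gfa_lines):
    """Map each segment name to the 0-based offsets of the C in every CG.

    Illustrative only: CpGs that span two adjacent segments are not
    detected, and only the forward strand of each segment is scanned.
    """
    index = {}
    for line in gfa_lines:
        if not line.startswith("S\t"):
            continue  # skip links and any other record types
        fields = line.rstrip("\n").split("\t")
        name, sequence = fields[1], fields[2].upper()
        positions = []
        start = sequence.find("CG")
        while start != -1:
            positions.append(start)
            start = sequence.find("CG", start + 1)
        index[name] = positions
    return index


# Made-up rGFA fragment with two segments and one link.
lines = [
    "S\ts1\tACGTCGCG\tLN:i:8\tSN:Z:chr1\tSO:i:0\tSR:i:0\n",
    "L\ts1\t+\ts2\t+\t0M\n",
    "S\ts2\tTTTT\tLN:i:4\tSN:Z:chr1\tSO:i:8\tSR:i:0\n",
]
print(index_cpgs(lines))
```

On a real graph, the same function could be applied to an open file handle, since it only iterates over lines.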