colindaven/awesome-pangenomes

awesome-pangenomes

A list of software capable of analyzing mainly eukaryotic genomes for pangenomics. A new section for microbial genomes has also been added; these tools may not scale to large genomes.

🚀 indicates a popular repository

Important blog posts

  • Untangling-graphical-pangenomics Excellent blog post by Erik Garrison explaining the differences between the rGFA and GFA formats and approaches. Important and frequently overlooked.
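
One concrete, file-level difference between the two formats can be checked in a few lines of Python. The sketch below is not taken from the blog post: it assumes, following the rGFA specification, that every segment (S) line in an rGFA file carries the `SN`, `SO` and `SR` tags (stable sequence name, offset and rank), while plain GFA does not require them. The function names `segment_tags` and `looks_like_rgfa` are hypothetical, and the check is only a heuristic.

```python
# Heuristic check: does a GFA file look like rGFA?
# Assumption: rGFA S-lines always carry SN, SO and SR tags.
RGFA_TAGS = {"SN", "SO", "SR"}


def segment_tags(line):
    """Return the set of optional tag names on a GFA S-line."""
    fields = line.rstrip("\n").split("\t")
    # S-line layout: record type, segment name, sequence, then TAG:TYPE:VALUE fields
    return {field.split(":", 1)[0] for field in fields[3:]}


def looks_like_rgfa(lines):
    """True if there is at least one S-line and every S-line has SN, SO and SR."""
    seen_segment = False
    for line in lines:
        if line.startswith("S\t"):
            seen_segment = True
            if not RGFA_TAGS <= segment_tags(line):
                return False
    return seen_segment


# Tiny made-up examples, not real data
plain_gfa = [
    "H\tVN:Z:1.0",
    "S\ts1\tACGT",
    "S\ts2\tGG",
    "L\ts1\t+\ts2\t+\t0M",
]
rgfa = [
    "S\ts1\tACGT\tLN:i:4\tSN:Z:chr1\tSO:i:0\tSR:i:0",
    "S\ts2\tGG\tLN:i:2\tSN:Z:sampleA#chr1\tSO:i:100\tSR:i:1",
    "L\ts1\t+\ts2\t+\t0M",
]

print(looks_like_rgfa(plain_gfa))
print(looks_like_rgfa(rgfa))
```

To run it on a real file, pass an open file handle instead of a list, for example `looks_like_rgfa(open("graph.gfa"))`, where `graph.gfa` is a placeholder path.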

Toolkits

  • odgi Fast toolkit based on odgi format 🚀
  • vg Full featured construction, mapping and SNP calling toolkit based on multiple formats. 🚀
  • gaftools Toolkit for GAF (Graph Alignment Format) sorting and manipulation.
  • gfakluge Toolkit and C++ API for GFA manipulation
  • gfatools Toolkit for GFA parsing and conversion
  • gretl Statistics and analysis for GFA files, written in Rust
  • pgr-tk A PanGenomic Research Tool Kit. Its output is not a GFA file.

Pangenome construction

  • Minigraph Fast method by Heng Li, produces reference GFA (rGFA) format (not GFA or odgi) 🚀
  • minigraph_cactus and docs Pangenome builder which prioritizes downstream compatibility. Produces GFA and odgi. 🚀
  • PGGB Pangenome Graph Builder, calculates SNPs as part of the pipeline. Produces GFA and odgi. 🚀
  • pangene Pangene constructs a pangenome gene graph from one protein set and many genomes and includes simple but effective visualization 🚀
  • Pantools v3+ Fully featured construction of pangenome graphs
  • PSVCP Add PAV to the linear genome to construct a pangenome.
  • PHG Practical Haplotype Graph
  • PATO R package for pangenome construction
  • Chrom_mini_graph Generate and map reads onto a coloured minimizer pangenome graph
  • GET_PANGENES Perl scripts used by the Ensembl Plants team for pangenomics
  • impg Create an implicit pangenome graph for a homologous target region, then use output bed files to extract sequences for PGGB etc.
  • MGRgraph An algorithm to build a Multi-genome Reference (warning: last updated 2018)
  • MEMO MEMO constructs a pangenome and index and allows kmer based conservation analyses and visualization
  • poasta Fast, gap-affine sequence-to-graph and partial order aligner and MSA construction

Pangenome pipelines

  • nf-core pangenome Paper A scalable Nextflow approach to building pangenomes with PGGB with visualization by odgi. 🚀
  • pangepop A snakemake pipeline to create a pangenome with minigraph-cactus and align reads against it with vg giraffe

Annotating pangenomes

Short read alignment to a pangenome graph

  • vg giraffe Faster and more modern alternative to vg map 🚀
  • vg map Original vg mapper (superseded by vg giraffe)
  • Hisat2
  • Minigraph Construct graphs or align short or long reads to graphs
  • Chrom_mini_graph Generate and map reads onto a coloured minimizer pangenome graph

Long read alignment to a pangenome graph

  • GraphAligner Fast long read graph aligner 🚀
  • Minigraph Construct graphs or align short or long reads to graphs
  • GraphChainer Built on codebase of GraphAligner
  • Spades Pathracer Align long reads to genomic graphs
  • Minichain Align long reads to pangenomes in GFA or rGFA format
  • PanAligner Align long reads to pangenomes
  • poasta Fast, gap-affine sequence-to-graph and partial order aligner and MSA construction

SNP callers and genotypers

  • vg call SNP caller for pangenomes, with gam or GAF output 🚀
  • vg surject Surject alignments to a linear reference, then use a linear SNP caller such as Freebayes or Deepvariant 🚀
  • Paragraph A suite of graph-based genotyping tools for short read data
  • Pangenie kmer-based SV genotyping using short reads. Intended for human only (in 2023).
  • Deepvariant Case study of Deepvariant SNP calling on BAM files aligned with vg giraffe

Structural Variation (SV) callers and genotypers

  • vg call Call and genotype structural variants on a graph using long and short reads. 🚀
  • GraphTyper A graph SV genotyper (does not call SVs)
  • Pangenie kmer-based SV genotyping using short reads. Intended for human only (in 2023).
  • SVarp Use long reads to detect structural variants in a GFA format pangenome.
  • bubblegun A tool for detecting Bubbles and Superbubbles
  • PHI Pangenome-based Haplotype Inference preprint A genotyper using low coverage short or long reads for haploid pangenomes. Requires a Gurobi license.

Repeat analysis tools

  • Pantera Identification of transposable element families from a set of pangenomes

Pangenome viewers - interactive

  • Bandage Visualize GFA files in an interactive standalone app 🚀
  • SeqTubemap Elegant path visualization for smaller regions of a pangenome from the vg team 🚀
  • MoMI-G Genome graph browser for SV visualization. Users can filter and visualize annotations and inspect SVs with read alignments over the genome graph. 🚀
  • pangene Pangene can visualize one protein set mapped to x genomes to check synteny and presence/absence of genes. 🚀
  • Panagram Plots k-mer conservation
  • VAG Visualization of short sequence alignments in a pangenome
  • Panache View linearized pangenomes
  • Waragraph
  • PanGraphViewer Desktop and web versions. Based on cytoscape.js. Can get to chromosome coordinates, allows VCF input.
  • Wally View GFA (Work in progress 2023)
  • VRPG View rGFA or GFA, written in Python and HTML
  • Pantograph is a commercial pangenome graph viewer option
  • PGV A web based viewer similar to SeqTubeMap
  • Pancat Scripts to filter and visualize GFA files
  • gfaestus GFA visualizer, GPU-accelerated using Vulkan
  • gfaviz Graphical interactive tool for the visualization of sequence graphs in GFA format
  • AGB Interactive assembly graph browser
  • graphgenomeviewer Web based viewer for small to medium GFA files
  • JBrowse 2 Web based genome browser with synteny views and plugins for multiple-alignments that can be extracted from Cactus graphs (https://github.com/cmdcolin/jbrowse-plugin-mafviewer)
  • strangepg A modern GFA viewer and alternative to the Bandage tool

Pangenome viewers - static

  • vg view - generates static images
  • odgi - generates static images 🚀
  • plotsr - generates static images

Graph validation tools

Pangenome comparison

  • junctions Pangenome comparison using elastic-degenerate strings.
  • rs-pancat-compare Pairwise pangenome graph comparison by the computation of a segmentation edit distance.

Pangenome tools for microbes

  • anvi'o Microbial pangenomics - Annotation, Construction, Visualization and Manipulation (also works for eukaryotes, except for annotation)
  • Roary A well-documented and feature-rich tool which works on Prokka gff files and has an entertaining FAQ.

File formats

Miscellaneous tools

  • gfainject Map short alignments in BAM format to a GFA (appears to be a conversion tool rather than a true aligner). Output in GAF format.
  • GRAFIMO GRAph-based Finding of Individual Motif Occurrences using vg
  • rs-gfa A GFA parser in Rust.
  • ropebwt3 Can construct and align sequences against huge TB scale references and retrieve haplotypes.

kmer based approaches

Libraries to explore pangenomes

  • gfapy implements GFA1 and GFA2 parsing and scalable exploration of graphs in Python
  • gfagraphs implements rGFA and GFA1 parsing and editing of graphs in Python
  • graphanalyzer A Python package to read and analyze PAF and GFA files for graphs.
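
For a quick look at a small GFA1 file without installing any of the libraries above, the record types can be read with the Python standard library alone. The sketch below is only an illustration and not how gfapy, gfagraphs or graphanalyzer work internally: it assumes tab-separated GFA1 with S (segment), L (link) and P (path) lines, ignores every other record type, and the function name `summarize_gfa1` is hypothetical.

```python
def summarize_gfa1(lines):
    """Collect basic statistics from GFA1 lines (S, L and P records only)."""
    segment_lengths = {}   # segment name -> length in bp
    n_links = 0
    paths = {}             # path name -> list of (segment name, orientation)

    for line in lines:
        fields = line.rstrip("\n").split("\t")
        kind = fields[0]
        if kind == "S":
            name, seq = fields[1], fields[2]
            if seq == "*":
                # Sequence not stored: fall back to the LN:i tag if present
                ln_tags = [f for f in fields[3:] if f.startswith("LN:i:")]
                length = int(ln_tags[0][len("LN:i:"):]) if ln_tags else 0
            else:
                length = len(seq)
            segment_lengths[name] = length
        elif kind == "L":
            n_links += 1
        elif kind == "P":
            # Path steps look like "s1+,s2-": segment name plus orientation
            steps = fields[2].split(",")
            paths[fields[1]] = [(step[:-1], step[-1]) for step in steps]

    return {
        "segments": len(segment_lengths),
        "total_bp": sum(segment_lengths.values()),
        "links": n_links,
        "paths": paths,
    }


# Made-up example: one SNP bubble (s2 vs s3) between s1 and s4
example = [
    "H\tVN:Z:1.0",
    "S\ts1\tACGT",
    "S\ts2\tG",
    "S\ts3\tT",
    "S\ts4\t*\tLN:i:3",
    "L\ts1\t+\ts2\t+\t0M",
    "L\ts1\t+\ts3\t+\t0M",
    "L\ts2\t+\ts4\t+\t0M",
    "L\ts3\t+\ts4\t+\t0M",
    "P\thapA\ts1+,s2+,s4+\t*",
    "P\thapB\ts1+,s3+,s4+\t*",
]

stats = summarize_gfa1(example)
print(stats["segments"], stats["total_bp"], stats["links"], sorted(stats["paths"]))
```

For anything beyond a quick sanity check (GFA2, rGFA tags, editing, large graphs), the dedicated libraries listed above are the better choice.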

Other lists of pangenome tools

Contributions

Is something missing? Contributions are welcome: please open a PR against main or file an issue with a link.
