Workflow and scripts for processing eDNA metabarcoding data from Marine Protected Areas in Nova Scotia, Canada.
# 1. Study sites
Data collected since 2019 are included from the Eastern Shore Islands Area of Interest (AOI), St. Anns Bank MPA, the Fundian Channel-Browns Bank AOI, the Musquash Estuary MPA, and the Gully MPA.
The following repositories include other analyses conducted for various Maritimes conservation areas:
# 2. Bioinformatics
We use a combination of the R package dada2 and the QIIME 2 pipeline to trim raw reads, denoise them, build an amplicon sequence variant (ASV) table, and assign taxonomy to our sequences.
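As a rough illustration of the trimming and denoising steps, the sketch below uses the QIIME 2 `cutadapt` and `dada2` plugins. It is not taken from this repository's scripts: the artifact file names, the primer strings, and the truncation lengths are all placeholders, and the `run` helper is a hypothetical wrapper that only prints each command unless `DRY_RUN=0` is set. Option names can differ between QIIME 2 releases, so check `--help` for your installed version.

```shell
#!/bin/sh
# Sketch only: file names, primers, and truncation lengths are placeholders.
set -eu

# Hypothetical helper: print each command by default (dry run);
# set DRY_RUN=0 in the environment to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "+ $*"
    else
        "$@"
    fi
}

FWD_PRIMER="FORWARD_PRIMER_SEQUENCE"   # placeholder: replace with your forward primer
REV_PRIMER="REVERSE_PRIMER_SEQUENCE"   # placeholder: replace with your reverse primer
TRUNC_F=120                            # placeholder: choose from the read quality plots
TRUNC_R=120                            # placeholder: choose from the read quality plots

# Remove primers from the demultiplexed paired-end reads.
run qiime cutadapt trim-paired \
    --i-demultiplexed-sequences demux.qza \
    --p-front-f "$FWD_PRIMER" \
    --p-front-r "$REV_PRIMER" \
    --o-trimmed-sequences trimmed.qza

# Denoise with DADA2 to obtain the ASV table and representative sequences.
run qiime dada2 denoise-paired \
    --i-demultiplexed-seqs trimmed.qza \
    --p-trunc-len-f "$TRUNC_F" \
    --p-trunc-len-r "$TRUNC_R" \
    --o-table table.qza \
    --o-representative-sequences rep-seqs.qza \
    --o-denoising-stats denoising-stats.qza
```

Running the script as written only prints the two commands, each prefixed with `+ `, which makes it easy to review parameters before committing to a long denoising run.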
# 3. Analyses in R
Analyses of ASV tables, typically with taxonomy associated, are conducted in R using combinations of the ape, vegan, and ggplot2 packages.
- Import the data and, if they are demultiplexed, summarize them. Check read quality with FastQC.
- Use cutadapt either on its own or in QIIME to remove primers and/or adapters.
- Then use dada2 to denoise the paired sequences, making sure to trim/truncate them to an appropriate length.
- Build a phylogenetic tree: align the sequences with MAFFT, then construct an unrooted tree from the alignment.
- Conduct diversity analyses (alpha and beta diversity, PCoA, etc.).
- Assign taxonomy to our sequence features using a classifier, BLAST, or FuzzyID2.
- The RESCRIPt plugin for QIIME was used to create a reference database of 12S and 16S fish sequences. The downloaded sequences can be filtered and evaluated before they are used to assign taxonomic classifications to our metabarcodes.
- In QIIME, we run the feature-classifier plugin with our reference classifier object and our representative sequences to generate a TSV table of classifications.
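
The tree, diversity, and taxonomy steps in the list above could be sketched with the QIIME 2 command line roughly as follows. This is an assumption about how the steps map onto QIIME 2 plugins, not a copy of this repository's scripts: the artifact and metadata file names and the sampling depth are placeholders, and `run` is a hypothetical dry-run wrapper that prints commands unless `DRY_RUN=0` is set. Option names can vary between QIIME 2 releases.

```shell
#!/bin/sh
# Sketch only: file names and the sampling depth are placeholders.
set -eu

# Hypothetical helper: print each command by default (dry run);
# set DRY_RUN=0 in the environment to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "+ $*"
    else
        "$@"
    fi
}

SAMPLING_DEPTH=1000   # placeholder rarefaction depth; pick one from your table summary

# Align representative sequences with MAFFT and build trees with FastTree.
# This pipeline writes an unrooted tree and a midpoint-rooted tree;
# the phylogenetic diversity metrics below need the rooted one.
run qiime phylogeny align-to-tree-mafft-fasttree \
    --i-sequences rep-seqs.qza \
    --o-alignment aligned-rep-seqs.qza \
    --o-masked-alignment masked-aligned-rep-seqs.qza \
    --o-tree unrooted-tree.qza \
    --o-rooted-tree rooted-tree.qza

# Compute alpha and beta diversity metrics, including PCoA results.
run qiime diversity core-metrics-phylogenetic \
    --i-phylogeny rooted-tree.qza \
    --i-table table.qza \
    --p-sampling-depth "$SAMPLING_DEPTH" \
    --m-metadata-file sample-metadata.tsv \
    --output-dir core-metrics-results

# Assign taxonomy to the representative sequences with a trained classifier.
run qiime feature-classifier classify-sklearn \
    --i-classifier classifier.qza \
    --i-reads rep-seqs.qza \
    --o-classification taxonomy.qza

# Export the classifications; this writes a taxonomy.tsv file.
run qiime tools export \
    --input-path taxonomy.qza \
    --output-path exported-taxonomy
```

The sketch follows a typical QIIME 2 ordering, where the tree is built before the phylogenetic diversity metrics are computed. Taxonomy assignment does not depend on either step, so it can be run earlier if the classifications are needed first.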