Starting Trycycler clustering (2020-07-06 14:06:14)
Trycycler cluster is a tool for clustering the contigs from multiple different assemblies (e.g. from different assemblers) into highly-similar groups.

Input assemblies:
  A: assemblies/assembly_00.fasta (1,051,794 bp, 2 contigs)
  B: assemblies/assembly_01.fasta (1,051,803 bp, 2 contigs)
  C: assemblies/assembly_02.fasta (1,051,824 bp, 2 contigs)
  D: assemblies/assembly_03.fasta (1,051,822 bp, 2 contigs)
  E: assemblies/assembly_04.fasta (1,051,828 bp, 2 contigs)
  F: assemblies/assembly_05.fasta (1,051,924 bp, 2 contigs)
  G: assemblies/assembly_06.fasta (1,051,927 bp, 2 contigs)
  H: assemblies/assembly_07.fasta (1,051,944 bp, 2 contigs)
  I: assemblies/assembly_08.fasta (1,051,915 bp, 2 contigs)
  J: assemblies/assembly_09.fasta (1,051,927 bp, 2 contigs)

Creating output directory: trycycler

Input reads: reads.fastq.gz
  43,662 reads (210,403,674 bp)
  N50 = 7,768 bp

Checking required software:
  Mash: v2.1.1
  R: v3.6.2
    ape: v5.3
    phangorn: v2.5.5

Getting contig depths (2020-07-06 14:06:17)
Trycycler now aligns the reads to each of the assemblies to assign a read depth value to each of the contigs. Contigs displayed in red have a low read depth and will be filtered out.
A (assemblies/assembly_00.fasta):
  42,334 alignments, mean depth = 201.7x
  A_contig_1: 1,044,289 bp, 194.5x
  A_contig_2: 7,505 bp, 1206.7x

B (assemblies/assembly_01.fasta):
  42,331 alignments, mean depth = 201.7x
  B_contig_1: 1,044,294 bp, 194.5x
  B_contig_2: 7,509 bp, 1202.3x

C (assemblies/assembly_02.fasta):
  42,331 alignments, mean depth = 201.7x
  C_contig_1: 1,044,312 bp, 194.5x
  C_contig_2: 7,512 bp, 1198.0x

D (assemblies/assembly_03.fasta):
  42,331 alignments, mean depth = 201.7x
  D_contig_1: 1,044,321 bp, 194.5x
  D_contig_2: 7,501 bp, 1198.8x

E (assemblies/assembly_04.fasta):
  42,334 alignments, mean depth = 201.7x
  E_contig_1: 1,044,320 bp, 194.5x
  E_contig_2: 7,508 bp, 1201.6x

F (assemblies/assembly_05.fasta):
  42,329 alignments, mean depth = 201.8x
  F_utg000001c: 1,044,419 bp, 194.6x
  F_utg000002c: 7,505 bp, 1202.9x

G (assemblies/assembly_06.fasta):
  42,332 alignments, mean depth = 201.8x
  G_utg000001c: 1,044,418 bp, 194.6x
  G_utg000002c: 7,509 bp, 1207.3x

H (assemblies/assembly_07.fasta):
  42,331 alignments, mean depth = 201.7x
  H_utg000001c: 1,044,435 bp, 194.6x
  H_utg000002c: 7,509 bp, 1197.0x

I (assemblies/assembly_08.fasta):
  42,331 alignments, mean depth = 201.7x
  I_utg000001c: 1,044,409 bp, 194.5x
  I_utg000002c: 7,506 bp, 1198.7x

J (assemblies/assembly_09.fasta):
  42,330 alignments, mean depth = 201.7x
  J_utg000001c: 1,044,416 bp, 194.5x
  J_utg000002c: 7,511 bp, 1201.5x

Filtering contigs (2020-07-06 14:08:49)
Contigs are now filtered out if they are too short (<1,000 bp) or too low depth (<0.1 times the mean depth).
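The filtering rule stated above can be sketched in a few lines of Python. This is only an illustration, not Trycycler's own code: the function names `weighted_mean_depth` and `filter_contigs` are hypothetical, and treating the per-assembly mean depth as a length-weighted average of the contig depths is an assumption (it happens to reproduce the 201.7x logged for assembly A).

```python
# Illustrative sketch only; names and the length-weighted mean are assumptions,
# not taken from Trycycler's source.

def weighted_mean_depth(contigs):
    """Length-weighted mean of per-contig depths (assumed definition)."""
    total_bases = sum(length for _, length, _ in contigs)
    return sum(length * depth for _, length, depth in contigs) / total_bases


def filter_contigs(contigs, min_length=1000, min_depth_ratio=0.1):
    """Keep contigs that are neither too short nor too low in depth.

    Thresholds mirror the log: <1,000 bp or <0.1 times the mean depth is dropped.
    """
    mean_depth = weighted_mean_depth(contigs)
    return [
        name
        for name, length, depth in contigs
        if length >= min_length and depth >= min_depth_ratio * mean_depth
    ]


# Values copied from assembly A in the log: (name, length in bp, depth).
assembly_a = [
    ("A_contig_1", 1_044_289, 194.5),
    ("A_contig_2", 7_505, 1206.7),
]

mean_a = weighted_mean_depth(assembly_a)
kept_a = filter_contigs(assembly_a)
print(f"mean depth = {mean_a:.1f}x, kept = {kept_a}")
```

With these inputs both contigs are well above each threshold, which agrees with the "all contigs passed filtering" result reported for assembly A.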
A (assemblies/assembly_00.fasta): all contigs passed filtering
B (assemblies/assembly_01.fasta): all contigs passed filtering
C (assemblies/assembly_02.fasta): all contigs passed filtering
D (assemblies/assembly_03.fasta): all contigs passed filtering
E (assemblies/assembly_04.fasta): all contigs passed filtering
F (assemblies/assembly_05.fasta): all contigs passed filtering
G (assemblies/assembly_06.fasta): all contigs passed filtering
H (assemblies/assembly_07.fasta): all contigs passed filtering
I (assemblies/assembly_08.fasta): all contigs passed filtering
J (assemblies/assembly_09.fasta): all contigs passed filtering

Building distance matrix (2020-07-06 14:08:49)
Mash is used to build a distance matrix of all contigs in the assemblies.

A_contig_1:   0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250
A_contig_2:   0.250 0.000 0.250 0.001 0.250 0.001 0.250 0.001 0.250 0.001 0.250 0.001 0.250 0.001 0.250 0.001 0.250 0.001 0.250 0.001
B_contig_1:   0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250
B_contig_2:   0.250 0.001 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.001 0.250 0.000 0.250 0.000 0.250 0.001 0.250 0.000 0.250 0.000
C_contig_1:   0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250
C_contig_2:   0.250 0.001 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.001 0.250 0.001 0.250 0.000
D_contig_1:   0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250
D_contig_2:   0.250 0.001 0.250 0.000 0.250 0.001 0.250 0.000 0.250 0.001 0.250 0.001 0.250 0.001 0.250 0.001 0.250 0.001 0.250 0.001
E_contig_1:   0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250
E_contig_2:   0.250 0.001 0.250 0.001 0.250 0.001 0.250 0.001 0.250 0.000 0.250 0.001 0.250 0.001 0.250 0.001 0.250 0.001 0.250 0.001
F_utg000001c: 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250
F_utg000002c: 0.250 0.001 0.250 0.000 0.250 0.000 0.250 0.001 0.250 0.001 0.250 0.000 0.250 0.001 0.250 0.001 0.250 0.000 0.250 0.001
G_utg000001c: 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250
G_utg000002c: 0.250 0.001 0.250 0.000 0.250 0.000 0.250 0.001 0.250 0.001 0.250 0.001 0.250 0.000 0.250 0.001 0.250 0.000 0.250 0.001
H_utg000001c: 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250
H_utg000002c: 0.250 0.001 0.250 0.001 0.250 0.001 0.250 0.001 0.250 0.001 0.250 0.001 0.250 0.001 0.250 0.000 0.250 0.001 0.250 0.001
I_utg000001c: 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250
I_utg000002c: 0.250 0.001 0.250 0.000 0.250 0.001 0.250 0.001 0.250 0.001 0.250 0.001 0.250 0.001 0.250 0.001 0.250 0.000 0.250 0.000
J_utg000001c: 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250 0.000 0.250
J_utg000002c: 0.250 0.001 0.250 0.000 0.250 0.000 0.250 0.001 0.250 0.001 0.250 0.001 0.250 0.001 0.250 0.001 0.250 0.000 0.250 0.000

Clustering (2020-07-06 14:08:55)
The contigs are now split into clusters using a complete-linkage hierarchical approach.
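Complete-linkage clustering, as named above, can be illustrated with a small pure-Python sketch. It is not Trycycler's implementation: the function `complete_linkage` is hypothetical, the `threshold=0.01` cutoff is an assumed value chosen only for the demonstration, and the 4x4 matrix is a subset of the distances printed in the log (contigs 1 and 2 of assemblies A and B).

```python
# Illustrative sketch of complete-linkage agglomerative clustering.
# Function name and threshold are assumptions for demonstration purposes.

def complete_linkage(names, dist, threshold):
    """Merge the two closest clusters until the closest pair exceeds threshold."""
    clusters = [[i] for i in range(len(names))]

    def cluster_distance(a, b):
        # Complete linkage: the distance between two clusters is the
        # largest pairwise distance between their members.
        return max(dist[i][j] for i in a for j in b)

    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = cluster_distance(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        if d > threshold:
            break  # remaining clusters are too far apart to merge
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]

    return [sorted(names[i] for i in c) for c in clusters]


# Distances copied from the log for four of the twenty contigs.
names = ["A_contig_1", "A_contig_2", "B_contig_1", "B_contig_2"]
dist = [
    [0.000, 0.250, 0.000, 0.250],
    [0.250, 0.000, 0.250, 0.001],
    [0.000, 0.250, 0.000, 0.250],
    [0.250, 0.001, 0.250, 0.000],
]

clusters = complete_linkage(names, dist, threshold=0.01)
print(clusters)
```

On this subset the chromosome-sized contigs group together and the small high-depth contigs group together, matching the two clusters listed in the log.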
trycycler/cluster_001/1_contigs:
  trycycler/cluster_001/1_contigs/A_contig_1.fasta: 1,044,289 bp, 194.5x
  trycycler/cluster_001/1_contigs/B_contig_1.fasta: 1,044,294 bp, 194.5x
  trycycler/cluster_001/1_contigs/C_contig_1.fasta: 1,044,312 bp, 194.5x
  trycycler/cluster_001/1_contigs/D_contig_1.fasta: 1,044,321 bp, 194.5x
  trycycler/cluster_001/1_contigs/E_contig_1.fasta: 1,044,320 bp, 194.5x
  trycycler/cluster_001/1_contigs/F_utg000001c.fasta: 1,044,419 bp, 194.6x
  trycycler/cluster_001/1_contigs/G_utg000001c.fasta: 1,044,418 bp, 194.6x
  trycycler/cluster_001/1_contigs/H_utg000001c.fasta: 1,044,435 bp, 194.6x
  trycycler/cluster_001/1_contigs/I_utg000001c.fasta: 1,044,409 bp, 194.5x
  trycycler/cluster_001/1_contigs/J_utg000001c.fasta: 1,044,416 bp, 194.5x

trycycler/cluster_002/1_contigs:
  trycycler/cluster_002/1_contigs/A_contig_2.fasta: 7,505 bp, 1206.7x
  trycycler/cluster_002/1_contigs/B_contig_2.fasta: 7,509 bp, 1202.3x
  trycycler/cluster_002/1_contigs/C_contig_2.fasta: 7,512 bp, 1198.0x
  trycycler/cluster_002/1_contigs/D_contig_2.fasta: 7,501 bp, 1198.8x
  trycycler/cluster_002/1_contigs/E_contig_2.fasta: 7,508 bp, 1201.6x
  trycycler/cluster_002/1_contigs/F_utg000002c.fasta: 7,505 bp, 1202.9x
  trycycler/cluster_002/1_contigs/G_utg000002c.fasta: 7,509 bp, 1207.3x
  trycycler/cluster_002/1_contigs/H_utg000002c.fasta: 7,509 bp, 1197.0x
  trycycler/cluster_002/1_contigs/I_utg000002c.fasta: 7,506 bp, 1198.7x
  trycycler/cluster_002/1_contigs/J_utg000002c.fasta: 7,511 bp, 1201.5x

Building FastME tree (2020-07-06 14:08:55)
R (ape and phangorn) are used to build a FastME tree of the relationships between the contigs.

  saving distance matrix: trycycler/contigs.phylip
  saving tree: trycycler/contigs.newick

Finished! (2020-07-06 14:08:56)
Now you must decide which clusters are good (i.e. contain well-assembled contigs for replicons in the genome) and which are bad (i.e. contain incomplete or spurious contigs). You can then delete the directories corresponding to the bad clusters and proceed to the next step in the pipeline: trycycler reconcile.