Starting Trycycler reconcile (2020-07-06 14:11:40)
    Trycycler reconcile is a tool for reconciling multiple alternative contigs with each other.

Input reads:
  reads.fastq.gz
  size = 216,246,191 bytes

Input contigs:
  trycycler/cluster_002/1_contigs/A_contig_2.fasta (7,505 bp)
  trycycler/cluster_002/1_contigs/B_contig_2.fasta (7,509 bp)
  trycycler/cluster_002/1_contigs/C_contig_2.fasta (7,512 bp)
  trycycler/cluster_002/1_contigs/D_contig_2.fasta (7,501 bp)
  trycycler/cluster_002/1_contigs/E_contig_2.fasta (7,508 bp)
  trycycler/cluster_002/1_contigs/F_utg000002c.fasta (7,505 bp)
  trycycler/cluster_002/1_contigs/G_utg000002c.fasta (7,509 bp)
  trycycler/cluster_002/1_contigs/H_utg000002c.fasta (7,509 bp)
  trycycler/cluster_002/1_contigs/I_utg000002c.fasta (7,506 bp)
  trycycler/cluster_002/1_contigs/J_utg000002c.fasta (7,511 bp)

Checking required software:
  minimap2: v2.17-r954-dirty


Initial check of contigs (2020-07-06 14:11:40)
    Before proceeding, Trycycler ensures that the input contigs appear sufficiently close to each other to make a consensus. If not, the program will quit and the user must fix the input contigs (make them more similar to each other) or exclude some before trying again.
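
As a rough illustration of the length part of this check: the relative-length values reported for this run look consistent with a plain ratio of contig lengths (row length divided by column length), rounded to three decimals. The sketch below assumes that reading. The function name relative_lengths is hypothetical and this is not Trycycler's own code; the lengths are the ones listed under "Input contigs".

```python
# Contig lengths (bp) as listed under "Input contigs" in this log.
lengths = {
    "A_contig_2": 7505,
    "B_contig_2": 7509,
    "C_contig_2": 7512,
    "D_contig_2": 7501,
    "E_contig_2": 7508,
    "F_utg000002c": 7505,
    "G_utg000002c": 7509,
    "H_utg000002c": 7509,
    "I_utg000002c": 7506,
    "J_utg000002c": 7511,
}


def relative_lengths(lengths):
    """Return {row: {col: len(row) / len(col)}} for every pair of contigs.

    Assumption: the log's matrix is row length divided by column length.
    """
    return {
        a: {b: len_a / len_b for b, len_b in lengths.items()}
        for a, len_a in lengths.items()
    }


if __name__ == "__main__":
    matrix = relative_lengths(lengths)
    for name, row in matrix.items():
        # Three decimals, matching the precision shown in the log.
        print(f"{name + ':':<13} " + " ".join(f"{r:.3f}" for r in row.values()))
```

Ratios this close to 1.000 mean the contigs differ in length by only a few bases, which is why the length check passes here.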

Relative sequence lengths:
  A_contig_2:    1.000  0.999  0.999  1.001  1.000  1.000  0.999  0.999  1.000  0.999
  B_contig_2:    1.001  1.000  1.000  1.001  1.000  1.001  1.000  1.000  1.000  1.000
  C_contig_2:    1.001  1.000  1.000  1.001  1.001  1.001  1.000  1.000  1.001  1.000
  D_contig_2:    0.999  0.999  0.999  1.000  0.999  0.999  0.999  0.999  0.999  0.999
  E_contig_2:    1.000  1.000  0.999  1.001  1.000  1.000  1.000  1.000  1.000  1.000
  F_utg000002c:  1.000  0.999  0.999  1.001  1.000  1.000  0.999  0.999  1.000  0.999
  G_utg000002c:  1.001  1.000  1.000  1.001  1.000  1.001  1.000  1.000  1.000  1.000
  H_utg000002c:  1.001  1.000  1.000  1.001  1.000  1.001  1.000  1.000  1.000  1.000
  I_utg000002c:  1.000  1.000  0.999  1.001  1.000  1.000  1.000  1.000  1.000  0.999
  J_utg000002c:  1.001  1.000  1.000  1.001  1.000  1.001  1.000  1.000  1.001  1.000

Mash distances:
  A_contig_2:    0.000  0.001  0.001  0.001  0.001  0.001  0.001  0.001  0.001  0.001
  B_contig_2:    0.001  0.000  0.000  0.000  0.001  0.000  0.000  0.001  0.000  0.000
  C_contig_2:    0.001  0.000  0.000  0.000  0.000  0.000  0.000  0.001  0.001  0.000
  D_contig_2:    0.001  0.000  0.001  0.000  0.001  0.001  0.001  0.001  0.001  0.001
  E_contig_2:    0.001  0.001  0.001  0.001  0.000  0.001  0.001  0.001  0.001  0.001
  F_utg000002c:  0.001  0.000  0.000  0.001  0.001  0.000  0.001  0.001  0.000  0.001
  G_utg000002c:  0.001  0.000  0.000  0.001  0.001  0.001  0.000  0.001  0.000  0.001
  H_utg000002c:  0.001  0.001  0.001  0.001  0.001  0.001  0.001  0.000  0.001  0.001
  I_utg000002c:  0.001  0.000  0.001  0.001  0.001  0.001  0.001  0.001  0.000  0.000
  J_utg000002c:  0.001  0.000  0.000  0.001  0.001  0.001  0.001  0.001  0.000  0.000

Contigs have passed the initial check - they seem sufficiently close to reconcile.


Normalising strands (2020-07-06 14:11:41)
    In this step, Trycycler ensures that all sequences are on the same strand. It does this by first finding a sequence that occurs once in each contig and then flipping any of the contigs (converting to their reverse complement sequence) which have this sequence on the negative strand.
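
The strand-normalising idea described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Trycycler's implementation: sequences are plain uppercase or lowercase ACGT strings, the search is an exact linear match (it ignores a common sequence that spans the contig's end-to-start junction), and the names reverse_complement and normalise_strand are hypothetical.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def normalise_strand(contig, common):
    """Return (sequence, strand) with `common` on the positive strand.

    Raises ValueError unless `common` occurs exactly once in the contig,
    counting both strands together.
    """
    forward_hits = contig.count(common)
    reverse_hits = contig.count(reverse_complement(common))
    if forward_hits + reverse_hits != 1:
        raise ValueError("common sequence must occur exactly once in the contig")
    if forward_hits == 1:
        return contig, "+"          # using original sequence
    return reverse_complement(contig), "-"  # using reverse complement


if __name__ == "__main__":
    contig = "TTGACCATGCA"
    flipped = reverse_complement(contig)
    print(normalise_strand(contig, "GACCA"))
    print(normalise_strand(flipped, "GACCA"))
```

Both calls in the usage example return the same sequence; only the reported strand differs, which mirrors the "+ strand" and "- strand" lines in the log.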

Randomly-chosen common sequence:
  TTATCCTCAGAAGTTTATGCACTTTCTACAAGAGTACATCGGTCAACGAA
  GAGGTTTTGTCTTCGTAACTCGCTCCGGAAAAATGGTGGGGTTAAGGCAA
  ATCGCCCGCACGTTCTCTCAAGCAGGACTACAAGCTGCAATCCCTTTTAA
  GATAACCCCGCACGTGCTTCGAGCAACCGCTGTGACGGAGTACAAACGCC
  TAGGGTGCTCAGACTCCGACATAATGAAGGTCACGGGACACGCAACCGCA

  A_contig_2:    + strand (using original sequence)
  B_contig_2:    + strand (using original sequence)
  C_contig_2:    - strand (using reverse complement)
  D_contig_2:    + strand (using original sequence)
  E_contig_2:    + strand (using original sequence)
  F_utg000002c:  + strand (using original sequence)
  G_utg000002c:  + strand (using original sequence)
  H_utg000002c:  + strand (using original sequence)
  I_utg000002c:  - strand (using reverse complement)
  J_utg000002c:  + strand (using original sequence)


Circularisation (2020-07-06 14:11:41)
    Trycycler now compares the contigs to each other to repair any circularisation issues. After this step, each sequence should be cleanly circularised - i.e. the first base in the contig immediately follows the last base. Each contig will be circularised by looking for the position of its start and end in the other contigs. If necessary, additional sequence will be added or duplicated sequence will be removed. If there are multiple possible ways to fix a contig's circularisation, then Trycycler will use read alignments to choose the best one.
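
To make the "position of its start and end in the other contigs" idea concrete, here is a deliberately simplified sketch. It is an assumption-heavy toy, not Trycycler's method: it uses exact string matching of short anchors instead of alignments, it does not use read alignments to choose between candidate fixes, and the function name circularise_with_reference and the anchor length k are hypothetical.

```python
def circularise_with_reference(contig, ref, k=4):
    """Repair a contig's circularisation using one reference contig.

    Returns the repaired contig, or None when the contig's start or end
    anchor cannot be found in `ref` (the "unable to circularise" case).
    `ref` is treated as circular; `k` is an illustrative anchor length.
    """
    n = len(ref)
    doubled = ref + ref                    # lets an anchor span ref's own junction
    start_pos = doubled.find(contig[:k])   # where the contig's start sits in ref
    end_pos = doubled.find(contig[-k:])    # where the contig's end sits in ref
    if start_pos == -1 or end_pos == -1:
        return None

    # Number of ref bases lying between the contig's last and first base.
    gap = (start_pos - (end_pos + k)) % n
    if gap == 0:
        return contig                      # already circular, no adjustment
    if gap <= n // 2:
        # Contig is short: copy the missing bases from the reference.
        missing = "".join(ref[(end_pos + k + i) % n] for i in range(gap))
        return contig + missing
    # Otherwise the contig's end duplicates its start: trim the overlap.
    overlap = n - gap
    return contig[:-overlap]


if __name__ == "__main__":
    ref = "ACGTTGCAAGGCTTACCGAT"
    print(circularise_with_reference(ref[:-3], ref))      # adds 3 bp
    print(circularise_with_reference(ref + ref[:2], ref)) # trims 2 bp
```

The three outcomes of the sketch (add sequence, trim sequence, give up) correspond to the three kinds of "using ..." messages in the circularisation report, although the real tool decides them differently.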

Circularising A_contig_2:
  using B_contig_2:   circularising A_contig_2 by trimming 1 bp of sequence from the end
  using C_contig_2:   circularising A_contig_2 by trimming 1 bp of sequence from the end
  using D_contig_2:   circularising A_contig_2 by trimming 1 bp of sequence from the end
  using E_contig_2:   circularising A_contig_2 by trimming 1 bp of sequence from the end
  using F_utg000002c: circularising A_contig_2 by trimming 1 bp of sequence from the end
  using G_utg000002c: circularising A_contig_2 by trimming 1 bp of sequence from the end
  using H_utg000002c: circularising A_contig_2 by trimming 1 bp of sequence from the end
  using I_utg000002c: circularising A_contig_2 by trimming 1 bp of sequence from the end
  using J_utg000002c: circularising A_contig_2 by trimming 1 bp of sequence from the end
  circularisation complete (7,504 bp)

Circularising B_contig_2:
  using A_contig_2:   no adjustment needed (B_contig_2 is already circular)
  using C_contig_2:   no adjustment needed (B_contig_2 is already circular)
  using D_contig_2:   no adjustment needed (B_contig_2 is already circular)
  using E_contig_2:   no adjustment needed (B_contig_2 is already circular)
  using F_utg000002c: unable to circularise: B_contig_2's end could not be found in F_utg000002c
  using G_utg000002c: no adjustment needed (B_contig_2 is already circular)
  using H_utg000002c: no adjustment needed (B_contig_2 is already circular)
  using I_utg000002c: no adjustment needed (B_contig_2 is already circular)
  using J_utg000002c: unable to circularise: B_contig_2's start could not be found in J_utg000002c
  circularisation complete (7,509 bp)

Circularising C_contig_2:
  using A_contig_2:   no adjustment needed (C_contig_2 is already circular)
  using B_contig_2:   no adjustment needed (C_contig_2 is already circular)
  using D_contig_2:   unable to circularise: C_contig_2's start could not be found in D_contig_2
  using E_contig_2:   no adjustment needed (C_contig_2 is already circular)
  using F_utg000002c: no adjustment needed (C_contig_2 is already circular)
  using G_utg000002c: no adjustment needed (C_contig_2 is already circular)
  using H_utg000002c: no adjustment needed (C_contig_2 is already circular)
  using I_utg000002c: unable to circularise: C_contig_2's start could not be found in I_utg000002c
  using J_utg000002c: no adjustment needed (C_contig_2 is already circular)
  circularisation complete (7,512 bp)

Circularising D_contig_2:
  using A_contig_2:   circularising D_contig_2 by adding 7 bp of sequence from A_contig_2 (3851-3858)
  using B_contig_2:   circularising D_contig_2 by adding 7 bp of sequence from B_contig_2 (3004-3011)
  using C_contig_2:   unable to circularise: D_contig_2's end could not be found in C_contig_2
  using E_contig_2:   circularising D_contig_2 by adding 7 bp of sequence from E_contig_2 (6663-6670)
  using F_utg000002c: circularising D_contig_2 by adding 7 bp of sequence from F_utg000002c (3077-3084)
  using G_utg000002c: circularising D_contig_2 by adding 7 bp of sequence from G_utg000002c (5418-5425)
  using H_utg000002c: circularising D_contig_2 by adding 7 bp of sequence from H_utg000002c (1279-1286)
  using I_utg000002c: unable to circularise: D_contig_2's start could not be found in I_utg000002c
  using J_utg000002c: circularising D_contig_2 by adding 7 bp of sequence from J_utg000002c (2660-2667)
  circularisation complete (7,508 bp)

Circularising E_contig_2:
  using A_contig_2:   no adjustment needed (E_contig_2 is already circular)
  using B_contig_2:   no adjustment needed (E_contig_2 is already circular)
  using C_contig_2:   no adjustment needed (E_contig_2 is already circular)
  using D_contig_2:   no adjustment needed (E_contig_2 is already circular)
  using F_utg000002c: no adjustment needed (E_contig_2 is already circular)
  using G_utg000002c: no adjustment needed (E_contig_2 is already circular)
  using H_utg000002c: no adjustment needed (E_contig_2 is already circular)
  using I_utg000002c: no adjustment needed (E_contig_2 is already circular)
  using J_utg000002c: no adjustment needed (E_contig_2 is already circular)
  circularisation complete (7,508 bp)

Circularising F_utg000002c:
  using A_contig_2:   circularising F_utg000002c by adding 18 bp of sequence from A_contig_2 (758-776)
  using B_contig_2:   unable to circularise: F_utg000002c's start could not be found in B_contig_2
  using C_contig_2:   circularising F_utg000002c by adding 18 bp of sequence from C_contig_2 (4574-4592)
  using D_contig_2:   circularising F_utg000002c by adding 18 bp of sequence from D_contig_2 (4406-4424)
  using E_contig_2:   circularising F_utg000002c by adding 18 bp of sequence from E_contig_2 (3569-3587)
  using G_utg000002c: circularising F_utg000002c by adding 18 bp of sequence from G_utg000002c (2324-2342)
  using H_utg000002c: circularising F_utg000002c by adding 18 bp of sequence from H_utg000002c (5693-5711)
  using I_utg000002c: circularising F_utg000002c by adding 18 bp of sequence from I_utg000002c (4315-4333)
  using J_utg000002c: unable to circularise: F_utg000002c's start could not be found in J_utg000002c
  circularisation complete (7,523 bp)

Circularising G_utg000002c:
  using A_contig_2:   no adjustment needed (G_utg000002c is already circular)
  using B_contig_2:   no adjustment needed (G_utg000002c is already circular)
  using C_contig_2:   no adjustment needed (G_utg000002c is already circular)
  using D_contig_2:   no adjustment needed (G_utg000002c is already circular)
  using E_contig_2:   no adjustment needed (G_utg000002c is already circular)
  using F_utg000002c: no adjustment needed (G_utg000002c is already circular)
  using H_utg000002c: no adjustment needed (G_utg000002c is already circular)
  using I_utg000002c: no adjustment needed (G_utg000002c is already circular)
  using J_utg000002c: no adjustment needed (G_utg000002c is already circular)
  circularisation complete (7,509 bp)

Circularising H_utg000002c:
  using A_contig_2:   no adjustment needed (H_utg000002c is already circular)
  using B_contig_2:   no adjustment needed (H_utg000002c is already circular)
  using C_contig_2:   no adjustment needed (H_utg000002c is already circular)
  using D_contig_2:   no adjustment needed (H_utg000002c is already circular)
  using E_contig_2:   no adjustment needed (H_utg000002c is already circular)
  using F_utg000002c: no adjustment needed (H_utg000002c is already circular)
  using G_utg000002c: no adjustment needed (H_utg000002c is already circular)
  using I_utg000002c: no adjustment needed (H_utg000002c is already circular)
  using J_utg000002c: no adjustment needed (H_utg000002c is already circular)
  circularisation complete (7,509 bp)

Circularising I_utg000002c:
  using A_contig_2:   no adjustment needed (I_utg000002c is already circular)
  using B_contig_2:   no adjustment needed (I_utg000002c is already circular)
  using C_contig_2:   unable to circularise: I_utg000002c's end could not be found in C_contig_2
  using D_contig_2:   unable to circularise: I_utg000002c's end could not be found in D_contig_2
  using E_contig_2:   no adjustment needed (I_utg000002c is already circular)
  using F_utg000002c: no adjustment needed (I_utg000002c is already circular)
  using G_utg000002c: no adjustment needed (I_utg000002c is already circular)
  using H_utg000002c: no adjustment needed (I_utg000002c is already circular)
  using J_utg000002c: no adjustment needed (I_utg000002c is already circular)
  circularisation complete (7,506 bp)

Circularising J_utg000002c:
  using A_contig_2:   no adjustment needed (J_utg000002c is already circular)
  using B_contig_2:   unable to circularise: J_utg000002c's end could not be found in B_contig_2
  using C_contig_2:   no adjustment needed (J_utg000002c is already circular)
  using D_contig_2:   no adjustment needed (J_utg000002c is already circular)
  using E_contig_2:   no adjustment needed (J_utg000002c is already circular)
  using F_utg000002c: unable to circularise: J_utg000002c's end could not be found in F_utg000002c
  using G_utg000002c: no adjustment needed (J_utg000002c is already circular)
  using H_utg000002c: no adjustment needed (J_utg000002c is already circular)
  using I_utg000002c: no adjustment needed (J_utg000002c is already circular)
  circularisation complete (7,511 bp)


Finding starting sequence (2020-07-06 14:11:43)
    In this step, Trycycler finds a sequence to use as a starting point for each of the contigs. This can be a standard starting point (e.g. the dnaA gene) or if one is not found, then a randomly-chosen unique sequence will be used. If necessary, the sequences will be flipped (converted to their reverse complement sequence) to ensure that the starting sequence is on the positive strand.

Looking for known starting sequences in each contig...
Unable to find a suitable known starting sequence

Randomly-chosen common sequence:
  CAGAAGTTTATGCACTTTCTACAAGAGTACATCGGTCAACGAAGAGGTTT
  TGTCTTCGTAACTCGCTCCGGAAAAATGGTGGGGTTAAGGCAAATCGCCC
  GCACGTTCTCTCAAGCAGGACTACAAGCTGCAATCCCTTTTAAGATAACC
  CCGCACGTGCTTCGAGCAACCGCTGTGACGGAGTACAAACGCCTAGGGTG
  CTCAGACTCCGACATAATGAAGGTCACGGGACACGCAACCGCAAAGATGA

  A_contig_2:    + strand (using original sequence)
  B_contig_2:    + strand (using original sequence)
  C_contig_2:    + strand (using original sequence)
  D_contig_2:    + strand (using original sequence)
  E_contig_2:    + strand (using original sequence)
  F_utg000002c:  + strand (using original sequence)
  G_utg000002c:  + strand (using original sequence)
  H_utg000002c:  + strand (using original sequence)
  I_utg000002c:  + strand (using original sequence)
  J_utg000002c:  + strand (using original sequence)


Rotating contigs to starting sequence (2020-07-06 14:11:44)
    For a circular contig, any point in the sequence is a valid starting position and it can thus be 'rotated' by moving sequence from the contig start to the contig end. In this step, Trycycler rotates each contig such that it begins with the starting sequence, ensuring that all contigs begin and end together so they can be aligned to each other.
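
Rotation of a circular sequence is simple enough to show directly. The sketch below is illustrative only and assumes plain strings with an exact match for the starting sequence; the function name rotate_to_start is hypothetical. Searching a doubled copy of the contig lets the starting sequence be found even when it spans the contig's current end-to-start junction.

```python
def rotate_to_start(contig, start_seq):
    """Rotate a circular contig so that it begins with `start_seq`.

    Returns (rotated_sequence, rotation), where `rotation` is the number of
    bases moved from the contig start to the contig end.
    Raises ValueError if `start_seq` is not present in the circular contig.
    """
    if len(start_seq) > len(contig):
        raise ValueError("starting sequence is longer than the contig")
    doubled = contig + contig        # covers matches that wrap around the junction
    pos = doubled.find(start_seq)
    if pos == -1:
        raise ValueError("starting sequence not found in contig")
    return contig[pos:] + contig[:pos], pos


if __name__ == "__main__":
    # Same circle written from two different starting points.
    for contig in ("GGATTCAGAA", "AAGGATTCAG"):
        rotated, shift = rotate_to_start(contig, "CAGAA")
        print(f"rotating by {shift} bp -> {rotated}")
```

Both inputs in the usage example end up as the same string, which is the property the log relies on: after rotation every contig begins and ends at the same place, so they can be aligned end to end.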

  A_contig_2:   rotating by 1,831 bp  CAGAAGTTTATGCACTTTCT...GGATTATTATAACTTATCCT (7,504 bp)
  B_contig_2:   rotating by 982 bp    CAGAAGTTTATGCACTTTCT...GGATTATTATAACTTATCCT (7,509 bp)
  C_contig_2:   rotating by 5,647 bp  CAGAAGTTTATGCACTTTCT...GGATTATTATAACTTATCCT (7,512 bp)
  D_contig_2:   rotating by 5,479 bp  CAGAAGTTTATGCACTTTCT...GGATTATTATAACTTATCCT (7,508 bp)
  E_contig_2:   rotating by 4,642 bp  CAGAAGTTTATGCACTTTCT...GGATTATTATAACTTATCCT (7,508 bp)
  F_utg000002c: rotating by 1,055 bp  CAGAAGTTTATGCACTTTCT...GGATTATTATAACTTATCCT (7,523 bp)
  G_utg000002c: rotating by 3,397 bp  CAGAAGTTTATGCACTTTCT...GGATTATTATAACTTATCCT (7,509 bp)
  H_utg000002c: rotating by 6,765 bp  CAGAAGTTTATGCACTTTCT...GGATTATTATAACTTATCCT (7,509 bp)
  I_utg000002c: rotating by 5,388 bp  CAGAAGTTTATGCACTTTCT...GGATTATTATAACTTATCCT (7,506 bp)
  J_utg000002c: rotating by 638 bp    CAGAAGTTTATGCACTTTCT...GGATTATTATAACTTATCCT (7,511 bp)


Pairwise global alignments (2020-07-06 14:11:44)
    Trycycler uses the edlib aligner to get global alignments between all pairs of sequences. This can help you to spot any problematic sequences that should be excluded before continuing. If you see any sequences with notably worse identities or max indels, you can remove them (delete the contig's FASTA) and run this command again.

  A_contig_2 vs B_contig_2... 99.91% identity, max indel = 1
  A_contig_2 vs C_contig_2... 99.89% identity, max indel = 2
  A_contig_2 vs D_contig_2... 99.92% identity, max indel = 1
  A_contig_2 vs E_contig_2... 99.92% identity, max indel = 1
  A_contig_2 vs F_utg000002c... 99.75% identity, max indel = 5
  A_contig_2 vs G_utg000002c... 99.88% identity, max indel = 1
  A_contig_2 vs H_utg000002c... 99.77% identity, max indel = 2
  A_contig_2 vs I_utg000002c... 99.87% identity, max indel = 3
  A_contig_2 vs J_utg000002c... 99.88% identity, max indel = 1
  B_contig_2 vs C_contig_2... 99.96% identity, max indel = 2
  B_contig_2 vs D_contig_2... 99.99% identity, max indel = 1
  B_contig_2 vs E_contig_2... 99.96% identity, max indel = 1
  B_contig_2 vs F_utg000002c... 99.81% identity, max indel = 5
  B_contig_2 vs G_utg000002c... 99.95% identity, max indel = 1
  B_contig_2 vs H_utg000002c... 99.83% identity, max indel = 1
  B_contig_2 vs I_utg000002c... 99.93% identity, max indel = 3
  B_contig_2 vs J_utg000002c... 99.95% identity, max indel = 1
  C_contig_2 vs D_contig_2... 99.95% identity, max indel = 2
  C_contig_2 vs E_contig_2... 99.95% identity, max indel = 2
  C_contig_2 vs F_utg000002c... 99.80% identity, max indel = 5
  C_contig_2 vs G_utg000002c... 99.93% identity, max indel = 2
  C_contig_2 vs H_utg000002c... 99.81% identity, max indel = 2
  C_contig_2 vs I_utg000002c... 99.92% identity, max indel = 3
  C_contig_2 vs J_utg000002c... 99.93% identity, max indel = 2
  D_contig_2 vs E_contig_2... 99.95% identity, max indel = 1
  D_contig_2 vs F_utg000002c... 99.80% identity, max indel = 5
  D_contig_2 vs G_utg000002c... 99.93% identity, max indel = 1
  D_contig_2 vs H_utg000002c... 99.81% identity, max indel = 1
  D_contig_2 vs I_utg000002c... 99.92% identity, max indel = 3
  D_contig_2 vs J_utg000002c... 99.93% identity, max indel = 1
  E_contig_2 vs F_utg000002c... 99.80% identity, max indel = 5
  E_contig_2 vs G_utg000002c... 99.93% identity, max indel = 1
  E_contig_2 vs H_utg000002c... 99.81% identity, max indel = 1
  E_contig_2 vs I_utg000002c... 99.92% identity, max indel = 3
  E_contig_2 vs J_utg000002c... 99.93% identity, max indel = 1
  F_utg000002c vs G_utg000002c... 99.79% identity, max indel = 5
  F_utg000002c vs H_utg000002c... 99.67% identity, max indel = 5
  F_utg000002c vs I_utg000002c... 99.77% identity, max indel = 5
  F_utg000002c vs J_utg000002c... 99.79% identity, max indel = 5
  G_utg000002c vs H_utg000002c... 99.80% identity, max indel = 1
  G_utg000002c vs I_utg000002c... 99.91% identity, max indel = 3
  G_utg000002c vs J_utg000002c... 99.92% identity, max indel = 1
  H_utg000002c vs I_utg000002c... 99.79% identity, max indel = 3
  H_utg000002c vs J_utg000002c... 99.80% identity, max indel = 1
  I_utg000002c vs J_utg000002c... 99.91% identity, max indel = 3

Pairwise identities:
  A_contig_2:    100.00%   99.91%   99.89%   99.92%   99.92%   99.75%   99.88%   99.77%   99.87%   99.88%
  B_contig_2:     99.91%  100.00%   99.96%   99.99%   99.96%   99.81%   99.95%   99.83%   99.93%   99.95%
  C_contig_2:     99.89%   99.96%  100.00%   99.95%   99.95%   99.80%   99.93%   99.81%   99.92%   99.93%
  D_contig_2:     99.92%   99.99%   99.95%  100.00%   99.95%   99.80%   99.93%   99.81%   99.92%   99.93%
  E_contig_2:     99.92%   99.96%   99.95%   99.95%  100.00%   99.80%   99.93%   99.81%   99.92%   99.93%
  F_utg000002c:   99.75%   99.81%   99.80%   99.80%   99.80%  100.00%   99.79%   99.67%   99.77%   99.79%
  G_utg000002c:   99.88%   99.95%   99.93%   99.93%   99.93%   99.79%  100.00%   99.80%   99.91%   99.92%
  H_utg000002c:   99.77%   99.83%   99.81%   99.81%   99.81%   99.67%   99.80%  100.00%   99.79%   99.80%
  I_utg000002c:   99.87%   99.93%   99.92%   99.92%   99.92%   99.77%   99.91%   99.79%  100.00%   99.91%
  J_utg000002c:   99.88%   99.95%   99.93%   99.93%   99.93%   99.79%   99.92%   99.80%   99.91%  100.00%

Maximum insertion/deletion sizes:
  A_contig_2:    0  1  2  1  1  5  1  2  3  1
  B_contig_2:    1  0  2  1  1  5  1  1  3  1
  C_contig_2:    2  2  0  2  2  5  2  2  3  2
  D_contig_2:    1  1  2  0  1  5  1  1  3  1
  E_contig_2:    1  1  2  1  0  5  1  1  3  1
  F_utg000002c:  5  5  5  5  5  0  5  5  5  5
  G_utg000002c:  1  1  2  1  1  5  0  1  3  1
  H_utg000002c:  2  1  2  1  1  5  1  0  3  1
  I_utg000002c:  3  3  3  3  3  5  3  3  0  3
  J_utg000002c:  1  1  2  1  1  5  1  1  3  0


Finished! (2020-07-06 14:11:44)
    All contig sequences are now reconciled and ready for the next step in the pipeline: trycycler msa.

Saving sequences to file: trycycler/cluster_002/2_all_seqs.fasta
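
The advice in the pairwise-alignment step (look for contigs with notably worse identities or max indels) can be applied to a saved copy of this log with a small script. The sketch below is one possible way to do that, not part of Trycycler: the regular expression is written against the "X vs Y... N% identity, max indel = M" wording seen in this run, and the names PAIR_RE, summarise and sample are hypothetical. The sample text is a three-line excerpt from the report above.

```python
import re
from collections import defaultdict

# Matches e.g. "A_contig_2 vs B_contig_2... 99.91% identity, max indel = 1"
PAIR_RE = re.compile(
    r"(\S+) vs (\S+)\.\.\. ([\d.]+)% identity, max indel = (\d+)"
)


def summarise(log_text):
    """Return {contig: (mean identity %, largest max indel)} from log text."""
    identities = defaultdict(list)
    worst_indel = defaultdict(int)
    for a, b, identity, indel in PAIR_RE.findall(log_text):
        # Each pairwise result counts towards both contigs in the pair.
        for name in (a, b):
            identities[name].append(float(identity))
            worst_indel[name] = max(worst_indel[name], int(indel))
    return {
        name: (sum(values) / len(values), worst_indel[name])
        for name, values in identities.items()
    }


# Three lines copied from the pairwise report in this log.
sample = (
    "A_contig_2 vs B_contig_2... 99.91% identity, max indel = 1 "
    "A_contig_2 vs F_utg000002c... 99.75% identity, max indel = 5 "
    "B_contig_2 vs F_utg000002c... 99.81% identity, max indel = 5"
)

if __name__ == "__main__":
    summary = summarise(sample)
    # List contigs from lowest to highest mean identity.
    for name, (mean_identity, indel) in sorted(summary.items(), key=lambda kv: kv[1][0]):
        print(f"{name}: mean identity {mean_identity:.2f}%, worst max indel {indel}")
```

Because the pattern does not depend on line breaks, the same function works on the one-result-per-line layout and on a log whose lines have been run together. In the full report above, F_utg000002c stands out with a max indel of 5 against every other contig, although its identities are all still above 99.6%.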