Error in topiary-seed-to-alignment #33

Open
lbleicher opened this issue Feb 23, 2023 · 25 comments

@lbleicher

I attempted to create an alignment from a seed of six sequences from four species (this is my input csv file):

species,name,aliases,sequence,accession
Homo sapiens,TTHY_HUMAN,hTTR,GPTGTGESKCPLMVKVLDAVRGSPAINVAVHVFRKAADDTWEPFASGKTSESGELHGLTTEEEFVEGIYKVEIDTKSYWKALGISPFHEHAEVVFTANDSGPRRYTIAALLSPYSYSTTAVVTNPKE,P02766
Saccoglossus kowalevskii,D1LXG7,Acorn worm HIUase,MSGYRIDILTNHLRASQAHSNLIEAVNMAGQQSPLTTHVLDTALGRPAAELPITLYSRSPEMAWLKIAAGKTNQDGRCPGLLTQETFHNGVYKIHFDTGTYHKALDTPGFYPYVEVVFEIHDPNQHYHVPLLLSPFSYSTYRGS,D1LXG7
Danio rerio,HIUH_DANRE,Danio Rerio HIUase,MNRLQHIRGHIVSADKHINMSATLLSPLSTHVLNIAQGVPGANMTIVLHRLDPVSSAWNILTTGITNDDGRCPGLITKENFIAGVYKMRFETGKYWDALGETCFYPYVEIVFTITNTSQHYHVPLLLSRFSYSTYRGS,Q06S87
Mus musculus,HIUH_MOUSE,Mouse HIUase,MATESSPLTTHVLDTASGLPAQGLCLRLSRLEAPCQQWMELRTSYTNLDGRCPGLLTPSQIKPGTYKLFFDTERYWKERGQESFYPYVEVVFTITKETQKFHVPLLLSPWSYTTYRGS,Q9CRB3
Mus musculus,TTHY_MOUSE,Mouse Transthyretin,GPAGAGESKCPLMVKVLDAVRGSPAVDVAVKVFKKTSEGSWEPFASGKTAESGELHGLTTDEKFVEGVYRVELDTKSYWKTLGISPFHEFADVVFTANDSGHRHYTIAALLSPYSYSTTAVVSNPQN,P07309

It seems to have worked up to the reciprocal BLAST step, then I got the following error (it did create a BLAST results XML file and an initial dataframe file with 3414 lines):

==========

Building initial topiary dataframe.

BLASTing against NCBI database nr
Performing 5 BLAST queries against the NCBI nr database
on 1 threads. Depending on the server load, this could
take awhile. This is a good time to grab a cup of coffee.

BLAST query complete.

Could not parse line MET VARIANT TO 1.7 ANGSTROMS RESOLUTION [Homo sapiens]. Skipping.
Could not parse line MET VARIANT TO 1.7 ANGSTROMS RESOLUTION [Homo sapiens]. Skipping.
Could not parse line MET VARIANT TO 1.7 ANGSTROMS RESOLUTION [Homo sapiens]. Skipping.
Could not parse line MET VARIANT TO 1.7 ANGSTROMS RESOLUTION [Homo sapiens]. Skipping.
Could not parse line MET VARIANT TO 1.7 ANGSTROMS RESOLUTION [Homo sapiens]. Skipping.
Could not parse line MET VARIANT TO 1.7 ANGSTROMS RESOLUTION [Homo sapiens]. Skipping.
Could not parse line MET VARIANT TO 1.7 ANGSTROMS RESOLUTION [Homo sapiens]. Skipping.
Could not parse line MET VARIANT TO 1.7 ANGSTROMS RESOLUTION [Homo sapiens]. Skipping.
Could not parse line MET VARIANT TO 1.7 ANGSTROMS RESOLUTION [Homo sapiens]. Skipping.
Could not parse line MET VARIANT TO 1.7 ANGSTROMS RESOLUTION [Homo sapiens]. Skipping.
Downloading 69 blocks of ~50 sequences...
100%|███████████████████████████████████████████| 69/69 [00:48<00:00, 1.42it/s]
Getting OTT species ids for all species.

Unknown/unrecognized query ids (skipped):
ott4992270
ott615879
ott7659998
ott773491
ott838061
ott898631


Doing reciprocal blast.

Downloading Danio rerio proteome
Downloading proteome for taxid '7955'
Process Process-11:
Traceback (most recent call last):
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/site-packages/topiary/_private/ftp.py", line 36, in _ftp_thread
ftp.retrbinary(cmd="RETR " + file_name,
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/ftplib.py", line 445, in retrbinary
return self.voidresp()
^^^^^^^^^^^^^^^
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/ftplib.py", line 259, in voidresp
resp = self.getresp()
^^^^^^^^^^^^^^
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/ftplib.py", line 244, in getresp
resp = self.getmultiline()
^^^^^^^^^^^^^^^^^^^
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/ftplib.py", line 230, in getmultiline
line = self.getline()
^^^^^^^^^^^^^^
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/ftplib.py", line 218, in getline
raise EOFError
EOFError
Traceback (most recent call last):
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/site-packages/topiary/_private/interface.py", line 32, in wrapper
value = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/site-packages/topiary/pipeline/seed_to_alignment.py", line 406, in seed_to_alignment
proteome_list.append(topiary.ncbi.get_proteome(taxid=this_taxid))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/site-packages/topiary/ncbi/entrez/proteome.py", line 217, in get_proteome
ncbi_ftp_download(genome_url,file_base="_protein.faa.gz")
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/site-packages/topiary/ncbi/entrez/download.py", line 80, in ncbi_ftp_download
md5_dict = _read_md5_file(md5_file)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/site-packages/topiary/ncbi/entrez/download.py", line 33, in _read_md5_file
file = col[1][2:].strip()
~~~^^^
IndexError: list index out of range

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/site-packages/topiary/_private/wrap.py", line 185, in wrap_function
ret = fcn(**fcn_args.dict)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/site-packages/topiary/_private/interface.py", line 38, in wrapper
raise WrappedFunctionException(err) from e
topiary._private.interface.WrappedFunctionException:

Caught exception in function 'seed_to_alignment'. Returning to starting
directory and cleaning up. Check error stack for cause of
this error.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/home/lucas/miniconda3/envs/topiary/bin/topiary-seed-to-alignment", line 26, in <module>
main()
File "/home/lucas/miniconda3/envs/topiary/bin/topiary-seed-to-alignment", line 21, in main
wrap_function(seed_to_alignment,
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/site-packages/topiary/_private/wrap.py", line 189, in wrap_function
raise RuntimeError(err) from e
RuntimeError:

Function seed_to_alignment raised an error.

To see command line help, run topiary-seed-to-alignment --help

@harmsm
Contributor

harmsm commented Feb 23, 2023

Thanks for the bug report! I've never seen this one before. It looks to me like it is choking when downloading and reading the checksum file to validate the downloaded proteome. Is there a file called md5checksums.txt in the working directory? If so, could you paste its contents here?

Thanks for your help; hopefully we can resolve this quickly.

@lbleicher
Author

There is, but it is too long to paste here. Let me know if you need the entire file and I'll post it somewhere. Here are its first and last lines:

d0e8e6b5c981ff948c657166270a7c88 ./Annotation_comparison/GCF_000002035.6_GRCz11_compare_prev.gbp.gz
9c8cd6fefb81746909c5438c5d18b758 ./Annotation_comparison/GCF_000002035.6_GRCz11_compare_prev.txt.gz
ae405e37cdd4ebbd7d2032baf3e522fd ./annotation_hashes.txt
c249b22d4cf0941cf13f6d626140686c ./GCF_000002035.6_GRCz11_assembly_regions.txt
ce31297f9cb1eccf885afab7fac363ad ./GCF_000002035.6_GRCz11_assembly_report.txt
83c34be20a52645e7ec5e442e33d1ebf ./GCF_000002035.6_GRCz11_assembly_stats.txt
f1de7661b5de92ddf4f2de72a7f2695f ./GCF_000002035.6_GRCz11_assembly_structure/all_alt_scaffold_placement.txt
9585b8ac806c110688debcea17387efa ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/AGP/alt.scaf.agp.gz
20b5e35f4c1033ce0426af1c190fe43b ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/alignments/NW_018394460.1_NC_007112.7.asn
68ae916c656500b25ee32299657b080b ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/alignments/NW_018394460.1_NC_007112.7.gff

(...)

020d65335491f47fa29beadf092e8695 ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/alignments/NW_018395039.1_NC_007129.7.asn
711757eada6306b792ed3465a27cdd85 ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/alignments/NW_018395039.1_NC_007129.7.gff
527ec55abb1a90ae0cdaac1426704c7d ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/alignments/NW_018395040.1_NC_007129.7.asn
d52fb0e9b5f4ee8c2bdb967e539b857e ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/alignments/NW_018395040.1_NC_007129.7.gff
e2718bcb708553147de9096927dccc23 ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/alignments/NW_018395041.1_NC_007129.7.asn
f9159f5bc4f13df9e81370261d7954f8 ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/alignments/NW_018395041.1_NC_007129.7.gff
56d6c327d50909c7370f85a110678949 ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/alignments/NW_018395042.1_NC_007129.7.asn
1cbb61a0bc29d549bc47fec6f000c5a4 ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/alignments/NW_018395042.1_NC_007129.7.gff
43fe313f031e77461d6ace72c697f5b5 ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/alignments/NW_018395043.1_NC_007129.7.asn
1f179ec473e6f382e07cd5a4ed0f37d3 ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/alignments/NW_018395043.1_NC_007129.7.gff
99190d7cb0b3e1880e24f6dc51023e31 ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/alignments/NW_018395044.1_NC_007129.7.asn
482703be37901d26

@harmsm
Contributor

harmsm commented Feb 23, 2023

I think we're getting somewhere. topiary assumes an md5 file has rows that have the format "hash file". It looks like this file is truncated (the last line looks like an incomplete hash). I suspect the md5 download terminated early for some reason.

If this is true, you should be able to re-run and successfully complete the job.

I can patch topiary to prevent this in the future by adding a check to make sure the md5 file downloads successfully, rather than cryptically crashing.

Maybe try re-running the job?

Thanks!
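
As an illustration of the kind of check described above, here is a minimal sketch that parses checksum text in the "hash file" format and refuses truncated rows instead of failing with an IndexError. This is not topiary's actual code: the function name `read_md5_file`, its text-based input, and the choice to raise `ValueError` are all assumptions made for the example.

```python
import re

# A complete MD5 digest is exactly 32 hexadecimal characters.
_MD5_PATTERN = re.compile(r"^[0-9a-fA-F]{32}$")


def read_md5_file(text):
    """Parse checksum text whose rows have the form "hash file".

    Hypothetical helper for illustration only. Returns a dict that maps
    file names to hashes and raises ValueError on any malformed row,
    such as a final line that was cut off mid-hash.
    """
    md5_dict = {}
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        columns = line.split(maxsplit=1)
        if len(columns) != 2 or not _MD5_PATTERN.match(columns[0]):
            raise ValueError(f"malformed md5 row {line_number}: {line!r}")
        file_name = columns[1].strip()
        if file_name.startswith("./"):
            file_name = file_name[2:]  # drop the leading "./"
        md5_dict[file_name] = columns[0]
    return md5_dict
```

With a check like this, the truncated last line from the file posted above (`482703be37901d26`) would produce an explicit "malformed md5 row" error rather than a cryptic crash.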

@lbleicher
Author

A rerun produced a very similar output - I got a 01_initial-dataframe.csv with the same filesize, the blast result XML is almost the same size (a difference of three lines), and md5checksums.txt is again truncated:

3ce0f863975dd40c4ea48c96478d30ed ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/alignments/NW_018394748.1_NC_007120.7.gff
79e2f3921879aaeb9e1514403d22dec8 ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/alignments/NW_018394749.1_NC_007120.7.asn
9b2de24652cf7f00d6985fe118f743a4 ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/alignments/NW_018394749.1_NC_007120.7.gff
5bccb98a2853b1e9580ccfc54da20b71 ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/alignments/NW_018394750.1_NC_007120.7.asn
c81f8f223679341488c081011ba742c3 ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/alignments/NW_018394750.1_NC_007120.7.gff
680fd33bd2b0

@harmsm
Contributor

harmsm commented Feb 24, 2023

That's strange. I just created a bug fix that downloads the md5sum file, checks if it is sane, then attempts to download it again if it fails. Would you be up for seeing if it fixes your problem? To download the change, you can follow the instructions below:

conda activate topiary
cd the_topiary_directory_wherever_you_downloaded_it
git checkout -b harmsm-main main
git pull git@github.com:harmsm/topiary.git main
python setup.py install

Best,

Mike
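
The download, check, and retry behavior described in this comment might look roughly like the sketch below. It is not the code from the bug fix. It assumes a caller-supplied `download` function that returns the checksum file contents as text; `download`, `is_sane`, `max_attempts`, and `wait_seconds` are hypothetical names introduced for the example.

```python
import re
import time


def is_sane(text):
    """Return True if every non-blank row is a 32-character hex hash plus a file name."""
    rows = [row for row in text.splitlines() if row.strip()]
    if not rows:
        return False  # an empty download is not a usable checksum file
    return all(re.fullmatch(r"[0-9a-fA-F]{32}\s+\S.*", row) for row in rows)


def download_with_retry(download, max_attempts=3, wait_seconds=0.0):
    """Call download() until its output passes is_sane or attempts run out.

    `download` is a hypothetical callable returning the file as text.
    """
    for _ in range(max_attempts):
        text = download()
        if is_sane(text):
            return text
        time.sleep(wait_seconds)  # brief pause before trying again
    raise RuntimeError(f"md5 file still malformed after {max_attempts} attempts")
```

If every attempt returns a truncated file, as happened in both runs reported above, the caller gets a clear `RuntimeError` that names the md5 file as the problem.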

@lbleicher
Author

lbleicher commented Feb 27, 2023 via email

@jjvanantwerp

jjvanantwerp commented Mar 2, 2023

I am having the same error, actually. Below is my output.


Polishing alignment and re-aligning.

muscle 5.1.linux64 [] 396Gb RAM, 40 cores
Built Feb 24 2022 03:16:15
(C) Copyright 2004-2021 Robert C. Edgar.
https://drive5.com

Input: 2 seqs, length avg 392 max 408

00:00 17Mb 50.0% Derep 0 uniques, 0 dupes
00:00 17Mb 100.0% Derep 1 uniques, 0 dupes
00:00 18Mb 50.0% UCLUST 2 seqs EE<0.01, 0 centroids, 0 members
00:00 18Mb 100.0% UCLUST 2 seqs EE<0.01, 1 centroids, 0 members
00:00 18Mb CPU has 40 cores, defaulting to 20 threads
00:00 18Mb 50.0% UCLUST 2 seqs EE<0.30, 0 centroids, 0 members
00:00 18Mb 100.0% UCLUST 2 seqs EE<0.30, 1 centroids, 0 members
00:00 58Mb 100.0% Make cluster MFAs
1 clusters pass 1
1 clusters pass 2
00:00 58Mb
00:00 58Mb Align cluster 1 / 1 (2 seqs)
00:00 58Mb
00:00 58Mb 100.0% Calc posteriors
00:00 58Mb 100.0% UPGMA5
00:00 59Mb 100.0% Consensus sequences

Success. Alignment written to the alignment column in the dataframe.
Traceback (most recent call last):
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/interface.py", line 32, in wrapper
value = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/pipeline/seed_to_alignment.py", line 502, in seed_to_alignment
df = topiary.quality.polish_alignment(**kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/quality/polish.py", line 136, in polish_alignment
top_fx_sparse = _get_cutoff(df.fx_in_sparse,pct=fx_sparse_percentile)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/quality/polish.py", line 43, in _get_cutoff
return x[idx]
~^^^^^
IndexError: index 2 is out of bounds for axis 0 with size 2

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/wrap.py", line 185, in wrap_function
ret = fcn(**fcn_args.dict)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/interface.py", line 38, in wrapper
raise WrappedFunctionException(err) from e
topiary._private.interface.WrappedFunctionException:

Caught exception in function 'seed_to_alignment'. Returning to starting
directory and cleaning up. Check error stack for cause of
this error.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/bin/topiary-seed-to-alignment", line 26, in <module>
main()
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/bin/topiary-seed-to-alignment", line 21, in main
wrap_function(seed_to_alignment,
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/wrap.py", line 189, in wrap_function
raise RuntimeError(err) from e
RuntimeError:

Function seed_to_alignment raised an error.

To see command line help, run topiary-seed-to-alignment --help

and the last few lines of my md5checksums.txt are:

75f783e620888f6a20c9e7030bf54de2 ./Gnomon_models/GCF_000001405.40_GRCh38.p14_gnomon_model.gff.gz
962674e06f93bd8656cbd860c395f5ad ./Gnomon_models/GCF_000001405.40_GRCh38.p14_gnomon_protein.faa.gz
f7262c7cc28373fd2aa0a225ef27a50e ./Gnomon_models/GCF_000001405.40_GRCh38.p14_gnomon_rna.fna.gz
3b7f12ebd3d129698e86fb8701bb9688 ./README_patch_release.txt
84b55637f312368687af5e2b545fcb8d ./RefSeq_transcripts_alignments/GCF_000001405.40_GRCh38.p14_knownrefseq_alns.bam
e9e81c03bce9f45f7a4edfe44f0a8f8f ./RefSeq_transcripts_alignments/GCF_000001405.40_GRCh38.p14_knownrefseq_alns.bam.bai
d9ff57b0fdb663665f2d0f9305831b30 ./RefSeq_transcripts_alignments/GCF_000001405.40_GRCh38.p14_modelrefseq_alns.bam
96404b41c1c023019a0ba6514d98c498 ./RefSeq_transcripts_alignments/GCF_000001405.40_GRCh38.p14_modelrefseq_alns.bam.bai

@harmsm
Contributor

harmsm commented Mar 2, 2023

Thanks for the report. I just merged the PR I referenced above. I still have not been able to reproduce the error on my end. Can one of you try the command again with the new version? To install the latest version, you could run the following:

cd topiary
git pull origin main
conda activate topiary
python -m pip install . -vv

Thanks! (And thanks for your patience with the delayed response to this thread).

@jjvanantwerp

I followed the instructions above, and still ran into the same error. The terminal output is attached:
Terminal SavedOutpiut.txt

@harmsm
Contributor

harmsm commented Mar 11, 2023

@jjvanantwerp Thanks for the bug report and sorry for the slow reply. Dangerous having the prof in charge of package maintenance...

I looked through your log file; it appears you're having a different bug. It's crashing when polishing the final alignment. If possible could you please post the last csv file that topiary writes out before the crash occurs? Based on when the crash occurs, I believe this should be 04_aligned-dataframe.csv.
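
For context, the traceback shows `return x[idx]` failing with "index 2 is out of bounds for axis 0 with size 2", on an alignment that contains only 2 sequences. The sketch below shows one way a percentile cutoff can guard against that by clamping the index. The real `_get_cutoff` implementation is not shown in this thread, so the index formula, the meaning of `pct` (assumed here to be a fraction between 0 and 1), and the function name `get_cutoff` are assumptions.

```python
def get_cutoff(values, pct=0.9):
    """Return the value at a fractional percentile of the sorted data.

    Hypothetical stand-in for a percentile-cutoff helper. The index is
    clamped so it can never run past the end of a very short input.
    """
    x = sorted(values)
    if not x:
        raise ValueError("cannot compute a cutoff for an empty input")
    idx = int(round(pct * len(x)))
    idx = min(max(idx, 0), len(x) - 1)  # clamp to the valid index range
    return x[idx]
```

Without the clamp, a two-element input with `pct=1.0` would compute index 2 and reproduce the same out-of-bounds error.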

Thanks.

@jjvanantwerp

Yes, here it is.
04_aligned-dataframe.csv

@lbleicher
Author

lbleicher commented Mar 13, 2023 via email

@harmsm
Contributor

harmsm commented Mar 14, 2023

@jjvanantwerp : Thanks for the file! I am able to reproduce the error and am working on this now.

@lbleicher : thanks for the detailed error message. I'll look into it.

@harmsm
Contributor

harmsm commented Mar 14, 2023

@jjvanantwerp Should be fixed now. I just merged a PR with the change. You should be able to run the following to install the latest and greatest version. Thanks for helping troubleshoot!

cd topiary
git pull origin main
conda activate topiary
python -m pip install . -vv

@jjvanantwerp

Yes, I was able to progress past the alignment! I think this issue can be closed. Unfortunately, I will need to open another for what appears to be the same error in the next step. I am not sure whether this is the best place to discuss that or whether I should open a new issue. The error is raised at the same place in the wrap function, line 189.

@harmsm
Contributor

harmsm commented Mar 15, 2023 via email

@jjvanantwerp

Terminal Saved Output Mar 15.txt

I have attached the whole terminal session, but below is the relevant part. It says the issue is that my alignment is too small, and I'm not sure if there's a way to address this here or upstream.

(topiary_resolved) [vanant25@dev-intel16 topiary]$ topiary-alignment-to-ancestors ER_Final_Alignment.csv --out_dir ER_ASR --num_threads 1

Non-microbial dataset detected. Gene/species tree reconciliation will be performed

Checking raxml-ng

installed:       Y
binary_path:     /mnt/home/vanant25/anaconda3/envs/topiary_resolved/bin/raxml-ng
binary runs:     Y
version:         1.1
minimum version: 1.1
passes:          Y

Checking generax

installed:       Y
binary_path:     /mnt/home/vanant25/anaconda3/envs/topiary_resolved/bin/generax
binary runs:     Y
version:         2.0.4
minimum version: 2.0
passes:          Y

Checking mpirun

installed:       Y
binary_path:     /mnt/home/vanant25/anaconda3/envs/topiary_resolved/bin/mpirun
binary runs:     Y
version:         4.1.5
minimum version: 0.0
passes:          Y

topiary is starting a find_best_model calculation in ./00_find-model:

Generating maximum parsimony tree.

Launching raxml-ng, 0:00:00.007415 (H:M:S)

topiary ran a find_best_model calculation in ./00_find-model:

  • Crashed after 0:00:00.021205 (H:M:S)
  • Please check ./00_find-model/working

Traceback (most recent call last):
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/interface.py", line 32, in wrapper
value = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/interface.py", line 336, in launch
raise RuntimeError(err)
RuntimeError: ERROR: /mnt/home/vanant25/anaconda3/envs/topiary_resolved/bin/raxml-ng returned 1


/mnt/home/vanant25/anaconda3/envs/topiary_resolved/bin/raxml-ng output

RAxML-NG v. 1.1 released on 29.11.2021 by The Exelixis Lab.
Developed by: Alexey M. Kozlov and Alexandros Stamatakis.
Contributors: Diego Darriba, Tomas Flouri, Benoit Morel, Sarah Lutteropp, Ben Bettisworth.
Latest version: https://github.com/amkozlov/raxml-ng
Questions/problems/suggestions? Please visit: https://groups.google.com/forum/#!forum/raxml

System: Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz, 28 cores, 125 GB RAM

RAxML-NG was called at 15-Mar-2023 00:48:29 as follows:

/mnt/home/vanant25/anaconda3/envs/topiary_resolved/bin/raxml-ng --start --msa alignment.phy --model LG --seed 3997117630 --threads 1 --tree pars{1}

Analysis options:
run mode: Starting tree generation
start tree(s): parsimony (1)
random seed: 3997117630
SIMD kernels: AVX2
parallelization: coarse-grained (auto), NONE/sequential

[00:00:00] Reading alignment from file: alignment.phy
[00:00:00] Loaded alignment with 2 taxa and 410 sites

ERROR: Your alignment contains less than 4 sequences!

ERROR: Alignment check failed (see details above)!

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/raxml/_raxml.py", line 189, in run_raxml
interface.launch(cmd,
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/interface.py", line 38, in wrapper
raise WrappedFunctionException(err) from e
topiary._private.interface.WrappedFunctionException:

Caught exception in function 'launch'. Returning to starting
directory and cleaning up. Check error stack for cause of
this error.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/interface.py", line 32, in wrapper
value = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/raxml/model.py", line 260, in find_best_model
_generate_parsimony_tree(supervisor.alignment,
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/raxml/model.py", line 45, in _generate_parsimony_tree
run_raxml(run_directory=run_directory,
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/raxml/_raxml.py", line 197, in run_raxml
raise RuntimeError from e
RuntimeError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/interface.py", line 32, in wrapper
value = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/pipeline/alignment_to_ancestors.py", line 323, in alignment_to_ancestors
topiary.find_best_model(df,
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/interface.py", line 38, in wrapper
raise WrappedFunctionException(err) from e
topiary._private.interface.WrappedFunctionException:

Caught exception in function 'find_best_model'. Returning to starting
directory and cleaning up. Check error stack for cause of
this error.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/wrap.py", line 185, in wrap_function
ret = fcn(**fcn_args.dict)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/interface.py", line 38, in wrapper
raise WrappedFunctionException(err) from e
topiary._private.interface.WrappedFunctionException:

Caught exception in function 'alignment_to_ancestors'. Returning to starting
directory and cleaning up. Check error stack for cause of
this error.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/bin/topiary-alignment-to-ancestors", line 26, in <module>
main()
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/bin/topiary-alignment-to-ancestors", line 21, in main
wrap_function(alignment_to_ancestors,
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/wrap.py", line 189, in wrap_function
raise RuntimeError(err) from e
RuntimeError:

Function alignment_to_ancestors raised an error.

To see command line help, run topiary-alignment-to-ancestors --help

(topiary_resolved) [vanant25@dev-intel16 topiary]$
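
The raxml-ng failure in the log above ("Your alignment contains less than 4 sequences!") can be caught before launching the run by reading the header of the PHYLIP file, whose first line gives the number of taxa and the number of sites. The following is a minimal sketch of such a pre-flight check. The function names and the `MIN_SEQUENCES` constant are hypothetical, and the threshold of 4 is taken from the error message in this log rather than from the raxml-ng documentation.

```python
# Threshold taken from the raxml-ng error message in the log above.
MIN_SEQUENCES = 4


def phylip_dimensions(phylip_text):
    """Return (number of taxa, number of sites) from a PHYLIP header line."""
    for line in phylip_text.splitlines():
        fields = line.split()
        if fields:
            # The first non-blank line holds the two dimensions.
            return int(fields[0]), int(fields[1])
    raise ValueError("no PHYLIP header line found")


def check_alignment_size(phylip_text, min_sequences=MIN_SEQUENCES):
    """Raise ValueError if the alignment has fewer than min_sequences taxa."""
    num_taxa, _ = phylip_dimensions(phylip_text)
    if num_taxa < min_sequences:
        raise ValueError(
            f"alignment has {num_taxa} sequences; at least {min_sequences} are needed"
        )
    return num_taxa
```

Run against the `alignment.phy` from this session (2 taxa, 410 sites), the check would stop with a short message instead of the long chain of wrapped exceptions.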

@harmsm
Contributor

harmsm commented Mar 15, 2023 via email

@jjvanantwerp

The input file was the output of seed_to_alignment, I thought. I used 05_clean-aligned-dataframe.csv as the input for ali_to_anc, without any cleaning.

@harmsm
Contributor

harmsm commented Mar 15, 2023 via email

@jjvanantwerp

No, that's what I did. I was hoping Topiary would 'fill in' around that sequence, but it seems like it's looking for that to be the 'edge' of sequence space instead. I will have to redesign my experiment to incorporate this behavior.

@harmsm
Contributor

harmsm commented Mar 15, 2023 via email

@jjvanantwerp

jjvanantwerp commented Mar 17, 2023

I've filled out the seed alignment and ran into an error that I suspect is caused by the format of my seed alignment. I have attached the seed alignment. Do you recognize what might cause this:

ER_Seed.csv

File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/opentree/util.py", line 69, in _validate_ott_or_species
raise ValueError(err)
ValueError:
Could not process ott None. Should be an integer
or string with format ottINTEGER

Here is the full error stack:

(topiary_resolved) [vanant25@dev-intel18 topiary]$ topiary-seed-to-alignment ER_Seed.csv --out_dir ER_Align

Checking blastp

installed:       Y
binary_path:     /mnt/home/vanant25/anaconda3/envs/topiary_resolved/bin/blastp
binary runs:     Y
version:         2.13.0+
minimum version: 2.0
passes:          Y

Checking makeblastdb

installed:       Y
binary_path:     /mnt/home/vanant25/anaconda3/envs/topiary_resolved/bin/makeblastdb
binary runs:     Y
version:         2.13.0+
minimum version: 2.0
passes:          Y

Checking muscle

installed:       Y
binary_path:     /mnt/home/vanant25/anaconda3/envs/topiary_resolved/bin/muscle
binary runs:     Y
version:         5.1.linux64
minimum version: 5.0
passes:          Y

Building initial topiary dataframe.

Traceback (most recent call last):
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/opentree/util.py", line 65, in _validate_ott_or_species
check_ott = int(check_ott)
^^^^^^^^^^^^^^
TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/interface.py", line 32, in wrapper
value = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/pipeline/seed_to_alignment.py", line 371, in seed_to_alignment
out = topiary.df_from_seed(**kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/io/seed.py", line 312, in df_from_seed
seed_df, key_species, paralog_patterns, species_aware = topiary.io.read_seed(seed_df,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/io/seed.py", line 126, in read_seed
mrca = topiary.opentree.ott_to_mrca(ott_list=ott_list,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/opentree/util.py", line 426, in ott_to_mrca
ott_list = _validate_ott_or_species(ott_list,species_list)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/opentree/util.py", line 69, in _validate_ott_or_species
raise ValueError(err)
ValueError:
Could not process ott None. Should be an integer
or string with format ottINTEGER

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/wrap.py", line 185, in wrap_function
ret = fcn(**fcn_args.dict)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/interface.py", line 38, in wrapper
raise WrappedFunctionException(err) from e
topiary._private.interface.WrappedFunctionException:

Caught exception in function 'seed_to_alignment'. Returning to starting
directory and cleaning up. Check error stack for cause of
this error.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/bin/topiary-seed-to-alignment", line 26, in <module>
main()
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/bin/topiary-seed-to-alignment", line 21, in main
wrap_function(seed_to_alignment,
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/wrap.py", line 189, in wrap_function
raise RuntimeError(err) from e
RuntimeError:

Function seed_to_alignment raised an error.

To see command line help, run topiary-seed-to-alignment --help

@harmsm
Contributor

harmsm commented Mar 24, 2023

Okay, it should work now. (Or, actually, it should fail now with a useful error). It turns out one of your species, Gulo gulo luscus, is not in the open tree of life database. Topiary was supposed to let you know this was the problem, but was choking on opentreeoflife output. I just pushed a change so it should now do so.

I suspect you want to replace "Gulo gulo luscus" with "Gulo gulo" (https://tree.opentreeoflife.org/taxonomy/browse?id=752563)

Best,

Mike
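
To show what a "useful error" for this case could look like, here is a minimal sketch that reports which species failed to resolve to an Open Tree of Life id. It relies only on the rule stated in the error message above (an id is an integer or a string with format ottINTEGER). The function names and the species-to-id dictionary are hypothetical and do not reflect topiary's internals.

```python
import re


def is_valid_ott(ott):
    """Return True for an integer id or a string such as "ott752563"."""
    if isinstance(ott, int) and not isinstance(ott, bool):
        return True
    return isinstance(ott, str) and re.fullmatch(r"ott\d+", ott) is not None


def find_unresolved_species(species_to_ott):
    """Return the species names whose OTT id is missing or malformed.

    `species_to_ott` is a hypothetical dict mapping species name to the
    id returned by a lookup, with None when nothing was found.
    """
    return [name for name, ott in species_to_ott.items() if not is_valid_ott(ott)]
```

For the seed file discussed here, a lookup result of `{"Gulo gulo": "ott752563", "Gulo gulo luscus": None}` would flag only "Gulo gulo luscus", which points directly at the row to fix.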

@jjvanantwerp

I changed the species name in the seed alignment, which advanced me further than I have been able to get before. Unfortunately, the alignment hit a critical error again. I have uploaded what I think is the final alignment file that was used.

Terminal Saved Output_Topiary_Error.txt
03_shrunk-dataframe.csv
