GPU version has corrupted sequence output #912
Comments
Good catch, I did not think this through completely. The output is not actually corrupted: this is the new byte encoding required for the GPU search (byte values 0 to 64 encode masked or unmasked amino acids). I will fix the display issue ASAP.
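As a rough illustration of the range mentioned above, a check like the following could flag a sequence field that still holds encoded bytes rather than letters. This is only a sketch: the function name `looks_gpu_encoded` is hypothetical and not part of MMseqs2. The one fact it relies on is that values 0 to 64 all fall below ASCII `A` (65), so a field made up entirely of such bytes cannot be a plain amino acid string.

```python
def looks_gpu_encoded(field: bytes) -> bool:
    """Heuristic (hypothetical helper, not an MMseqs2 API).

    Returns True if the field is non-empty and every byte is in 0..64,
    i.e. below ASCII 'A' (65), so it cannot be a letter sequence.
    """
    return len(field) > 0 and all(b <= 64 for b in field)


print(looks_gpu_encoded(b"MKVLAT"))                # plain letters
print(looks_gpu_encoded(bytes([15, 8, 5, 3, 3])))  # small integer codes
```

The check says nothing about which residue each value stands for; that mapping is not given in this thread.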
martin-steinegger added a commit to martin-steinegger/MMseqs2 that referenced this issue on Jan 6, 2025
milot-mirdita added a commit to steineggerlab/foldseek that referenced this issue on Jan 7, 2025
35537c46 Make sure cuda binaries do not depend on dynamic libatomic
7e2732cd Readd tweaked hack to remove GLIBC_PRIVATE symbols
8bf7c5e6 Debug glibc check for GPU builds
6e46b5e2 Terminate unpadded sequences with \n\0
e6f0328b Fully revert cmake version string change
64f03d46 Next try with build system cleanup
e840263e Forgot wget
5140ceb0 Fix build system breakage
9927445c Sync build system changes with foldseek changes relicensed as MIT for MMseqs2
b0e91c12 Fix soedinglab/MMseqs2#912
git-subtree-dir: lib/mmseqs
git-subtree-split: 35537c46a00c33db96409ce6aea42a42224f7917
Expected Behavior
For the GPU version, when running easy-search mode and using --format-output with "tseq" to get the sequences for the hits, the amino acid sequences should be printed properly.
Current Behavior
Instead, the amino acid sequence appears as a bunch of other characters (see below).
Steps to Reproduce (for bugs)
Command line: mmseqs easy-search $INPUT.fasta /mnt/ephemeral/dbmm/nr_gpu RESULT /mnt/ephemeral/tmp2 --gpu 1 --num-iterations 3 -s 8 --max-seqs 999999 --format-mode 4 --format-output "query,target,evalue,fident,nident,qstart,qend,qlen,tstart,tend,tlen,alnlen,bits,qcov,tcov,tseq"
MMseqs Output (for bugs)
(QUERY and TARGET anonymized)
query target evalue fident nident qstart qend qlen tstart tend tlen alnlen bits qcov tcov tseq
QUERY TARGET 7.770E-159 1.000 238 1 238 238 1 238 238 238 504 1.000 1.000
^O^H^E^C^C ^D^P^E^Q^Q^L^G ^Q^C ^B^E^B^Q^K^E^F^H^D^O^Q^O^E^C^E^C^E^B^@^P^S^E^H ^P ^D^G^A^P^P^E^H ^L^Q^L^R^L^P ^Q^P^P^D^O^S^E^Q^M^A^D^O^N^S^L^B^F
^H^M^F^B^D^D^H^O^@
^L^C^E^S^Q^M^C^N^P^G^D^D^H^B^B^E^K^S^H^P^N^@^C^Q^H^D^C^E^B^P ^Q^K^N^G^C ^E^G^B^D^H^C^B^E^K^G ^E^F^H ^C^S^K^S^K^O^F^K^Q^S^G
^@^B^H^M^H^K^E^G^H^Q^K^D^H^G^N^F^K^G^C^B^E^O^Q^M ^@^B^F^S^M^K^P^L^G^E^B^E^L^Q ^L^B^K^F^S ^O^P^M^O^@ ^O^H^B^L^K^C^H^N^B^F
^Q ^C^D^Q^P^@^@^E^G^P^F^E
^B^C ^S^H
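The `^X` tokens in the output above are caret notation for control characters, where `^X` stands for the byte `ord(X) XOR 0x40`. A minimal sketch for turning such a pasted string back into numeric byte values is shown below. The helper name `caret_to_bytes` is hypothetical, and the sketch assumes that anything outside a `^X` pair (such as the spaces and line breaks in the paste) can be skipped; those gaps may in fact correspond to bytes the terminal rendered literally, which cannot be recovered from the paste.

```python
def caret_to_bytes(text: str) -> list[int]:
    """Convert caret notation (e.g. '^O^H') to byte values.

    Hypothetical helper for inspecting the pasted output; characters
    that are not part of a '^X' pair are skipped (an assumption).
    """
    values = []
    i = 0
    while i < len(text):
        if text[i] == "^" and i + 1 < len(text):
            # '^X' denotes the control byte ord('X') XOR 0x40
            values.append(ord(text[i + 1]) ^ 0x40)
            i += 2
        else:
            i += 1  # skip whitespace or anything else
    return values


print(caret_to_bytes("^O^H^E^C^C"))  # [15, 8, 5, 3, 3]
```

The resulting values are small integers rather than ASCII letters. The sketch does not try to map them back to amino acids.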
Your Environment
Include as many relevant details about the environment you experienced the bug in.
** 59016d2
** compiled binary provided by soedinglab on mmseqs2 website
** EC2 instance type g5.12xlarge (192GB memory, 4x A10 GPU with 24GB RAM a piece)
** Ubuntu 20.04.6 LTS (GNU/Linux 5.15.0-1072-aws x86_64)