Taxonomy Check error #336

Open
Leobpfbac opened this issue Feb 23, 2025 · 3 comments

Comments

@Leobpfbac

Hi,
I installed PGAP and tested it with the suggested genome; everything worked as expected.
I then ran PGAP on a de novo assembled genome with --taxcheck only, but it did not seem to work. I ran it again with --ignore-all-errors and it worked. Could you help me identify what the problem might be? The cwltool.log file from the first run is attached.

Thank you

cwltool.log

@azat-badretdin
Contributor

Thank you for your request, user @Leobpfbac

The general rule for interpreting cwltool.log output is to search for the first occurrence of permanentFail, then scroll up to the first command-line echo above it and read the first error that follows that echo. A command-line echo looks like this:

....$ executable \
continuation \
of command \
line

In this case, the relevant errors are:

  <message tool="fastaval" severity="ERROR" seq_id="58" code="SEQ_SHORT_LENGTH" fasta_seq_id="lcl|58">Sequence is shorter than 200 nucleotides</message>
  <message tool="fastaval" severity="ERROR" seq_id="59" code="SEQ_SHORT_LENGTH" fasta_seq_id="lcl|59">Sequence is shorter than 200 nucleotides</message>
  <message tool="fastaval" severity="ERROR" seq_id="60" code="SEQ_SHORT_LENGTH" fasta_seq_id="lcl|60">Sequence is shorter than 200 nucleotides</message>
  <message tool="fastaval" severity="ERROR" seq_id="61" code="SEQ_SHORT_LENGTH" fasta_seq_id="lcl|61">Sequence is shorter than 200 nucleotides</message>
  <message tool="fastaval" severity="ERROR" seq_id="62" code="SEQ_SHORT_LENGTH" fasta_seq_id="lcl|62">Sequence is shorter than 200 nucleotides</message>
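The lookup rule described above can be sketched as a small Python helper. This is only an illustration, not part of PGAP or cwltool: the function name `locate_first_error` is hypothetical, and it assumes that a command-line echo is a line containing `$ ` and ending in a backslash, and that an error line contains the word "error" in any case (as in severity="ERROR" above). Real logs may need a stricter match.

```python
def locate_first_error(log_lines):
    """Return (echo_index, error_index) for the first permanentFail, or None.

    Hypothetical helper: the echo and error patterns are assumptions.
    """
    # Step 1: find the first line that mentions permanentFail.
    fail_idx = next(
        (i for i, line in enumerate(log_lines) if "permanentFail" in line), None
    )
    if fail_idx is None:
        return None

    # Step 2: scroll up to the nearest command-line echo ("...$ executable \").
    echo_idx = None
    for i in range(fail_idx - 1, -1, -1):
        text = log_lines[i].rstrip()
        if "$ " in text and text.endswith("\\"):
            echo_idx = i
            break

    # Step 3: skip the continuation lines of the echoed command, so that
    # option names containing "error" are not reported as errors.
    if echo_idx is None:
        start = 0
    else:
        end_of_cmd = echo_idx
        while end_of_cmd < fail_idx and log_lines[end_of_cmd].rstrip().endswith("\\"):
            end_of_cmd += 1
        start = end_of_cmd + 1

    # Step 4: report the first error line between the command and the failure.
    for i in range(start, fail_idx):
        if "error" in log_lines[i].lower():
            return echo_idx, i
    return echo_idx, None
```

To try it, read the log with `open("cwltool.log").read().splitlines()` and print the line at the returned error index; a result of `None` means the log contains no permanentFail at all.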

For official submissions to GenBank we accept only contigs of at least 200 nucleotides, so running with --ignore-all-errors is a valid choice if you use this annotation for your own purposes. If you want to submit it to GenBank, you will have to remove the short contigs first.
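One possible way to remove the short contigs before rerunning is a small filter over the assembly FASTA. The sketch below is an assumption, not a PGAP tool: the function names `read_fasta` and `drop_short_contigs` are hypothetical, and the 200-nucleotide threshold is taken from the fastaval message above ("shorter than 200 nucleotides"), so sequences of exactly 200 are kept.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None and line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def drop_short_contigs(lines, min_length=200):
    """Keep only contigs with at least min_length nucleotides."""
    return [(h, s) for h, s in read_fasta(lines) if len(s) >= min_length]


def write_fasta(records, handle, width=80):
    """Write (header, sequence) pairs, wrapping sequence lines at width."""
    for header, seq in records:
        handle.write(f">{header}\n")
        for start in range(0, len(seq), width):
            handle.write(seq[start:start + width] + "\n")
```

A typical use would open the assembly for reading, pass the file handle to `drop_short_contigs`, and write the result to a new file with `write_fasta`; the input and output file names are up to you. Keeping the original assembly untouched makes it easy to check which contigs were dropped.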

@Leobpfbac
Author

Oh, you're right.
I'll work on this and run pgap later.
Thanks @azat-badretdin

@azat-badretdin
Contributor

Happy landings, user @Leobpfbac !
