BUSCO (Benchmarking Universal Single-Copy Orthologs) fails with isONform output #13

Open
NStrowbridge opened this issue Oct 26, 2023 · 3 comments

NStrowbridge commented Oct 26, 2023

As the title indicates, I recently ran the full isONform pipeline on data generated with the ONT PCR cDNA kit (SQK-PCS109). Running BUSCO on the output "transcriptome.fastq" gives the error message "ERROR: The input file does not contain nucleotide sequences.", so BUSCO must not be recognizing the output as a transcriptome. Looking directly at the transcriptome file, I noticed that each nucleotide sequence is followed by "+ +++++++++++++ (repeating)", which I assume is QC results? I'm not sure whether this has anything to do with it.

If you could point me in the right direction for getting this issue sorted, or suggest other QC steps for the de novo transcriptome, that would be greatly appreciated.

Kind regards,

Nic Strowbridge, MSc

NStrowbridge commented Oct 26, 2023

Never mind, I realized my mistake! I converted the file from FASTQ to FASTA format and BUSCO is now working. However, I am still interested in any advice you have on further QC steps for de novo transcriptomes.
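
For readers hitting the same BUSCO error, the FASTQ-to-FASTA conversion mentioned above can be sketched in a few lines of Python. This is a minimal sketch, not the tool the poster actually used (the thread does not say). It assumes every record spans exactly four lines (header, sequence, "+", quality), which matches the file layout described in this issue. The function name `fastq_to_fasta` and the output file name `transcriptome.fasta` are hypothetical.

```python
import io


def fastq_to_fasta(fastq_handle, fasta_handle):
    """Write each four-line FASTQ record as a two-line FASTA record.

    Assumption: records are exactly four lines (header, sequence, '+',
    quality); multi-line FASTQ sequences are not handled.
    Returns the number of records written.
    """
    count = 0
    while True:
        header = fastq_handle.readline()
        if not header:
            break  # end of file
        header = header.rstrip("\r\n")
        if not header:
            continue  # skip stray blank lines between records
        if not header.startswith("@"):
            raise ValueError(f"expected a FASTQ header starting with '@', got: {header!r}")
        sequence = fastq_handle.readline().rstrip("\r\n")
        fastq_handle.readline()  # separator line starting with '+', discarded
        fastq_handle.readline()  # quality line, discarded
        # A FASTA header is the FASTQ header with '@' replaced by '>'.
        fasta_handle.write(">" + header[1:] + "\n" + sequence + "\n")
        count += 1
    return count


if __name__ == "__main__":
    # Tiny in-memory demonstration with placeholder quality values.
    demo_in = io.StringIO("@transcript_1\nACGTACGT\n+\n++++++++\n")
    demo_out = io.StringIO()
    fastq_to_fasta(demo_in, demo_out)
    print(demo_out.getvalue(), end="")
```

On a real file, the same function could be called with two open file handles, for example `open("transcriptome.fastq")` for reading and `open("transcriptome.fasta", "w")` for writing; the resulting FASTA file is what BUSCO would then be given in transcriptome mode.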

ksahlin commented Oct 27, 2023

Hi Nic,

You are correct.

On that note, @aljpetri, we should output FASTA files of the transcriptome instead of FASTQ (the quality values are long gone by this stage).

@aljpetri

Hi,
How did the BUSCO analysis go? Please feel free to give feedback should you find anything odd.
I have now rolled out a new release of isONform that outputs FASTA files by default instead of FASTQ.
Concerning other QC steps for transcriptomes: this depends mainly on which data you have available. If you have a reference for the organism you are interested in, you could try running SQANTI to learn more about the transcriptome. If you do not have any additional data available, I do not know of any QC steps that you could perform.
Best,
Alex
