seqtk telo can't find results with my data #215

Open
ksc289 opened this issue Jun 26, 2024 · 6 comments

Comments

@ksc289

ksc289 commented Jun 26, 2024

Hi :)
I have used the following command:
seqtk telo assembly.fa > telo.bed 2 > telo.count

I instantly get two files: telo.bed (empty) and telo.count which shows me the following:
more telo.count
0 1272251157

Why can't you find telomeres in my assembly?

Thank you so much,
Karen.

@shenwei356

seqtk telo assembly.fa > telo.bed 2 > telo.count

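I think there should be no space between 2 and >, i.e. it should read 2> telo.count. With the space, the shell passes 2 to seqtk as an extra argument, and the second > redirects stdout again instead of redirecting stderr.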

Why can't you find telomeres in my assembly?

This could be phrased more politely, e.g. "Why are no telomere repeats found?".

The reason might simply be that no targets are detected with the current algorithm and parameters. I also tested with a T2T human chromosome, and it worked.
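
The redirection point can be checked without seqtk. Below is a minimal sketch in which `emit` is a hypothetical stand-in command that writes one line to stdout and one to stderr; it is not meant to mirror what seqtk writes to each stream.

```shell
# Hypothetical stand-in: one line to stdout, one line to stderr.
emit() {
    echo "record"        # stdout
    echo "summary" >&2   # stderr
}

tmp=$(mktemp -d)

# "2>" written as one token redirects stderr to its own file.
emit > "$tmp/good.out" 2> "$tmp/good.err"

# With a space, "2" is passed to the command as an argument and the
# second ">" redirects stdout again. The first file is created but
# stays empty, the second file receives stdout, and stderr still
# goes to the terminal.
emit > "$tmp/bad.out" 2 > "$tmp/bad.err"

ls -l "$tmp"
```

In the second form the file named after `2 >` ends up holding stdout, which may explain why one of the two output files is always empty.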

@arslan9732

arslan9732 commented Jul 9, 2024

Try other telomeric motifs. The default motif (TTAGGG) is not necessarily present in your assembly. You can try the following:
CCCTAAA
AAATCCC
TAGGC
....

@Sung-hub

Could you tell me how to search for a telomeric motif other than the default one?

@arslan9732

@Sung-hub
seqtk telo -m CCCTAAA genome.fa
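
To try several candidate motifs in one go, a small loop around the command above may help. This is only a sketch: `try_motifs` is a hypothetical helper name, and `genome.fa` and the motif list are placeholders. The only seqtk option used is `-m`, as in the command above.

```shell
# Run "seqtk telo -m MOTIF FASTA" for each motif given after the
# FASTA path, printing the motif before each result.
try_motifs() {
    fasta=$1
    shift
    for motif in "$@"; do
        echo "== $motif"
        seqtk telo -m "$motif" "$fasta"
    done
}

# Example call (requires seqtk on PATH; names are placeholders):
# try_motifs genome.fa CCCTAAA AAATCCC TAGGC
```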

@Sung-hub

Thank you for your response! I tried several possible telomeric sequences and did not get any hits.
[sung.shin@arsnecla0ap2 200088meta_Tangled_contigs]$ seqtk telo -m CCCCAAT all_tangel_200088.fasta
0 374789249
[sung.shin@arsnecla0ap2 200088meta_Tangled_contigs]$ seqtk telo -m CCCAATC all_tangel_200088.fasta
0 374789249
[sung.shin@arsnecla0ap2 200088meta_Tangled_contigs]$ seqtk telo -m AAAT all_tangel_200088.fasta
0 374789249
[sung.shin@arsnecla0ap2 200088meta_Tangled_contigs]$ seqtk telo -m AAAATAAA all_tangel_200088.fasta
0 374789249
[sung.shin@arsnecla0ap2 200088meta_Tangled_contigs]$ seqtk telo -m TGGGGAT all_tangel_200088.fasta
0 374789249

I know those sequences occur frequently in the assembly, but they might not be exactly at the ends of the contigs (mostly several hundred bases inside). Could the location of the telomeric sequence prevent seqtk telo from identifying it?

@arslan9732

seqtk telo only looks at the ends of each sequence. Look at your sequences' ends manually. Most likely the telomeric sequences were lost during scaffolding, which happens often.
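
One way to look at the ends manually is to print the first and last bases of every contig and scan them by eye or with grep. The following is a minimal sketch using awk; `fasta_ends` is a hypothetical helper name and the default window of 60 bases is arbitrary. It builds each sequence in memory, so it may be slow on very large contigs.

```shell
# Print name, first N bases and last N bases of every FASTA record,
# tab-separated. Multi-line records are joined before slicing.
# Usage: fasta_ends FILE [N]
fasta_ends() {
    awk -v n="${2:-60}" '
        function emit_record(   len, tail_start) {
            if (name == "") return
            len = length(seq)
            tail_start = (len > n) ? len - n + 1 : 1
            print name "\t" substr(seq, 1, n) "\t" substr(seq, tail_start)
        }
        /^>/ { emit_record(); name = substr($1, 2); seq = ""; next }
             { seq = seq $0 }
        END  { emit_record() }
    ' "$1"
}

# Example call (file name is a placeholder):
# fasta_ends assembly.fa 500 | head
```

If the expected repeat only shows up some distance inside the window rather than at the very start or end, that would be consistent with the explanation above.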
