Subseq extracting reads with query name list failed #169

yangyxt · 2021-03-02T17:12:14Z

I am using seqtk 1.3. I ran seqtk subseq <in.fq> <name.lst> to extract reads from a FASTQ file, and it failed.

in.fq is a simulated file, so each query name ends with an ascending ID number (for example, sim_sample_1_chr7-chr7-1).

I used awk to confirm that the query names listed in name.lst exist in the FASTQ file. Then I tried extracting the first several reads, using a list file containing the names sim_sample_1_chr7-chr7-1, sim_sample_1_chr7-chr7-3, and sim_sample_1_chr7-chr7-5 (one name per line), and it worked. But when I chose a query name that appears much later in the FASTQ file, seqtk subseq failed to extract it!
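For anyone who wants to repeat the existence check without awk, here is a minimal Python sketch. It assumes a strict 4-line-per-record FASTQ file, and the function names (fastq_names, locate_names) are made up for illustration. It reads the file independently of seqtk, so it only shows where each name sits in the file, not what seqtk does with it.

```python
def fastq_names(path):
    """Yield (record_index, name) pairs, assuming strict 4-line FASTQ records."""
    with open(path) as handle:
        for line_no, line in enumerate(handle):
            if line_no % 4 == 0:
                # Header line looks like '@name optional comment'.
                fields = line[1:].split()
                yield line_no // 4 + 1, (fields[0] if fields else "")


def locate_names(fastq_path, list_path):
    """Return ({name: record_index}, [missing names]) for a one-name-per-line list."""
    with open(list_path) as handle:
        wanted = {line.strip() for line in handle if line.strip()}
    found = {}
    for index, name in fastq_names(fastq_path):
        if name in wanted and name not in found:
            found[name] = index  # 1-based position of the record in the file
    missing = sorted(wanted - set(found))
    return found, missing
```

Calling locate_names("in.fq", "name.lst") returns the 1-based record position of every listed name plus the names that were not found, which makes it easy to see how deep in the file a failing name sits.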

In my tests, fetching any read that comes before the query name sim_sample_1_chr7-chr7-343063 works fine. Any query name that comes after it fails to be extracted.

Here is an example. First, a screenshot of a test name.lst:

(I assure you that every query name in this list exists in in.fq, as confirmed with awk.)

Then a screenshot of the sequences extracted by running seqtk subseq in.fq name.lst | less -S -

I am confused about why this happens. Does seqtk only read part of the FASTQ file into memory for inspection? Please take a look at this issue at your convenience. Much appreciated.


yangyxt · 2021-03-03T03:57:43Z

I just used seqtk seq to view the same FASTQ file and found that the output ends at the query name sim_sample_1_chr7-chr7-343063. Why is there a limit on how much of the data seqtk reads? I don't see anything in the manual about such a limit, or any option I can use to remove it.
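One thing that might be worth checking is whether the record at or just after that read is well formed. This is only a guess: nothing in this thread confirms that a malformed record is the cause. The sketch below assumes strict 4-line FASTQ records and uses a made-up helper name (first_malformed_record). seqtk has its own parser and may treat the file differently.

```python
def first_malformed_record(path):
    """Return (record_index, name, reason) for the first suspicious record, or None.

    Assumes strict 4-line FASTQ: header, sequence, '+' separator, quality.
    """
    with open(path) as handle:
        index = 0
        while True:
            header = handle.readline()
            if not header:
                return None  # reached end of file without finding a problem
            index += 1
            seq = handle.readline().rstrip("\r\n")
            plus = handle.readline()
            qual = handle.readline().rstrip("\r\n")
            fields = header[1:].split()
            name = fields[0] if fields else ""
            if not header.startswith("@"):
                return index, name, "header does not start with '@'"
            if not plus.startswith("+"):
                return index, name, "separator line does not start with '+'"
            if len(seq) != len(qual):
                return index, name, "sequence and quality lengths differ"
```

If first_malformed_record("in.fq") returns a record close to number 343063, the input file itself deserves a closer look. If it returns None, the file passes these basic checks and the cause lies elsewhere.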
