Mapping using different parameters --very-sensitive and default #430

Open
AllanOkwaro opened this issue May 14, 2024 · 0 comments

Dear developers,

I have mapped some reads using hisat2 and I am getting different results when I change the preset from the default --sensitive to --very-sensitive. The difference in mapping rate is huge: sequences with a mapping rate of 50% suddenly have a mapping rate of over 80%. I am not sure whether this is what the software is supposed to do, or whether the --very-sensitive parameter increases the number of mismatches allowed.
My scripts are below. For the default settings, I use:

```bash
#!/bin/bash

# Load the HiSAT2 module if needed
module load bio/hisat2/2.2.1

# Create the output directory if it doesn't exist
mkdir -p hisatout

# Iterate over files ending with "_1.fq.gz" in the current directory
for forward_read_file in *_1.fq.gz; do
    # Extract file basename without extension
    file_basename="${forward_read_file%%_1.fq.gz}"

    # Define output files
    output_sam="hisatout/${file_basename}.sam"
    report_file="hisatout/${file_basename}.report"
    reverse_read_file="${file_basename}_2.fq.gz"

    # Run HiSAT2 with specified options (each sample runs in the background)
    hisat2 -p 64 \
           -x mbel.index \
           -1 "$forward_read_file" \
           -2 "$reverse_read_file" \
           -S "$output_sam" \
           --summary-file "$report_file" &
done

# Wait for all background processes to finish
wait
```
For the --very-sensitive parameter, I used

```bash
#!/bin/bash

# Load the HiSAT2 module if needed
module load bio/hisat2/2.2.1

# Create the output directory if it doesn't exist
mkdir -p hisatout

# Iterate over files ending with "_1.fq.gz" in the current directory
for forward_read_file in *_1.fq.gz; do
    # Extract file basename without extension
    file_basename="${forward_read_file%%_1.fq.gz}"

    # Define output files
    output_sam="hisatout/${file_basename}.sam"
    report_file="hisatout/${file_basename}.report"
    reverse_read_file="${file_basename}_2.fq.gz"

    # Run HiSAT2 with specified options (each sample runs in the background)
    hisat2 -p 64 \
           -x mbel.index \
           -1 "$forward_read_file" \
           -2 "$reverse_read_file" \
           -S "$output_sam" \
           --very-sensitive \
           --summary-file "$report_file" &
done

# Wait for all background processes to finish
wait
```

The only difference is that the first script uses the default hisat2 preset, while the second adds the --very-sensitive parameter.
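Both scripts above write to the same hisatout directory, so running one after the other replaces the earlier SAM and report files. A minimal sketch of how the two runs could be kept side by side is shown here. The function name `map_sample`, the `hisatout_<label>` directory layout, the `HISAT2_BIN` variable, and the thread count of 8 are assumptions made for illustration and are not part of the original scripts; only hisat2 options already used above appear in the command.

```shell
#!/bin/bash
# Sketch only: map one sample under a named preset, keeping outputs per preset.
# map_sample, HISAT2_BIN and the hisatout_<label> layout are hypothetical names.
map_sample() {
    local label="$1" forward="$2"
    shift 2                                  # remaining arguments are extra hisat2 options
    local base="${forward%%_1.fq.gz}"
    local outdir="hisatout_${label}"
    mkdir -p "$outdir"

    # Set HISAT2_BIN=echo for a dry run that only prints the arguments.
    "${HISAT2_BIN:-hisat2}" -p 8 -x mbel.index \
        -1 "$forward" -2 "${base}_2.fq.gz" \
        -S "${outdir}/${base}.sam" \
        --summary-file "${outdir}/${base}.report" "$@"
}

# Usage (one sample at a time, both presets):
# for f in *_1.fq.gz; do
#     map_sample default "$f"
#     map_sample very_sensitive "$f" --very-sensitive
# done
```

Running the samples one after another instead of backgrounding every job is a choice made for this sketch so that the thread count stays predictable; it does not change how hisat2 aligns reads.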

The output files are quite different

--very-sensitive:

    29691907 reads; of these:
      29691907 (100.00%) were paired; of these:
        13836874 (46.60%) aligned concordantly 0 times
        15535622 (52.32%) aligned concordantly exactly 1 time
        319411 (1.08%) aligned concordantly >1 times
        ----
        13836874 pairs aligned concordantly 0 times; of these:
          19384 (0.14%) aligned discordantly 1 time
        ----
        13817490 pairs aligned 0 times concordantly or discordantly; of these:
          27634980 mates make up the pairs; of these:
            26189715 (94.77%) aligned 0 times
            1358482 (4.92%) aligned exactly 1 time
            86783 (0.31%) aligned >1 times
    55.90% overall alignment rate

Default:

    29691907 reads; of these:
      29691907 (100.00%) were paired; of these:
        15713848 (52.92%) aligned concordantly 0 times
        13723023 (46.22%) aligned concordantly exactly 1 time
        255036 (0.86%) aligned concordantly >1 times
        ----
        15713848 pairs aligned concordantly 0 times; of these:
          136412 (0.87%) aligned discordantly 1 time
        ----
        15577436 pairs aligned 0 times concordantly or discordantly; of these:
          31154872 mates make up the pairs; of these:
            29552270 (94.86%) aligned 0 times
            1504759 (4.83%) aligned exactly 1 time
            97843 (0.31%) aligned >1 times
    50.24% overall alignment rate
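For reference, the overall alignment rate in both summaries is consistent with counting individual mates: with N read pairs there are 2N mates, and the rate is the share of mates not listed as "aligned 0 times" on the final level. The sketch below reproduces the two percentages from the numbers reported above; the function name `paired_overall_rate` is a hypothetical helper, not part of hisat2.

```shell
#!/bin/bash
# Sketch: recompute the overall alignment rate of a paired-end summary.
# $1 = number of read pairs, $2 = mates that "aligned 0 times" (last level of the summary).
paired_overall_rate() {
    awk -v pairs="$1" -v unaligned="$2" \
        'BEGIN { printf "%.2f\n", 100 * (2 * pairs - unaligned) / (2 * pairs) }'
}

paired_overall_rate 29691907 26189715   # --very-sensitive run, prints 55.90
paired_overall_rate 29691907 29552270   # default run, prints 50.24
```

By this calculation the two runs differ by about 5.7 percentage points for this sample.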

My question, then, is how do I proceed from here? Should I use the default settings or the --very-sensitive parameter? Also, in our lab, when a colleague changes these settings from default to very sensitive, the overall mapping rate shoots from 18% to 90%, which I find a bit off.
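To see whether a jump like that holds across all samples, the overall rates from the two sets of --summary-file reports can be listed side by side. This is a minimal sketch that assumes the reports of each run were saved in separate directories; the directory names `hisatout_default` and `hisatout_very_sensitive` and the function name `report_rate` are hypothetical.

```shell
#!/bin/bash
# Sketch: pull the percentage out of the "overall alignment rate" line of a summary file.
report_rate() {
    awk '/overall alignment rate/ { sub(/%/, "", $1); print $1 }' "$1"
}

# Usage (directory names are assumptions):
# for r in hisatout_default/*.report; do
#     s="$(basename "$r" .report)"
#     printf '%s\t%s\t%s\n' "$s" "$(report_rate "$r")" \
#         "$(report_rate "hisatout_very_sensitive/${s}.report")"
# done
```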
