Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ERROR - query end inconsistent when running parse #45

Open
asherbryant opened this issue Jul 24, 2023 · 7 comments
Open

ERROR - query end inconsistent when running parse #45

asherbryant opened this issue Jul 24, 2023 · 7 comments

Comments

@asherbryant
Copy link

Hello!

We are analyzing COLO829 bams (fastqs from the Valle-Inclan truthset paper that were realigned with minimap2) & getting the following errors:

07/24/2023 03:17:41 - nanomonsv.parse - ERROR - query end inconsistent!! ERR2752452.116: 18078 != 5227
07/24/2023 03:17:41 - nanomonsv.parse - ERROR - query end inconsistent!! ERR2752451.4333814: 5045 != 4714

Below is the script:

#!/bin/bash

# bam files
TUMOR_BAM=/data/Lab/colo829_truthset/colo829_tumor_truthset_minimap2.bam
NORMAL_BAM=/data/Lab/colo829_truthset/colo829_normal_truthset_minimap2.bam

# reference files
HG=/data/Lab/references/human_g1k_v37.fasta

# parameters
SIZE=30
DIRECTORY=/data/Lab/bm/truthset_30bp


# source conda init file
source myconda

# nanomonsv
mkdir $DIRECTORY/nanomonsv
mamba activate nanomonsv
#-----parse-----
nanomonsv parse $TUMOR_BAM $DIRECTORY/nanomonsv/nanomonsv_tumor
nanomonsv parse $NORMAL_BAM $DIRECTORY/nanomonsv/nanomonsv_normal
#----get----
nanomonsv get $DIRECTORY/nanomonsv/nanomonsv_tumor $TUMOR_BAM \
$HG \
--control_prefix $DIRECTORY/nanomonsv/nanomonsv_normal --control_bam $NORMAL_BAM \
--use_racon \
--single_bnd \
--min_indel_size $SIZE \
--debug

Any help would be appreciated!

@friend1ws
Copy link
Owner

Hi, This data should be OK (because we have actively used this dataset).
Could you just in case show me the header describing the alignment options bu e.g., the following command?

samtools view -H ERR2752452.bam | grep PG

@asherbryant
Copy link
Author

Thank you for the reply!

Header for colo829_tumor_truthset_minimap2.bam:

@PG	ID:minimap2	PN:minimap2	VN:2.26-r1175	CL:minimap2 -ax map-ont --eqx -t 30 /data/Lab/references/human_g1k_v37.fasta /data/Lab/colo829_truthset/fastq/ERR2752452.fastq.gz
@PG	ID:samtools	PN:samtools	PP:minimap2	VN:1.17	CL:samtools sort -@4 -m 4G
@PG	ID:samtools.1	PN:samtools	PP:samtools	VN:1.17	CL:samtools view -H colo829_tumor_truthset_minimap2.bam

I also get a similar for COLO829 made publicly available by Pacbio.

07/25/2023 17:57:23 - nanomonsv.parse - ERROR - query end inconsistent!! m84039_230312_025934_s1/227480300/ccs: 13586 != 1900
07/25/2023 17:57:24 - nanomonsv.parse - ERROR - query end inconsistent!! m84039_230327_233730_s2/157488965/ccs: 909 != 628

Script:

#!/bin/bash

# bam files
TUMOR_BAM=/data/Lab/pacbio/colo829/COLO829.GRCh38.bam
NORMAL_BAM=/data/Lab/pacbio/colo829/COLO829-BL.GRCh38.bam

# reference files
HG=/data/Lab/references/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna

# parameters
DIRECTORY=/data/Lab/severus_benchmarking/pacbio


# source conda init file
source myconda

# nanomonsv
mkdir $DIRECTORY/nanomonsv
mamba activate nanomonsv
#-----parse-----
nanomonsv parse $TUMOR_BAM $DIRECTORY/nanomonsv/nanomonsv_tumor
nanomonsv parse $NORMAL_BAM $DIRECTORY/nanomonsv/nanomonsv_normal
#----get----
nanomonsv get $DIRECTORY/nanomonsv/nanomonsv_tumor $TUMOR_BAM \
$HG \
--control_prefix $DIRECTORY/nanomonsv/nanomonsv_normal --control_bam $NORMAL_BAM \
--use_racon \
--single_bnd \
--min_indel_size $SIZE \
--debug

Header for COLO829.GRCh38.bam:

samtools view -H COLO829.GRCh38.bam | grep PG
@PG	ID:ccs	PN:ccs	VN:7.0.0 (commit v7.0.0-7-g0c7f1adf)	DS:Generate circular consensus sequences (ccs) from subreads.	CL:/opt/pacbio/tag-ccs-current/bin/ccs --streamed --log-level INFO --stderr-json-log --kestrel-files-layout --movie-name m84039_230312_025934_s1 --log-file metadata/m84039_230312_025934_s1.ccs.log --min-rq 0.9 --non-hifi-prefix fail --knrt-ada --pbdc-model /opt/pacbio/tag-ccs-current/bin/../models/revio_v1.onnx --alarms metadata/m84039_230312_025934_s1.ccs.alarms.json
@PG	ID:lima	VN:2.7.1 (commit v2.7.1-1-gf067520)	CL:/opt/pacbio/tag-lima-current/bin/lima --movie-name m84039_230312_025934_s1 --kestrel-files-layout --quality hifi --output-missing-pairs --shared-prefix --hifi-preset SYMMETRIC-ADAPTERS --store-unbarcoded --split-named --reuse-source-uuid --reuse-biosample-uuids --stderr-json-log --alarms metadata/m84039_230312_025934_s1.hifi_reads.lima.alarms.json --log-file metadata/m84039_230312_025934_s1.hifi_reads.lima.log pb_formats/m84039_230312_025934_s1.hifi_reads.consensusreadset.primrose.xml metadata/m84039_230312_025934_s1.barcodes.fasta hifi_reads/m84039_230312_025934_s1.hifi_reads.demux.bam
@PG	ID:pbmm2	PN:pbmm2	VN:1.10.0 (commit v1.10.0)	CL:pbmm2 align /pbi/dept/secondary/siv/smrtlink/smrtlink-appslabht/jobs-root/cromwell-executions/sl_import_fasta/5da8287b-e356-4510-a856-81a98e63b351/call-fasta_to_reference/execution/smrtlink_reference.referenceset.xml /pbi/dept/secondary/siv/smrtlink/smrtlink-appslabht/jobs-root/cromwell-executions/pb_align_ccs/8ce15aa8-502a-4815-b4f0-438794f3692e/call-prepare_input/prepare_input/ad1ca5d2-00af-47e3-8b0c-1b29d11a5884/call-dataset_filter/execution/filtered.consensusreadset.xml mapped.consensusalignmentset.xml --sort --min-gap-comp-id-perc 70.0 --min-length 50 --sample COLO829 --preset HiFi -j 16 --log-level INFO --log-file pbmm2.log --alarms alarms.json
@PG	ID:primrose	VN:1.4.0 (commit v1.3.0-14-gd9f6a32)	CL:/opt/pacbio/tag-primrose-current/bin/primrose --movie-name m84039_230312_025934_s1 --kestrel-files-layout --quality hifi --reuse-source-uuid --stderr-json-log --log-file metadata/m84039_230312_025934_s1.hifi_reads.primrose.log --alarms metadata/m84039_230312_025934_s1.hifi_reads.primrose.alarms.json
@PG	ID:samtools	PN:samtools	PP:primrose	VN:1.17	CL:samtools view -H COLO829.GRCh38.bam

@friend1ws
Copy link
Owner

Thanks!
nanomonsv cannot tread BAM files produced with the --eqx options.
Is it important for other purposes? Then, I may consider supporting it.

@asherbryant
Copy link
Author

Yes, It would be helpful for us if you could support --eqx option & Pacbio's official minimap2 wrapper, pbmm2. Thank you!

@xwBio
Copy link

xwBio commented Sep 8, 2023

Hi,

I also have a similar issue and it seems that the current version is still unable to effectively handle PacBio BAM files generated with --eqx option.

It would be appreciated If you could resolve this issue.

Best,
Xiwei

@friend1ws
Copy link
Owner

OK. I will try to solve this problem hopefully in the next version.

@xwBio
Copy link

xwBio commented Sep 8, 2023

OK. I will try to solve this problem hopefully in the next version.

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants