Get input file error #4382

Open
OZTaekOppa opened this issue Aug 22, 2024 · 5 comments
Comments

@OZTaekOppa

OZTaekOppa commented Aug 22, 2024

1. What were you trying to do?

Dear vg team,

Thank you for the great program.

FYI,
minigraph v0.21 (https://github.com/lh3/minigraph)
vg v1.56.0 (https://github.com/vgteam/vg)
vcfbub (https://github.com/pangenome/vcfbub)
Input file: Followed PanSN-spec: Pangenome Sequencing Naming (https://github.com/pangenome/PanSN-spec)

I encountered an issue while testing the minigraph GFA file using HPRC in the vg deconstruct step.

2. What did you want to happen?
Ensure LV annotations using vg deconstruct

3. What actually happened?

  • vg deconstruct -e -a '#' -P chm13 --snarls /data/minigraph_run/chm13_t2tctg_mgout.snarls /data/minigraph_run/chm13_t2tctg_mgout.gfa
    error:[get_input_file_name] unable to open input file: #
    error[VPKG::load_one]: Could not open # to determine file type

4. If you got a line like Stack trace path: /somewhere/on/your/computer/stacktrace.txt, please copy-paste the contents of that file here:

Executed command:
## Step 6: Ensure LV annotations using vg deconstruct
vg snarls ${OUTPUT_DIR}/chm13_t2tctg_mgout.gfa > ${OUTPUT_DIR}/chm13_t2tctg_mgout.snarls
vg deconstruct -e -a '#' -P chm13 --snarls ${OUTPUT_DIR}/chm13_t2tctg_mgout.snarls ${OUTPUT_DIR}/chm13_t2tctg_mgout.gfa > ${OUTPUT_DIR}/chm13_t2tctg_mgout.sv.lv.vcf

## Step 7: Compress the VCF with bgzip and index it with tabix
bgzip -c ${OUTPUT_DIR}/chm13_t2tctg_mgout.sv.lv.vcf > ${OUTPUT_DIR}/chm13_t2tctg_mgout.sv.lv.vcf.gz
tabix -p vcf ${OUTPUT_DIR}/chm13_t2tctg_mgout.sv.lv.vcf.gz

## Step 8: Remove large (> 10 Mb) spurious DELs in MC & PGGB graphs
singularity exec /singularityimg/pggb_latest.sif vcfbub -l 0 -r 10000000 -i ${OUTPUT_DIR}/chm13_t2tctg_mgout.sv.lv.vcf.gz > ${OUTPUT_DIR}/chm13_t2tctg_mgout.sv.lv.filterd.vcf.gz


In Step 6, the deprecation of the -H option was not a problem; the issue lies with the # symbol. I also tried the variations '\#', "#", and "\#", but all of them failed. The error message for '\#' was:


+ vg deconstruct -e -a '\#' -P chm13 --snarls /data/minigraph_run/chm13_t2tctg_mgout.snarls /data/minigraph_run/chm13_t2tctg_mgout.gfa
error:[get_input_file_name] unable to open input file: \#
error[VPKG::load_one]: Could not open \# to determine file type

5. What data and command can the vg dev team use to make the problem happen?
Human pangenome minigraph

6. What does running vg version say?
v1.56.0

@jeizenga
Contributor

It looks like you are trying to provide # as an argument to -a, but that option doesn't take an argument. Because of that, # is being interpreted as a positional argument. The only positional argument that vg deconstruct takes is the graph itself, so it's trying to open # as the graph instead of ${OUTPUT_DIR}/chm13_t2tctg_mgout.gfa.
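
The effect can be reproduced with a toy parser. The sketch below uses bash's getopts with a hypothetical option string in which -e and -a are plain flags and -P takes a value. It is not vg's actual parser, which may treat argument order differently, but it shows why # is left over as the first positional argument.

```shell
#!/usr/bin/env bash
# Toy parser (hypothetical, not vg's code): -e and -a are flags without
# values, -P takes a value. Prints each positional argument on its own line.
parse_demo() {
  local OPTIND=1 opt prefix=""
  while getopts "eaP:" opt; do
    case "$opt" in
      e|a) ;;                  # boolean flags consume no value
      P) prefix="$OPTARG" ;;   # -P consumes the next word
      *) return 1 ;;           # unknown option
    esac
  done
  shift $((OPTIND - 1))
  printf '%s\n' "$@"
}

# '#' follows a flag that takes no value, so it is left over as a positional.
parse_demo -e -a '#' -P chm13 graph.gfa | head -n 1
```

Here the first leftover argument is #, which is the one a tool would then try to open as its input file. Dropping the '#' leaves graph.gfa as the first positional argument.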

@OZTaekOppa
Author

@jeizenga

Thank you for your reply.

Following your suggestions, I used these two commands:

Command 1:
vg deconstruct -e -a -t 4 --snarls ${OUTPUT_DIR}/chm13_t2tctg_mgout.snarls ${OUTPUT_DIR}/chm13_t2tctg_mgout.gfa > ${OUTPUT_DIR}/chm13_t2tctg_mgout.sv.lv.vcf

Command 2:
vg deconstruct -e -a -t 4 -P chm13 --snarls ${OUTPUT_DIR}/chm13_t2tctg_mgout.snarls ${OUTPUT_DIR}/chm13_t2tctg_mgout.gfa > ${OUTPUT_DIR}/chm13_t2tctg_mgout.sv.lv.vcf

Both commands generated only meta-information: the chm13_t2tctg_mgout.sv.lv.vcf file contains just the first header line and no data lines.

image

image

Did I miss something?

Kind regards,

Taek

@jeizenga
Contributor

I think I'll redirect this question to @glennhickey

@glennhickey
Contributor

You can't run vg deconstruct on minigraph output because it doesn't have the (non-reference) paths embedded in it. vg deconstruct needs the path information to work.
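
One rough way to check this is to count the explicit path records in the GFA. The sketch below assumes paths are stored as P lines (GFA 1.0) or W lines (GFA 1.1) and uses a hypothetical helper name, count_path_records. The tiny graph it builds is made up: it carries only rGFA-style SN/SO/SR tags on its S lines, similar in spirit to minigraph output, so the count comes out as zero.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count explicit path records (P or W lines) in a GFA.
count_path_records() {
  awk -F'\t' '$1 == "P" || $1 == "W" { n++ } END { print n + 0 }' "$1"
}

# Made-up two-segment graph with rGFA-style tags and no P/W lines.
demo_gfa="$(mktemp)"
printf 'S\ts1\tACGT\tSN:Z:chm13#chr1\tSO:i:0\tSR:i:0\n' >  "$demo_gfa"
printf 'S\ts2\tGG\tSN:Z:chm13#chr1\tSO:i:4\tSR:i:0\n'   >> "$demo_gfa"
printf 'L\ts1\t+\ts2\t+\t0M\n'                          >> "$demo_gfa"

count_path_records "$demo_gfa"
rm -f "$demo_gfa"
```

Run against the real graph, for example count_path_records ${OUTPUT_DIR}/chm13_t2tctg_mgout.gfa, a count of zero would suggest there are no embedded haplotype paths for vg deconstruct to compare against the reference.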

@OZTaekOppa
Author

@glennhickey, thanks for your reply. I will get back to you after testing again with the PGGB GFA files.
