Crash reading valid but weird VCF #187
Comments
just use: you might work if you open with |
I think something very insidious may be going on here. The following works:

    import cyvcf2
    print(f"version: {cyvcf2.__version__}")
    fh = open("./bad_vcf.vcf")
    reader = cyvcf2.Reader(fh)
    for v in reader:
        print(v)
        print(f"gt_types: {v.gt_types}")

I am not too familiar with CPython vs. Python, but could the fact we cache the file handle returned by |
Some quick googling found maybe a red herring, or maybe not: https://stackoverflow.com/a/8011863 |
oh, good catch, we probably need a weakref to the file handle. |
can also use a :
around the entire script so it doesn't get collected |
@brentp where do you store the weakref? VCF? HtsFile? |
I think in the htsfile makes sense but don't have a strong preference. |
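To make the hypothesis in this thread concrete, here is a minimal standard-library sketch of what can happen when only the integer descriptor of a file object is cached. It does not use cyvcf2 and is not its implementation: `FdOnlyReader`, `KeepAliveReader`, and `fd_is_open` are hypothetical names made up for illustration, and the immediate close relies on CPython's reference counting (other interpreters may close the file later).

```python
import gc
import os
import tempfile


def fd_is_open(fd: int) -> bool:
    """Return True if the OS still considers the descriptor valid."""
    try:
        os.fstat(fd)
        return True
    except OSError:
        return False


class FdOnlyReader:
    """Hypothetical reader that caches only the integer descriptor."""

    def __init__(self, fh):
        self.fd = fh.fileno()  # no reference to fh is kept


class KeepAliveReader:
    """Hypothetical reader that also holds a strong reference to the file object."""

    def __init__(self, fh):
        self._fh = fh          # keeps the file object (and its descriptor) alive
        self.fd = fh.fileno()


# Create a small throwaway file to open.
with tempfile.NamedTemporaryFile("w", suffix=".vcf", delete=False) as tmp:
    tmp.write("##fileformat=VCFv4.2\n")
    path = tmp.name

# The file object passed here has no other owner, so it can be collected
# (and its descriptor closed) as soon as the constructor returns.
fd_only = FdOnlyReader(open(path))
gc.collect()
fd_only_open = fd_is_open(fd_only.fd)

# Here the reader owns the file object, so the descriptor stays valid.
keep_alive = KeepAliveReader(open(path))
gc.collect()
keep_alive_open = fd_is_open(keep_alive.fd)

print(f"fd-only reader descriptor still open: {fd_only_open}")
print(f"keep-alive reader descriptor still open: {keep_alive_open}")

# Clean up.
keep_alive._fh.close()
os.unlink(path)
```

If this is what is going on, any code reading through the cached descriptor after the file object is collected would be reading from a closed (or reused) descriptor, which would be consistent with the varying exceptions reported. Note that the sketch keeps the object alive with an ordinary (strong) reference; a weak reference on its own would not prevent collection.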
Hi, I'm getting strange genotype calls and then a crash with my GATK VCF.
The zygosity call is strange, but the VCF passes validation.
I've reduced it down to a few lines (VCF at bottom of issue), so you can reproduce the issue.
Output:
Issue 1 - crash:
I think it's uninitialised memory, as sometimes you get a different exception, for example:
Issue 2
Dunno what GATK is doing here, but ALT is ".", and GT = ".|."
cyvcf2 calls the genotype as 3 (HOM ALT) - I think it should call this as 2 (UNKNOWN)
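As a sketch of the behaviour being asked for in Issue 2, the snippet below classifies a GT string into the numeric codes mentioned above. It is not cyvcf2's code: `classify_gt` is a hypothetical helper, the codes 0 (HOM REF) and 1 (HET) are assumed to match cyvcf2's default numbering alongside the 2 and 3 quoted in this issue, and treating a partially missing genotype such as `./1` as UNKNOWN is a further assumption.

```python
import re

# Codes 2 and 3 come from the issue text; 0 and 1 are assumed.
HOM_REF, HET, UNKNOWN, HOM_ALT = 0, 1, 2, 3


def classify_gt(gt: str) -> int:
    """Classify a VCF GT string such as '0/1' or '.|.' into a type code."""
    alleles = re.split(r"[/|]", gt)      # accept phased and unphased separators
    if any(a == "." for a in alleles):   # any missing allele -> UNKNOWN (assumption)
        return UNKNOWN
    indices = [int(a) for a in alleles]
    if all(i == 0 for i in indices):
        return HOM_REF
    if len(set(indices)) == 1:           # same non-reference allele on every copy
        return HOM_ALT
    return HET


for gt in (".|.", "0/0", "0|1", "1/1"):
    print(gt, classify_gt(gt))
```

Under these assumptions the record from this issue, with GT = ".|.", is classified as 2 (UNKNOWN) rather than 3 (HOM ALT).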
VCF