Prevent Bad Error if 'vcftools' not Present #321
Merged
Brought to attention by issue #320
If working with VCF data (ex: TB tutorial) and `vcftools` is not installed, a file with one blank line is output from `augur filter` (first step of tutorial) as `filtered.vcf.gz`. `augur mask` then tries to process this. However, the error that's produced is: `UnboundLocalError: local variable 'chromName' referenced before assignment`
Which gives absolutely no information as to what the original problem was (in the previous step!).
I added some checks for existing and empty files in `mask.py`, but this actually doesn't solve this problem, as the file isn't 'empty' (size 0). The problem is caused because the call is something like:
vcftools --remove-indv G22696 --gzvcf data/lee_2015.vcf.gz --recode --stdout | gzip -c > results/filtered.vcf.gz
When the `vcftools` part fails, the `gzip` part goes ahead and zips up nothing, creating the one-blank-line file.
Interestingly, this is not caught by the `utils.py` function `run_shell_command` (which is what puts the call through). This function uses `subprocess.check_call` to check that the call is successful, but apparently this does not catch `/bin/sh: 1: vcftools: not found`. (I could not find much info about this online; it proved hard to google.)
It seemed the simplest way to address this was just to check whether `vcftools` is installed (if using VCF data) at the beginning of `filter` and give the user a useful error message about it. I did this with `which` from `shutil`.
It might be preferable to implement this somehow in `run_shell_command` in a more general way, so that any external program call is checked, but I was unsure whether we could reliably break off the first bit of the command sent to `run_shell_command` as the part to check.
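For reviewers, here is a minimal sketch of both halves of the problem. It is not the code in this PR or in `utils.py`; the function names `pipeline_exit_status`, `missing_tools` and `require_tools` are hypothetical, and it assumes a POSIX `/bin/sh` without `pipefail`. A likely explanation for the silent failure is that the shell reports the exit status of the last command in a pipeline, so a missing first command is masked by a successful `gzip`. The sketch uses `cat` in place of `gzip` and a made-up program name to show this, followed by an up-front check with `shutil.which`.

```python
import shutil
import subprocess


def pipeline_exit_status(command):
    """Run a shell command line and return the exit status the shell reports.

    stderr is discarded so the shell's "not found" message is not printed.
    """
    return subprocess.call(command, shell=True, stderr=subprocess.DEVNULL)


def missing_tools(names):
    """Return the subset of program names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


def require_tools(names):
    """Stop with a readable message if any required external program is missing.

    Hypothetical helper: the caller lists the programs explicitly, so no
    parsing of the shell command string is needed.
    """
    missing = missing_tools(names)
    if missing:
        raise SystemExit(
            "ERROR: required program(s) not found on PATH: " + ", ".join(missing)
        )


if __name__ == "__main__":
    # The first stage does not exist, but the last stage (cat) succeeds,
    # so the pipeline as a whole reports success under a plain POSIX sh.
    status = pipeline_exit_status("no_such_tool_xyz_12345 | cat > /dev/null")
    print("pipeline exit status:", status)

    # Checking up front gives the user an actionable message instead.
    require_tools(["no_such_tool_xyz_12345"])
```

Passing the tool names explicitly sidesteps the question of whether the first token of the command string can be split off reliably; a caller that builds the `vcftools` command could call `require_tools(["vcftools"])` just before running it.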