Adds support for compressed strain name files #730
Thanks, @benjaminotter! There were a couple of subtle bugs that I noted in the comments below. I'll add fixes for these and push a rebased version of all changes into a single commit. Then I think we're good to merge.
augur/utils.py
Outdated
@@ -25,18 +27,6 @@ class AugurException(Exception):


@contextmanager
We need to delete this line, too, so we don't turn `is_vcf` into a context manager.
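To illustrate why the leftover decorator matters, here is a minimal sketch. The body of `is_vcf` below is a simplified stand-in (an assumption, not the code in `augur/utils.py`), and `is_vcf_decorated` is a hypothetical name; the point is only that a stray `@contextmanager` silently wraps whichever function is defined next.

```python
from contextlib import contextmanager


def is_vcf(fname):
    # Simplified stand-in for the real helper (assumption): treat names
    # ending in .vcf or .vcf.gz as VCF files.
    return fname.lower().endswith((".vcf", ".vcf.gz"))


# What happens if the decorator of a deleted function is left behind:
# it now applies to the next definition instead.
@contextmanager
def is_vcf_decorated(fname):
    return fname.lower().endswith((".vcf", ".vcf.gz"))


plain_result = is_vcf("strains.txt")
wrapped_result = is_vcf_decorated("strains.txt")

# The undecorated function returns a bool. The decorated one returns a
# context manager object, which is always truthy in an `if` check.
print(type(plain_result).__name__, bool(plain_result))
print(type(wrapped_result).__name__, bool(wrapped_result))
```

Because the wrapped call raises no error on its own, a check such as `if is_vcf(path):` would quietly take the VCF branch for every input, which is why this kind of bug is easy to miss.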
augur/utils.py
Outdated
@@ -713,7 +703,7 @@ def read_strains(*files, comment_char="#"):
     """
     strains = set()
     for input_file in files:
-        with open(input_file, 'r', encoding='utf-8') as ifile:
+        with open_file(input_file, 'r', encoding='utf-8') as ifile:
We have to drop the `encoding` keyword argument here because `xopen` does not support it. This is more evidence in support of implementing our own compression backend.
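As a rough idea of what a standard-library approach could look like, the sketch below opens plain or gzip-compressed text with an explicit encoding. It is not augur's `open_file` and does not use `xopen`; the name `open_text` and the magic-byte check are assumptions made for illustration, the strain name is made up, and only gzip is handled.

```python
import gzip
import os
import tempfile
from contextlib import contextmanager

GZIP_MAGIC = b"\x1f\x8b"


@contextmanager
def open_text(path, encoding="utf-8"):
    """Hypothetical helper: yield a text handle for a plain or gzipped file."""
    # Peek at the first two bytes to detect gzip regardless of file extension.
    with open(path, "rb") as raw:
        is_gzip = raw.read(2) == GZIP_MAGIC

    opener = gzip.open if is_gzip else open
    # Both open() and gzip.open() accept `encoding` in text mode ("rt").
    with opener(path, "rt", encoding=encoding) as handle:
        yield handle


# Round-trip a made-up, non-ASCII strain name through a gzipped file.
tmp_dir = tempfile.mkdtemp()
gz_path = os.path.join(tmp_dir, "include.txt.gz")
with gzip.open(gz_path, "wt", encoding="utf-8") as out:
    out.write("Côte_dIvoire/1/2020\n")

with open_text(gz_path) as handle:
    names = [line.strip() for line in handle]
print(names)
```

Detecting the format from the file contents rather than the extension is one possible design choice; a real backend would also need to decide how to handle other formats and write modes.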
Uses the new `open_file` from the `io` module to open strains files and thereby support compressed inputs. Removes the unused older `open_file` function in `utils.py`. Fixes #722
38e43c8 to 9246285
Codecov Report
@@            Coverage Diff             @@
##           master     #730      +/-   ##
==========================================
+ Coverage   31.51%   31.53%   +0.02%
==========================================
  Files          41       41
  Lines        5674     5666       -8
  Branches     1373     1371       -2
==========================================
- Hits         1788     1787       -1
+ Misses       3812     3805       -7
  Partials       74       74
Description of proposed changes
Replaces the `open` call in the `read_strains` function with `augur.io.open_file` to support compressed strain name files.
Related issue(s)
Fixes #722
Testing
Tested with compressed `include.txt` and `exclude.txt` files from the ncov workflow.
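A check along these lines can be reproduced without the ncov workflow. The sketch below is an approximation under stated assumptions: the body of `read_strains` (strip comments, skip blank lines, collect names into a set) is inferred from its signature, the opener is chosen by a `.gz` suffix rather than through `augur.io.open_file`, and the strain names are made up.

```python
import gzip
import os
import tempfile


def read_strains(*files, comment_char="#"):
    """Sketch of reading strain names from plain or gzipped text files."""
    strains = set()
    for input_file in files:
        # Assumption: pick the opener by suffix; augur's open_file may differ.
        opener = gzip.open if str(input_file).endswith(".gz") else open
        with opener(input_file, "rt", encoding="utf-8") as ifile:
            for line in ifile:
                # Drop anything after the comment character, then whitespace.
                name = line.split(comment_char)[0].strip()
                if name:
                    strains.add(name)
    return strains


tmp_dir = tempfile.mkdtemp()
include_path = os.path.join(tmp_dir, "include.txt.gz")
exclude_path = os.path.join(tmp_dir, "exclude.txt")

# One compressed and one uncompressed input, with comments and a blank line.
with gzip.open(include_path, "wt", encoding="utf-8") as out:
    out.write("# strains to include\nstrainA\nstrainB  # keep\n\n")
with open(exclude_path, "w", encoding="utf-8") as out:
    out.write("strainC\n")

strains = read_strains(include_path, exclude_path)
print(sorted(strains))
```

Mixing a compressed and an uncompressed file in one call exercises both code paths, which is the behavior the change is meant to support.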