-
Notifications
You must be signed in to change notification settings - Fork 128
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move filter
date parsing logic to dates
, use in frequencies
#890
Conversation
The multi-line string is an unnecessary artifact of a longer string used while working on 873a70a.
d2be0ba
to
e12e941
Compare
Codecov Report
@@ Coverage Diff @@
## master #890 +/- ##
=======================================
Coverage 34.64% 34.64%
=======================================
Files 41 42 +1
Lines 6003 6003
Branches 1536 1536
=======================================
Hits 2080 2080
Misses 3850 3850
Partials 73 73
Continue to review full report at Codecov.
|
@huddlej I think this should work but unsure what's the best way to test the --min-date/--max-date feature of frequencies. I've never used it directly and not familiar with the math going on within. Are you able to come up with a small snippet for test_frequencies.py? I can expand from that. |
d12f0b5
to
5df2749
Compare
to be more in line with #890 - define supported date formats via exhaustive list of regular expressions - add TODO for relative dates - factor out get_date_min/max() into any_to_numeric()/iso_to_numeric() - do sqlite3 UDF handling in try_get_numeric_date_min/max() - remove date_type()/valid_date() functions, replace with any_to_numeric_type_min/max() - this ensures args.min_date/max_date are numeric
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes look great, @victorlin. I like how simple and reusable the new date_parsing.py
module. I'll work on some tests for the frequencies module that rely on the this new interface. In the mean time, I had a comment threaded below where I propose a different organization of this new date-handling logic. We can chat in person about this and/or as a broader group, though, to see how others feel about the idea of a top-level dates
module. No need to change anything until I get tests written.
@victorlin Since this PR only changes the command line interface to the |
With #740, filter's numeric_date() provides support for 3 date formats. However, supporting various date formats is not specific to filter (e.g. frequencies has a separate numeric_date() which is now out-of-sync with this filter's numeric_date()). This commit: 1. Moves numeric_date() to a new submodule augur.dates 2. Moves the related SUPPORTED_DATE_HELP_TEXT to augur.dates 3. Updates numeric_date() to raise a TypeError rather than argparse.ArgumentTypeError so it can be generalized to non-argparse usage 4. Adds a new function numeric_date_type() which wraps numeric_date() and raises an argparse.ArgumentTypeError per #740 (comment)
Reasons: 1. Sync-up with new filter --min-date/--max-date features (support for relative dates) 2. Less code duplication
Copies a functional test for augur frequencies from the Zika functional tests and expands on this test with two additional tests of the min/max date range arguments using fixed and relative dates.
3413c81
to
d650f04
Compare
Just tested this with the ncov workflow using the following build config and all worked as expected: inputs:
- name: botswana-data
metadata: data/gisaid_auspice_input_hcov-19_2022_03_29_17.tar
sequences: data/gisaid_auspice_input_hcov-19_2022_03_29_17.tar
- name: references
metadata: data/references_metadata.tsv
sequences: data/references_sequences.fasta
builds:
botswana:
subsampling_scheme: recent
filter:
skip_diagnostics: true
min_length: 5000
subsampling:
recent:
recent_focal:
min_date: --min-date 1M
older_focal:
min_date: --min-date 2M
max_date: --max-date 1M
frequencies:
min_date: 2M |
filter
date parsing logic to utils
, use in frequencies
filter
date parsing logic to dates
, use in frequencies
Description of proposed changes
See commit messages
Related issue(s)
--min-date
and--max-date
#886Testing
frequencies