Support thresholded titer values #1118

Merged
merged 2 commits into from
Dec 29, 2022
5 changes: 5 additions & 0 deletions CHANGES.md
@@ -2,6 +2,11 @@

## __NEXT__

### Features

* titers: Support parsing of thresholded values (e.g., "<80" or ">2560"). [#1118][] (@huddlej)

[#1118]: https://github.com/nextstrain/augur/pull/1118

## 19.2.0 (19 December 2022)

48 changes: 44 additions & 4 deletions augur/titer_model.py
@@ -41,22 +41,51 @@ def load_from_file(filenames, excluded_sources=None):
<class 'dict'>
>>> len(measurements)
11
- >>> measurements[("A/Acores/11/2013", ("A/Alabama/5/2010", "F27/10"))]
- [80.0]
>>> len(strains)
13
>>> len(sources)
5

Inspect specific measurements. First, inspect a measurement that has a
specific value in the input.

>>> measurements[("A/Acores/11/2013", ("A/Alabama/5/2010", "F27/10"))]
[80.0]

Next, inspect a measurement that has a thresholded value at the lower
bound of detection (e.g., "<80"). This measurement should be reported as
one half of its threshold value (e.g., 40.0).

>>> measurements[("A/Acores/11/2013", ("A/Victoria/208/2009", "F7/10"))]
[40.0]

Inspect a measurement that has a thresholded value at the upper bound of
detection (">1280"). This measurement should be reported as twice its
threshold value (e.g., 2560.0).

>>> measurements[("A/Acores/SU43/2012", ("A/Texas/50/2012", "F36/12"))]
[2560.0]

Confirm that excluding sources produces fewer measurements.

>>> measurements, strains, sources = TiterCollection.load_from_file("tests/data/titer_model/h3n2_titers_subset.tsv", excluded_sources=["NIMR_Sep2013_7-11.csv"])
>>> len(measurements)
5

Request measurements for a test/reference/serum tuple that should not
exist after excluding its source.

>>> measurements.get(("A/Acores/11/2013", ("A/Alabama/5/2010", "F27/10")))
>>>

Missing titer data should produce an error.

>>> output = TiterCollection.load_from_file("tests/data/titer_model/missing.tsv")
Traceback (most recent call last):
File "<ipython-input-2-0ea96a90d45d>", line 1, in <module>
open("tests/data/titer_model/missing.tsv", "r")
FileNotFoundError: [Errno 2] No such file or directory: 'tests/data/titer_model/missing.tsv'

"""
if excluded_sources is None:
excluded_sources = []
@@ -70,10 +99,21 @@ def load_from_file(filenames, excluded_sources=None):
  with open_file(fname, 'r') as infile:
      for line in infile:
          entries = line.strip().split('\t')
+         titer = entries[4]
          try:
-             val = float(entries[4])
-         except:
+             # Convert values below or above the measurement
+             # threshold (e.g., "<80" or ">2560") to half or twice
+             # their thresholded value, respectively, so they can be
+             # included in models instead of being discarded.
+             if titer.startswith("<"):
+                 val = float(titer[1:]) / 2
+             elif titer.startswith(">"):
+                 val = float(titer[1:]) * 2
+             else:
+                 val = float(titer)
+         except ValueError:
              continue

          test, ref_virus, serum, src_id = (entries[0], entries[1],entries[2],
                                            entries[3])
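
The conversion in this hunk can be sketched as a standalone helper. This is a minimal illustration, not the code in `augur/titer_model.py`: the name `parse_titer` is hypothetical, and returning `None` for unparseable values (where the loader uses `continue`) is an assumption made for the example.

```python
from typing import Optional


def parse_titer(titer: str) -> Optional[float]:
    """Convert a raw titer string to a float (hypothetical helper).

    Values below the detection threshold ("<80") become half the threshold,
    values above it (">2560") become twice the threshold, and anything that
    float() rejects yields None.
    """
    try:
        if titer.startswith("<"):
            return float(titer[1:]) / 2
        if titer.startswith(">"):
            return float(titer[1:]) * 2
        return float(titer)
    except ValueError:
        # Assumption: signal "skip this row" with None instead of `continue`.
        return None


print(parse_titer("80"))     # 80.0
print(parse_titer("<80"))    # 40.0
print(parse_titer(">1280"))  # 2560.0
print(parse_titer("n/a"))    # None
```

Catching only `ValueError` means a bare "<" or ">" with no number is still skipped, because `float("")` raises `ValueError`.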

4 changes: 2 additions & 2 deletions tests/data/titer_model/h3n2_titers_subset.tsv
@@ -1,12 +1,12 @@
A/Acores/11/2013 A/Alabama/5/2010 F27/10 NIMR_Sep2013_7-11.csv 80 hi
A/Acores/11/2013 A/Athens/112/2012 F16/12 NIMR_Sep2013_7-11.csv 640 hi
A/Acores/11/2013 A/Berlin/93/2011 T/CF11/12 NIMR_Sep2013_7-11.csv 640 hi
- A/Acores/11/2013 A/Victoria/208/2009 F7/10 NIMR_Sep2013_7-11.csv 80 hi
+ A/Acores/11/2013 A/Victoria/208/2009 F7/10 NIMR_Sep2013_7-11.csv <80 hi
A/Acores/11/2013 A/Stockholm/18/2011 F28/11 NIMR_Sep2013_7-11.csv 160 hi
A/Acores/SU43/2012 A/Alabama/5/2010 F27/10 NIMR_Feb2013_18.csv 320 hi
A/Acores/SU43/2012 A/Hawaii/22/2012 F37/12 NIMR_Feb2013_18.csv 320 hi
A/Acores/11/2013 A/Perth/16/2009 F35/11 NIMR_Sep2013_7-11.csv 40 hi
- A/Acores/SU43/2012 A/Texas/50/2012 F36/12 NIMR_Feb2013_18.csv 1280 hi
+ A/Acores/SU43/2012 A/Texas/50/2012 F36/12 NIMR_Feb2013_18.csv >1280 hi
A/Adana/116/2014 A/Iowa/19/2010 F15/11 NIMR_Feb2014_9-09.csv 80 hi
A/Cairo/63/2012 A/Texas/50/2012 F36/12 NIMR_Feb2013_16.csv 1280 hi
A/Cairo/63/2012 A/Texas/50/2012 F36/12 NIMR_Sep2013_7-04.csv 640 hi
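
For orientation, the fixture rows above map onto the `measurements` dictionary that the doctests index by `(test, (reference, serum))`. The sketch below is an illustration only: `load_measurements` and `SAMPLE_TSV` are hypothetical names, the three rows are re-typed here with tab separators, and reading the sixth column as an assay type is an assumption. It is not `TiterCollection.load_from_file` itself.

```python
import io
from collections import defaultdict

# Three rows copied from the fixture. Columns (tab-separated): test virus,
# reference virus, serum, source, titer, and (assumed) assay type.
SAMPLE_TSV = (
    "A/Acores/11/2013\tA/Alabama/5/2010\tF27/10\tNIMR_Sep2013_7-11.csv\t80\thi\n"
    "A/Acores/11/2013\tA/Victoria/208/2009\tF7/10\tNIMR_Sep2013_7-11.csv\t<80\thi\n"
    "A/Acores/SU43/2012\tA/Texas/50/2012\tF36/12\tNIMR_Feb2013_18.csv\t>1280\thi\n"
)


def load_measurements(handle, excluded_sources=()):
    """Group titers by (test, (reference, serum)); hypothetical helper."""
    measurements = defaultdict(list)
    for line in handle:
        entries = line.strip().split("\t")
        if len(entries) < 5:
            continue  # skip blank or truncated rows
        test, ref_virus, serum, src_id, titer = entries[:5]
        if src_id in excluded_sources:
            continue
        try:
            if titer.startswith("<"):
                val = float(titer[1:]) / 2   # below threshold: half
            elif titer.startswith(">"):
                val = float(titer[1:]) * 2   # above threshold: twice
            else:
                val = float(titer)
        except ValueError:
            continue
        measurements[(test, (ref_virus, serum))].append(val)
    return dict(measurements)


measurements = load_measurements(io.StringIO(SAMPLE_TSV))
print(measurements[("A/Acores/11/2013", ("A/Victoria/208/2009", "F7/10"))])  # [40.0]
print(measurements[("A/Acores/SU43/2012", ("A/Texas/50/2012", "F36/12"))])   # [2560.0]
```

Passing `excluded_sources=["NIMR_Sep2013_7-11.csv"]` to this sketch leaves only the `A/Acores/SU43/2012` row, which mirrors the exclusion check in the doctest.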
@@ -13,8 +13,8 @@ Test titer substitution model with alignment and tree inputs and a custom prefix
> --attribute-prefix custom_prefix_ \
> --output $TMP/titers-sub.json > /dev/null
Read titers from ../data/titers.tsv, found:
- --- 61 strains
+ --- 62 strains
  --- 15 data sources
- --- 232 total measurements
+ --- 272 total measurements
$ grep custom_prefix_cTiterSub $TMP/titers-sub.json | wc -l
\s*120 (re)
4 changes: 2 additions & 2 deletions tests/functional/titers/cram/titers-sub-with-tree.t
@@ -12,8 +12,8 @@ Test titer substitution model with alignment and tree inputs.
> --gene-names HA1 \
> --output $TMP/titers-sub.json > /dev/null
Read titers from ../data/titers.tsv, found:
- --- 61 strains
+ --- 62 strains
  --- 15 data sources
- --- 232 total measurements
+ --- 272 total measurements
$ grep cTiterSub $TMP/titers-sub.json | wc -l
\s*120 (re)
4 changes: 2 additions & 2 deletions tests/functional/titers/cram/titers-tree-with-custom-prefix.t
@@ -11,8 +11,8 @@ Test titer tree model with a custom prefix for the node data attributes in the o
> --attribute-prefix custom_prefix_ \
> --output $TMP/titers-tree.json > /dev/null
Read titers from ../data/titers.tsv, found:
- --- 61 strains
+ --- 62 strains
  --- 15 data sources
- --- 232 total measurements
+ --- 272 total measurements
$ grep custom_prefix_cTiter $TMP/titers-tree.json | wc -l
\s*120 (re)
4 changes: 2 additions & 2 deletions tests/functional/titers/cram/titers-tree.t
@@ -10,8 +10,8 @@ Test titer tree model.
> --titers ../data/titers.tsv \
> --output $TMP/titers-tree.json > /dev/null
Read titers from ../data/titers.tsv, found:
- --- 61 strains
+ --- 62 strains
  --- 15 data sources
- --- 232 total measurements
+ --- 272 total measurements
$ grep cTiter $TMP/titers-tree.json | wc -l
\s*120 (re)
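
The higher strain and measurement totals in these expected outputs are presumably the result of thresholded rows now being kept instead of skipped. A minimal sketch of that effect follows, using made-up titer strings rather than the contents of `../data/titers.tsv`. The names `parse_old` and `parse_new` are hypothetical, and `parse_old` catches `ValueError` where the earlier code used a bare `except`.

```python
def parse_old(titer):
    """Earlier behaviour (sketch): drop anything float() rejects."""
    try:
        return float(titer)
    except ValueError:
        return None


def parse_new(titer):
    """Behaviour after this change (sketch): keep thresholded values."""
    try:
        if titer.startswith("<"):
            return float(titer[1:]) / 2
        if titer.startswith(">"):
            return float(titer[1:]) * 2
        return float(titer)
    except ValueError:
        return None


# Made-up titer column: two plain values, two thresholded, one invalid.
titers = ["80", "<40", "640", ">5120", "n/a"]

old_count = sum(parse_old(t) is not None for t in titers)
new_count = sum(parse_new(t) is not None for t in titers)
print(old_count, new_count)  # 2 4
```

Only the thresholded strings change status between the two parsers. Plain numbers parse the same way in both, and invalid entries are dropped by both.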