Use io.read_metadata during export
Replaces a call to the older `utils.read_metadata` function with the
newer `io.read_metadata` function while processing metadata for export
to an Auspice JSON. This new function returns a pandas DataFrame indexed
by the first viable strain name column found in the metadata
file (removing this column from the data itself), while the original
function returns a dictionary indexed by strain name (keeping the
original named column like `strain` or `name` in the data). To avoid
changing the downstream code that consumes the metadata, this commit
converts the pandas DataFrame to a dictionary that matches the output of
the original function. The main advantage here is that the calling code
does not need to know what the id column is named, since
`io.read_metadata` handles this and indexes the data frame by that
column.
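
The conversion described above can be sketched with pandas alone. This is a minimal illustration, not the code from the commit: the inline table, the `name` id column, and the helper `metadata_to_dict` are hypothetical, and `pd.read_csv` with `index_col` stands in for `io.read_metadata`, whose exact signature is not shown here.

```python
import io

import pandas as pd

# Hypothetical metadata table whose id column is called "name", not "strain".
TSV = "name\tcountry\tdate\nA/1\tUSA\t2020-01-01\nB/2\tPeru\t2020-02-01\n"


def metadata_to_dict(frame):
    """Convert a DataFrame indexed by strain name into a dict of records.

    Each record gets a "strain" key holding its own index value, unless the
    record already has one, mirroring the loop added in this commit.
    """
    records = frame.to_dict(orient="index")
    for strain, record in records.items():
        if "strain" not in record:
            record["strain"] = strain
    return records


# Stand-in for io.read_metadata (an assumption): index the frame by the id
# column, which removes that column from the data itself.
frame = pd.read_csv(io.StringIO(TSV), sep="\t", index_col="name", dtype=str)
metadata = metadata_to_dict(frame)

print(metadata["A/1"])
```

Note that in this sketch the id column `name` does not reappear in the records; each record carries a `strain` key instead, so downstream code that looks up `strain` keeps working regardless of how the id column was named in the file.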

This commit also adds functional tests for the expected behavior of
export v2 with metadata inputs.

Fixes #905
huddlej committed Apr 28, 2022
1 parent 83dd060 commit 483336d
Showing 1 changed file with 7 additions and 2 deletions.
9 changes: 7 additions & 2 deletions augur/export_v2.py
@@ -9,7 +9,9 @@
 import numbers
 import re
 from Bio import Phylo
-from .utils import read_metadata, read_node_data, write_json, read_config, read_lat_longs, read_colors
+
+from .io import read_metadata
+from .utils import read_node_data, write_json, read_config, read_lat_longs, read_colors
 from .validate import export_v2 as validate_v2, auspice_config_v2 as validate_auspice_config_v2, ValidateError
 
 # Set up warnings & exceptions
@@ -992,7 +994,10 @@ def run_v2(args):
 
     if args.metadata is not None:
         try:
-            metadata_file, _ = read_metadata(args.metadata)
+            metadata_file = read_metadata(args.metadata).to_dict(orient="index")
+            for strain in metadata_file.keys():
+                if "strain" not in metadata_file[strain]:
+                    metadata_file[strain]["strain"] = strain
         except FileNotFoundError:
             print(f"ERROR: meta data file ({args.metadata}) does not exist")
             sys.exit(2)
