Add functional tests with Cram #542

huddlej · 2020-04-19T06:14:02Z

Adds functional tests of augur’s command line interface with Cram. These tests complement existing unit tests of individual augur Python functions by running augur commands on the shell and confirming that these commands:

- execute without any errors
- produce exactly the expected outputs for the given inputs

These tests can reveal bugs resulting from untested internal functions or untested combinations of internal functions.

Functional tests should either:

- suitably test a single augur command with an eponymously named Cram file in tests/functional/ (e.g., mask.t for augur mask), OR
- test a complete build with augur commands with an appropriately named Cram file in tests/builds/ (e.g., zika.t for the example Zika build)

Functional tests of complete builds should supersede existing Snakemake-based tests. The Snakemake files are still useful for producing new expected inputs and outputs for Cram, but Snakemake does not need to be run as part of Travis CI after we have migrated all example builds to Cram.

Running tests

Install augur with dev dependencies (pip install .[dev] from the GitHub repo). Then run all functional tests, both per command and per build, with Cram:

cram tests/functional/*.t tests/builds/*.t

Functional tests of specific commands

Functional tests of specific commands consist of a single Cram file per test and a corresponding directory of expected inputs and outputs to use for comparison of test results.

The Cram file should test most reasonable combinations of command arguments and flags.

Functional tests of example builds

Functional tests of example builds use output from a real Snakemake workflow as expected inputs and outputs. These tests should confirm that all steps of a workflow can execute and produce the expected output. These tests reflect actual augur usage in workflows and are not intended to comprehensively test interfaces for specific augur commands.

The Cram file should replicate the example workflow from start to end. These tests should use the output of the Snakemake workflow (e.g., files in zika/results/ for the Zika build test) as the expected inputs and outputs.

Comparing outputs of augur commands

Compare deterministic outputs of augur commands with a diff between the expected and observed output files. For extremely simple deterministic outputs, use the expected text written to standard output instead of creating a separate expected output file.

To compare trees with stochastic branch lengths:

- provide a fixed random seed to the tree builder executable (e.g., --tree-builder-args "-seed 314159" for the “iqtree” method of augur tree)
- use scripts/diff_trees.py instead of diff and optionally provide a specific number to --significant-digits to limit the precision that should be considered in the diff

To compare JSON outputs with stochastic numerical values, use scripts/diff_jsons.py with the appropriate --significant-digits argument.

Both tree and JSON comparison scripts rely on deepdiff for underlying comparisons.
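
The idea of limiting precision when comparing JSON outputs can be sketched with the standard library alone. This is not the implementation of scripts/diff_jsons.py, which relies on deepdiff; the function names (round_sig, find_differences) and the reading of the digits parameter as significant figures are assumptions made for illustration.

```python
import json
import math


def round_sig(value, digits):
    """Round a float to the given number of significant digits."""
    if value == 0 or not math.isfinite(value):
        return value
    magnitude = math.floor(math.log10(abs(value)))
    return round(value, digits - 1 - magnitude)


def find_differences(expected, observed, digits, path="root"):
    """Return the paths at which two parsed JSON documents disagree.

    Floats are compared after rounding to `digits` significant digits
    (an assumption of this sketch); everything else must match exactly.
    """
    if isinstance(expected, dict) and isinstance(observed, dict):
        differences = []
        for key in sorted(set(expected) | set(observed)):
            child = f"{path}['{key}']"
            if key not in expected or key not in observed:
                differences.append(child)
            else:
                differences.extend(
                    find_differences(expected[key], observed[key], digits, child)
                )
        return differences
    if isinstance(expected, list) and isinstance(observed, list):
        if len(expected) != len(observed):
            return [path]
        differences = []
        for index, (exp_item, obs_item) in enumerate(zip(expected, observed)):
            differences.extend(
                find_differences(exp_item, obs_item, digits, f"{path}[{index}]")
            )
        return differences
    if isinstance(expected, float) and isinstance(observed, float):
        same = round_sig(expected, digits) == round_sig(observed, digits)
        return [] if same else [path]
    return [] if expected == observed else [path]


# Hypothetical example data, not taken from the augur test suite.
expected = json.loads('{"branch_length": 0.0012345, "name": "A", "children": [{"clock_rate": 0.5}]}')
observed = json.loads('{"branch_length": 0.0012349, "name": "A", "children": [{"clock_rate": 0.7}]}')
print(find_differences(expected, observed, digits=3))
```

With three significant digits the two branch lengths are treated as equal and only the clock rate is reported. Comparing rounded values can still flag two numbers that straddle a rounding boundary, so a tolerance-based comparison may suit some outputs better.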

Side effects

Travis CI

To get Cram tests to work as expected with Travis CI, I needed to modify the Travis config to install Python dev dependencies with pip instead of pip3. This change ensures that the dev dependencies are installed in the CI’s conda environment instead of the global Python environment and allows Python commands executed in Cram files to access dev dependencies.

This PR also updates the Travis config to run Cram along with pytest and the Snakemake builds.

Random seed flag for augur refine

In an attempt to produce consistent augur refine outputs from run to run (even locally on my laptop), I added a --seed flag that allows the user to set numpy’s global random seed to a specific value. This flag is only useful for this kind of testing or for debugging. It also doesn’t completely fulfill its purpose since TreeTime still produces overly stochastic outputs with the example Zika data even when the random seed is fixed.

I’ve kept this flag here in case this pattern becomes helpful for other augur commands.
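
The general pattern of an optional seed flag can be sketched as follows. This is a hypothetical, self-contained command and not the code in augur refine: the names build_parser and run are invented for illustration, and Python's standard random module stands in for numpy's global random seed.

```python
import argparse
import random


def build_parser():
    """Build a parser for a hypothetical command with a reproducibility flag."""
    parser = argparse.ArgumentParser(description="Hypothetical stochastic command")
    parser.add_argument(
        "--seed",
        type=int,
        help="seed for the random number generator (useful for testing and debugging)",
    )
    return parser


def run(argv):
    """Parse arguments, seed the generator if requested, and draw three numbers."""
    args = build_parser().parse_args(argv)
    if args.seed is not None:
        # Stand-in for setting numpy's global seed; an assumption of this sketch.
        random.seed(args.seed)
    return [random.random() for _ in range(3)]


# The same seed gives the same draws from run to run.
print(run(["--seed", "314159"]) == run(["--seed", "314159"]))
```

Leaving the flag unset keeps the default, unseeded behavior, so normal usage is unaffected. A fixed seed only makes outputs reproducible when every source of randomness in the command draws from the seeded generator.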

codecov · 2020-04-19T06:14:09Z

Codecov Report

Merging #542 into master will decrease coverage by 0.05%.
The diff coverage is 40.00%.

@@            Coverage Diff             @@
##           master     #542      +/-   ##
==========================================
- Coverage   19.22%   19.16%   -0.06%     
==========================================
  Files          31       31              
  Lines        5072     5072              
  Branches     1288     1289       +1     
==========================================
- Hits          975      972       -3     
- Misses       4074     4077       +3     
  Partials       23       23

| Impacted Files | Coverage Δ | |
|---|---|---|
| augur/refine.py | `5.52% <25.00%> (+0.49%)` | ⬆️ |
| augur/mask.py | `100.00% <100.00%> (ø)` | |
| augur/titer_model.py | `18.61% <0.00%> (-0.30%)` | ⬇️ |
| augur/utils.py | `22.89% <0.00%> (-0.18%)` | ⬇️ |
| augur/frequency_estimators.py | `33.71% <0.00%> (-0.13%)` | ⬇️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 51fefbf...0878320.

conda does not provide a pip3 command, but the base Travis python does. This means pip3 install places the augur development dependencies in the global python environment instead of the conda environment. As a result, cram tests fail when they rely on python packages that are only installed as dev dependencies because cram's "python" is the conda environment's "python". This commit should install development dependencies in the conda environment instead of the global environment and make those packages available to the python command inside cram tests.

Introduces two different, complementary approaches to functional testing with Cram. The first approach basically copies the commands already executed by the Snakefiles in the tests/builds directory into the Cram format. The zika build, for example, is partially represented by zika.t in that builds directory. The second approach tries to more comprehensively test a specific augur command with a variety of reasonable inputs. The mask.t file represents an example of that type of test for augur mask.

augur mask now supports multiple masking inputs and no longer requires the `--mask` argument. This commit updates the augur mask tests to reflect these new command line arguments and also modifies informational output from the `read_bed_file` function to clarify that the reported number of sites to mask only reflects the BED file contents and not the other mask arguments.

Run functional tests when running the generic "run_tests" script and when full unit tests are also being run. We skip functional tests when the user requests a subset of unit tests to facilitate rapid testing during test-driven development. Cram tests use pushd and popd which are bash-specific and not available in Travis's default shell, so we specify the shell for Cram tests to use.

Also places one sentence per line in the testing section and clarifies how to run a subset of unit tests.

huddlej · 2020-05-02T18:17:56Z

@tsibley @jameshadfield This approach with Cram seems to be working well, so I am merging this PR now. This will allow us to start integrating other work from the community, including new functional tests. I'm happy to revise this approach in future PRs if you have suggestions.

huddlej requested review from tsibley and jameshadfield April 19, 2020 06:14

huddlej changed the title ~~Test with cram~~ Add functional tests with Cram Apr 19, 2020

vanguard737 mentioned this pull request Apr 20, 2020

[RFC] Functional tests with Cram #543

Open

huddlej added 3 commits May 2, 2020 10:39

huddlej force-pushed the test-with-cram branch 2 times, most recently from 496d320 to 0878320 Compare May 2, 2020 17:55

huddlej added 2 commits May 2, 2020 10:55

Describe how to write and run functional tests with Cram

0878320

huddlej merged commit 704f369 into master May 2, 2020

huddlej deleted the test-with-cram branch May 2, 2020 18:18

vanguard737 mentioned this pull request May 16, 2020

Execute Cram via pytest #549

Open
