Initial testing framework by islas · Pull Request #2095 · wrf-model/WRF

islas · 2024-08-09T21:20:16Z

TYPE: enhancement

KEYWORDS: testing, regression, test framework

SOURCE: internal

DESCRIPTION OF CHANGES:
Problem:
The current regression suite code is complex, requires maintaining multiple separate repositories, and makes adding a new test laborious, which limits community contribution. That complexity also reduces the likelihood of independent local testing of changes, leading to a development cycle of one-off commits pushed only to re-trigger testing and see whether the meaningful commits fixed the issues.

Solution:
This new proposed regression suite addresses these shortcomings in a number of discrete ways:

1. Modularize the testing framework into a generalized, independent repo usable by any repo seeking to set up tests that can run locally, on HPC systems, and within any CI/CD framework
2. Write WRF-specific test scripts inside the WRF repo, in a manner that does not rely on specific layouts, hardware, etc., so long as WRF can compile and run on the intended system (i.e. they can be run locally)
3. Write CI/CD tests in a simple and generally CI/CD framework-agnostic method where definitions of these also reside within the WRF repo
4. Utilize HPC resources in a safe manner to increase the breadth of testing, covering many more compilers and hardware similar to the general use case of WRF

As a first pass at demonstrating this solution, this PR implements a simple set of compilation tests using GNU x86 configurations testing serial, sm, dm, and sm+dm selections. The CI/CD portion is done via GitHub workflow actions on a specific trigger event. The values and trigger methods are configurable, but this initial implementation will use the labeled trigger, which will initiate tests when compile-tests or all-tests is added as a label to a pull request.

TESTS CONDUCTED:

This GitHub workflow was tested in a separate fork, with tests also run on Derecho. Both positive and negative tests were used to demonstrate that the output is useful in each case.

RELEASE NOTE:
Introduce a modularized testing framework, living within the WRF repository, that allows testing locally and natively on HPC systems.

To allow test scripts to run outside of a testing framework, environment setup should not depend solely on running within a dedicated test framework. This has the added benefit of separating environment and dependency resolution from running the tests. These environment scripts allow selection of a particular environment, with the default being the FQDN of the current host. From there, arguments are routed using standard POSIX sh to the respective script. In the case of Derecho (and applicable to any system using Lmod), all subsequent arguments are treated as modules to load into the current session. The hostenv.sh script relies on one "argument", $AS_HOST, passed in as a variable setting to facilitate selection. The helpers.sh script provides convenience features for substring checking in sh, delayed environment variable expansion via eval, and quick banner creation. The derecho.sh script is included as the first supported environment.
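The host selection described above can be sketched roughly as follows. This is a minimal illustration, not the contents of the PR's hostenv.sh: the function names `select_host` and `env_script_for` are hypothetical, and matching on the substring `derecho` in the host name is an assumption about how a host maps to its script.

```bash
# Hypothetical sketch of host-environment selection; names are illustrative only.

# Choose the environment name: $AS_HOST when set, otherwise the host's FQDN.
select_host() {
  if [ -n "${AS_HOST:-}" ]; then
    printf '%s\n' "$AS_HOST"
  else
    # Fall back to the short host name if -f is unsupported.
    hostname -f 2>/dev/null || hostname
  fi
}

# Map a host name to an environment script by substring match (assumed rule).
# Prints an empty line when no configuration is known, meaning the
# current environment would be used as-is.
env_script_for() {
  case "$1" in
    *derecho*) printf '%s\n' "derecho.sh" ;;
    *)         printf '%s\n' "" ;;
  esac
}
```

Under these assumptions, `env_script_for "$(select_host)"` yields the script to route the remaining arguments to, and setting `AS_HOST=derecho` beforehand forces the Derecho environment regardless of the machine.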

This script will facilitate the first tests. The planned testing framework places only three requirements on any given test script. If a different testing framework is used in the future, these requirements can and should be re-evaluated. The test script should:
1. Take the intended host / configuration environment as the first argument
2. Take the working directory to immediately change to as the second argument
3. Output some key phrase at the end of the test to denote success; anything else (a non-zero exit code, or a zero exit code without the phrase) is a failure

This particular compilation test script satisfies the above while also providing enough flexibility to select the compile target, stanza configuration, number of parallel jobs, and other command-line options passed to the make build. Additionally, for convenience, environment variables can be passed in as command-line options to the test script to modularize certain inputs.
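As an illustration of the three requirements, a skeleton test and a matching pass/fail check might look like the sketch below. The names `run_test`, `check_result`, and `PASS_PHRASE`, as well as the phrase text itself, are assumptions made for this example and are not taken from the PR's compilation script.

```bash
# Hypothetical key phrase; the real script's phrase is not shown in this PR text.
PASS_PHRASE="TEST PASS"

# Skeleton test: host environment first, working directory second,
# any remaining arguments are run as the command under test.
run_test() {
  host_env=$1                       # requirement 1
  work_dir=$2                       # requirement 2
  shift 2
  cd "$work_dir" || return 1
  echo "Running in '$work_dir' for host environment '$host_env'"
  if [ "$#" -gt 0 ]; then
    "$@" || return 1
  fi
  echo "$PASS_PHRASE"               # requirement 3
}

# A run passes only if it exits zero AND its last output line is the key phrase.
check_result() {
  output=$("$@") || return 1
  last_line=$(printf '%s\n' "$output" | tail -n 1)
  [ "$last_line" = "$PASS_PHRASE" ]
}
```

For example, `check_result run_test derecho . true` succeeds, while a command that exits non-zero, or one that exits zero without printing the phrase, is reported as a failure.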

Following the documentation of the hpc-workflows testing framework and the testing structure found in .ci/, a JSON file for a GNU compilation test was added. This test compiles the em_real core using the GNU Linux x86 stanza configuration. All other options are left at their defaults. If this test is run using the derecho configuration, the script will attempt to load the appropriate modules. For non-derecho environments, per the testing structure under .ci/, if no configuration exists in .ci/hostenv.sh then the current environment will be used verbatim.

This reusable workflow balances quick setup with GitHub Actions-specific features. It assumes that the tests can be controlled via a label being set on a PR. To coordinate PR vs primary branch testing, a suffix is generated using either the PR number or the branch name. This suffix is then used to relocate log files to an archival location in an organized fashion. GitHub artifacts are still used for failed test capture, but logs are also moved to the archive location for quicker access by anyone with access to where these tests execute. To allow for the parallelized testing available from hpc-workflows, the workflow can make duplicate directories of the repository that can each run their own test instance without clobbering files. Once tests are run, results are gathered, relocated to the archival location, reported and printed to the screen, summarized on the Actions summary page, and then packaged into an artifact if a failure occurred. Finally, the test label is removed if the named tests and label match.
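The suffix logic can be sketched as below. This is an assumption-laden illustration rather than the workflow's actual step: the function names, the `PR` prefix, the replacement of `/` in branch names, and the archive layout are all hypothetical.

```bash
# Build a suffix from the PR number when one is given, else from the branch name.
# (Hypothetical helper; the prefix and sanitizing rule are assumptions.)
make_suffix() {
  pr_number=$1
  branch=$2
  if [ -n "$pr_number" ]; then
    printf 'PR%s\n' "$pr_number"
  else
    # Replace path separators so the suffix is usable as one directory name.
    printf '%s\n' "$branch" | tr '/' '_'
  fi
}

# Compose the archival directory for a run's logs (layout is an assumption).
archive_dir() {
  archive_root=$1
  printf '%s/%s\n' "$archive_root" "$(make_suffix "$2" "$3")"
}
```

Keeping PR runs and branch runs under distinct suffixes means logs from a PR and from a push to a primary branch do not overwrite each other in the archive.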

This pipeline is triggered if any pushes occur on master or develop, or if a PR is labeled with an appropriate tag as specified by the tests within this workflow. Additionally, a specific label to trigger all tests can be used; it will be removed from the PR when all tests finish, regardless of exit status. The pipeline makes extensive use of the reusable test_workflow.yml to instantiate tests on runners. This pipeline currently only includes the definition for one test, to be run on a GitHub runner with tags that satisfy "derecho". Likewise, other hard-coded values appearing here assume a particular runner setup and environment.
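The trigger rules can be expressed as two small predicates. This is only a sketch of the decision logic in shell; the actual pipeline expresses it in GitHub workflow syntax, and the function names here are hypothetical. The label names `compile-tests` and `all-tests` come from the PR description.

```bash
# Label that triggers every test (named in the PR description).
ALL_TESTS_LABEL="all-tests"

# Succeeds when a label applied to a PR should start the given test:
# either the test's own label or the all-tests label.
label_triggers_test() {
  test_label=$1
  applied_label=$2
  [ "$applied_label" = "$test_label" ] || [ "$applied_label" = "$ALL_TESTS_LABEL" ]
}

# Succeeds when a push to the given branch should start the pipeline.
push_triggers_pipeline() {
  case "$1" in
    master|develop) return 0 ;;
    *)              return 1 ;;
  esac
}
```

For instance, `label_triggers_test compile-tests all-tests` succeeds, while an unrelated label such as `documentation` does not start the compile test.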

islas · 2024-08-09T21:22:17Z

I'm following the approach we use in MPAS: set up testing with a minimal configuration (simple compilation tests) at first to get something started.

The idea would then be to gradually translate the current tests into a format usable by this framework.

weiwangncar · 2024-08-10T02:56:10Z

The regression test results:

Test Type              | Expected  | Received |  Failed
= = = = = = = = = = = = = = = = = = = = = = = = = = = = =
Number of Tests        : 23          24
Number of Builds       : 60          57
Number of Simulations  : 158         150         0
Number of Comparisons  : 95          86          0

Failed Simulations are: 
None
Which comparisons are not bit-for-bit: 
None

This PR introduces a set of tests that allows replication of the [WRF Coop Tests](https://github.com/kkeene44/wrf-coop/blob/update-v16/build.csh) which are normally run as regression tests for PRs.

TYPE: enhancement

KEYWORDS: testing, cicd, continuous integration

SOURCE: internal

DESCRIPTION OF CHANGES:
Problem:
The current regression tests found in the WRF Coop repository suffer from a few key design points:
1. located in a separate repository allowing code divergence and extra maintenance burden
2. confusing layout due to multiple repositories and data file locations
3. test logic obfuscation because the code actually executed is auto-generated
4. limited execution tightly coupled to a containerized environment

PR #2095 tried to remedy this using `hpc-workflows`, however the framework likewise suffered from issues:
1. manual unconventional environment management
2. duplication of effort between tests and lack of support for dependencies between common actions (e.g. re-using builds across multiple tests)
3. limited support for extensibility outside of argument manipulation

Solution:
This PR does not aim to entirely replace PR #2095 (notably the CI/CD GitHub workflow) and instead leverages this point in PR #2095:

> 3. Write CI/CD tests in a simple and generally CI/CD framework-agnostic method where definitions of these also reside _within the WRF repo_

These tests follow this same mantra of _"CI/CD framework-agnostic"_ such that they can more or less be a drop-in replacement only for the `hpc-workflows`-based tests.

The tests will cover the WRF Coop Test Cases (provided is a default configuration for Derecho):

| Tests | |
| ------------- | ------------- |
| em_real | em_realG |
| em_realA | em_realH |
| em_realB | em_realI |
| em_realC | em_realJ |
| em_realD | em_realK |
| em_realE | em_realL |
| em_realF | various build tests |

The tests are now written in the [SANE Workflows](https://github.com/islas/sane_workflows) framework, which solves most of the issues faced by the other two setups. Data is still spread across multiple locations, but that is separate from the testing code.

The structure of the tests is as follows:

```
.sane/                          #< The root directory in WRF where the testing code is kept
└── wrf                         #< A subfolder to make all python-imports look like `import wrf`
    ├── custom_actions
    │   └── run_wrf.py          #< A module that has our custom reusable classes to setup initial conditions and model runs
    ├── hosts
    │   ├── derecho_envs.jsonc  #< The environments that derecho.jsonc has
    │   └── derecho.jsonc       #< Definition of derecho HPC system for this framework
    ├── scripts                 #< A subfolder to house all our shell helper scripts that do the bulk of the work
    │   ├── buildCMake.sh
    │   ├── buildMake.sh
    │   ├── compare_wrf.sh      #< Use diffwrf to compare two runs
    │   ├── run_init.sh         #< Configurable to run initial conditions (em_real.exe or ideal.exe)
    │   ├── run_wrf_restart.sh  #< Runs wrf.exe again in previous run folder and compares history
    │   └── run_wrf.sh          #< Runs wrf.exe
    └── tests                   #< Where our tests live
        ├── builds
        │   └── builds.py       #< Python module that sets up ALL our compilation tests (make + cmake)
        └── regtests
            └── wrf_coop.py     #< Python module that sets up the WRF Coop em_real* tests
```

Documentation for this new framework can be found at: https://sane-workflows.readthedocs.io/en/latest/

One could run these tests on Derecho using the following commands (inside a WRF repo clone):

```bash
python3 -m venv .venv/wrf_testing
source .venv/wrf_testing/bin/activate
python3 -m pip install --pre sane-workflows
# Runs the em_real test case
sane_runner --path .sane/ --actions em_real --run
```

islas added 7 commits August 8, 2024 14:41

Add a framework to easily facilitate testing

5340db7

Add .log files to .gitignore for testing output

8901c50

islas requested a review from a team as a code owner August 9, 2024 21:20

islas added the compile-tests label Sep 12, 2024

Adding a note about permissions for PR label modification

9ff0800

islas added compile-tests and removed compile-tests labels Sep 12, 2024

Adjust triggers based on buried documentation to get proper permissions

73e7067

islas added compile-tests and removed compile-tests labels Sep 13, 2024

Removing the 'remove label' feature until a secure workflow can be established

87190fe

islas added compile-tests and removed compile-tests labels Sep 13, 2024

Use updated internal check on filenames for more tolerant regex

2b12cf2

islas added compile-tests and removed compile-tests labels Sep 13, 2024

mgduda reviewed Sep 16, 2024

.ci/env/derecho.sh (outdated, resolved)

Adjust comments to a neutral tone

b14a536

mgduda self-requested a review September 16, 2024 23:21

mgduda reviewed Sep 16, 2024

.github/workflows/ci.yml (resolved)

mgduda added compile-tests and removed compile-tests labels Sep 16, 2024

mgduda self-requested a review September 17, 2024 00:32

mgduda approved these changes Sep 17, 2024

kkeene44 approved these changes Sep 19, 2024

islas added compile-tests and removed compile-tests labels Sep 19, 2024

islas merged commit 958ce12 into wrf-model:release-v4.6.1 Sep 19, 2024

islas mentioned this pull request Dec 17, 2025

WRF Coop em_real Tests Using SANE Workflows #2264

Merged

islas merged 12 commits into wrf-model:release-v4.6.1 from islas:initial-testing-framework

4 participants
