Initial testing framework #2095
Merged
islas merged 12 commits into wrf-model:release-v4.6.1 from Sep 19, 2024
To run test scripts outside of a testing framework, environment setup should not depend solely on running within a dedicated test framework. This has the added benefit of separating environment and dependency resolution from running the tests. These environment scripts allow selection of a particular environment, with the default being the FQDN of the current host. From there, arguments are routed using standard POSIX sh to the respective script. In the case of Derecho (and applicable to any system using Lmod), all subsequent arguments are treated as modules to load into the current session. The hostenv.sh script relies on one "argument", $AS_HOST, being passed in as a variable setting to facilitate selection. The helpers.sh script provides convenience features for substring checking in sh, delayed environment variable expansion via eval, and quick banner creation. The derecho.sh script is included as the first supported environment.
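The selection logic described above can be sketched roughly as follows. This is an illustrative sketch, not the actual hostenv.sh: the function names `resolve_host` and `select_env_script` and the `*derecho*` match pattern are assumptions made for the example; only `$AS_HOST`, the FQDN default, and `derecho.sh` come from the commit message.

```bash
#!/bin/sh
# Illustrative sketch only; not the actual hostenv.sh from this PR.

# Print the host name used for selection: $AS_HOST if set, else this machine's FQDN.
resolve_host() {
  if [ -n "${AS_HOST:-}" ]; then
    printf '%s\n' "$AS_HOST"
  else
    hostname -f 2>/dev/null || uname -n
  fi
}

# Map a host name to an environment script (the match pattern is an assumption).
# An empty result means no configuration exists for that host.
select_env_script() {
  case "$1" in
    *derecho*) printf '%s\n' "derecho.sh" ;;
    *)         printf '%s\n' "" ;;
  esac
}

echo "environment script: '$(select_env_script "$(resolve_host)")'"
```

Setting `AS_HOST=derecho` before calling `resolve_host` would force the Derecho branch on any machine, which is presumably the point of routing selection through a variable rather than through host detection alone.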
This script will facilitate the first tests. The planned testing framework places only three requirements on any given test script. If a different testing framework is used in the future, these requirements can and should be re-evaluated. The test script should:
1. Take the intended host / configuration environment as the first argument
2. Take the working directory to immediately change to as the second argument
3. Output some key phrase at the end of the test to denote success; anything else (non-zero exit code, or a zero exit code without the phrase) is a failure

This particular compilation test script satisfies the above while also providing enough flexibility to select the compile target, stanza configuration, number of parallel jobs, and other command-line options passed into the make build. Additionally, for convenience, environment variables can be passed in as command-line options to the test script to modularize certain inputs.
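The three requirements above can be sketched as a toy test plus a pass/fail check. Everything named here is hypothetical: the phrase `TEST PASSED`, the functions `example_test` and `run_and_judge`, and the choice to compare only the last output line are assumptions for illustration, since the commit message does not state the actual key phrase or how the framework matches it.

```bash
#!/bin/sh
# Illustrative sketch only; the key phrase and function names are hypothetical.
SUCCESS_PHRASE="TEST PASSED"

# A toy test honoring the contract: $1 = host/environment, $2 = working directory.
# The body runs in a subshell so the directory change does not leak to the caller.
example_test() (
  cd "$2" || exit 1
  echo "running for host '$1' in $(pwd)"
  echo "$SUCCESS_PHRASE"
)

# Run a command and print "pass" only if it exits 0 AND its last output line
# is the key phrase; anything else prints "fail".
run_and_judge() {
  output=$("$@" 2>&1) && status=0 || status=$?
  last_line=$(printf '%s\n' "$output" | tail -n 1)
  if [ "$status" -eq 0 ] && [ "$last_line" = "$SUCCESS_PHRASE" ]; then
    echo "pass"
  else
    echo "fail"
  fi
}

run_and_judge example_test derecho .
```

Requiring both conditions catches the two failure modes the commit message calls out: a script that prints the phrase but exits non-zero, and a script that exits zero without ever reaching the phrase.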
Following the documentation of the hpc-workflows testing framework and the testing structure found in .ci/, a JSON file for a GNU compilation test was added. This test will compile the em_real core using the GNU Linux x86 stanza configuration. All other options are left as default. If this test is run using the derecho configuration, the script will attempt to load the appropriate modules. For non-derecho environments, per the testing structure under .ci/, if no configuration exists in .ci/hostenv.sh then the current environment will be used verbatim.
This reusable workflow balances quick setup with GitHub Actions-specific features. It assumes that the tests can be controlled via a label being set in a PR. To coordinate PR vs primary branch testing, a suffix is generated using either the PR number or the branch name. This suffix is then used to relocate log files to an archival location in an organized fashion. GitHub artifacts are still used for failed test capture, but logs are also moved to the archive location for quicker access by anyone who can reach the system where these tests execute. To allow for the parallelized testing available from hpc-workflows, the workflow can make duplicate directories of the repository that can each run their own test instance without clobbering files. Once tests are run, results are gathered, relocated to the archival location, reported and printed to the screen, summarized into the actions summary page, and then packaged into an artifact if a failure occurred. Finally, the test label is removed if the named tests and label match.
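The suffix generation described above might look something like the following. This is a sketch under assumptions: the function name `make_suffix`, the `PR<number>` format, and the replacement of `/` in branch names are all hypothetical and are not taken from the actual workflow file.

```bash
#!/bin/sh
# Illustrative sketch only; the naming scheme is hypothetical.

# make_suffix PR_NUMBER BRANCH
# Prefer the PR number when one is given; otherwise fall back to the branch name.
make_suffix() {
  pr_number="$1"
  branch="$2"
  if [ -n "$pr_number" ]; then
    printf 'PR%s\n' "$pr_number"
  else
    # Replace '/' so the branch name is usable as a single directory component.
    printf '%s\n' "$branch" | tr '/' '_'
  fi
}

make_suffix 2095 develop
make_suffix "" feature/new-tests
```

Keying the archive location on a suffix like this keeps logs from concurrent PR runs and primary-branch runs in separate directories.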
This pipeline is triggered if any pushes occur on master or develop, or if a PR is labeled with an appropriate tag as specified by the tests within this workflow. Additionally, a specific label can be used to trigger all tests; it is removed from the PR when all tests finish, regardless of exit status. The pipeline makes extensive use of the reusable test_workflow.yml to instantiate tests on runners. This pipeline currently only includes the definition for one test, to be run on a GitHub runner with tags that satisfy "derecho". Likewise, other hard-coded values appearing here assume a particular runner setup and environment.
I'm using the approach we're using in MPAS to set up testing, starting with a minimal setup (simple compilation tests) to get something going. The idea would be to then gradually translate the current tests into a format usable by this framework.
The regression test results:
mgduda reviewed Sep 16, 2024
mgduda reviewed Sep 16, 2024
mgduda approved these changes Sep 17, 2024
kkeene44 approved these changes Sep 19, 2024
islas added a commit that referenced this pull request Feb 19, 2026
This PR introduces a set of tests that allows replication of the [WRF Coop Tests](https://github.com/kkeene44/wrf-coop/blob/update-v16/build.csh) which are normally run as regression tests for PRs.

TYPE: enhancement
KEYWORDS: testing, cicd, continuous integration
SOURCE: internal
DESCRIPTION OF CHANGES:
Problem:
The current regression tests found in the WRF Coop repository suffer from a few key design points:
1. located in a separate repository, allowing code divergence and extra maintenance burden
2. confusing layout due to multiple repositories and data file locations
3. test logic obfuscated because the code actually executed is auto-generated
4. limited execution tightly coupled to a containerized environment

PR #2095 tried to remedy this using `hpc-workflows`; however, that framework likewise suffered from issues:
1. manual, unconventional environment management
2. duplication of effort between tests and lack of support for dependencies between common actions (e.g. re-using builds across multiple tests)
3. limited support for extensibility outside of argument manipulation

Solution:
This PR does not aim to entirely replace PR #2095 (notably the CI/CD GitHub workflow) and instead leverages this point in PR #2095:
> 3. Write CI/CD tests in a simple and generally CI/CD framework-agnostic method where definitions of these also reside _within the WRF repo_

These tests follow this same mantra of _"CI/CD framework-agnostic"_ such that they can more or less be a drop-in replacement for only the `hpc-workflows`-based tests.

The tests will cover the WRF Coop Test Cases (provided is a default configuration for Derecho):

| Tests | |
| ------------- | ------------- |
| em_real | em_realG |
| em_realA | em_realH |
| em_realB | em_realI |
| em_realC | em_realJ |
| em_realD | em_realK |
| em_realE | em_realL |
| em_realF | various build tests |

The tests are now written in the [SANE Workflows](https://github.com/islas/sane_workflows) framework, which solves most of the issues faced by the other two setups. Data is still spread across multiple locations, but that is separate from the testing code.

The structure of the tests is as follows:
```
.sane/                          #< The root directory in WRF where the testing code is kept
└── wrf                         #< A subfolder to make all python-imports look like `import wrf`
    ├── custom_actions
    │   └── run_wrf.py          #< A module that has our custom reusable classes to set up initial conditions and model runs
    ├── hosts
    │   ├── derecho_envs.jsonc  #< The environments that derecho.jsonc has
    │   └── derecho.jsonc       #< Definition of derecho HPC system for this framework
    ├── scripts                 #< A subfolder to house all our shell helper scripts that do the bulk of the work
    │   ├── buildCMake.sh
    │   ├── buildMake.sh
    │   ├── compare_wrf.sh      #< Use diffwrf to compare two runs
    │   ├── run_init.sh         #< Configurable to run initial conditions (em_real.exe or ideal.exe)
    │   ├── run_wrf_restart.sh  #< Runs wrf.exe again in previous run folder and compares history
    │   └── run_wrf.sh          #< Runs wrf.exe
    └── tests                   #< Where our tests live
        ├── builds
        │   └── builds.py       #< Python module that sets up ALL our compilation tests (make + cmake)
        └── regtests
            └── wrf_coop.py     #< Python module that sets up the WRF Coop em_real* tests
```

Documentation for this new framework can be found at: https://sane-workflows.readthedocs.io/en/latest/

One could run these tests on Derecho using the following commands (inside a WRF repo clone):
```bash
python3 -m venv .venv/wrf_testing
source .venv/wrf_testing/bin/activate
python3 -m pip install --pre sane-workflows
# Runs the em_real test case
sane_runner --path .sane/ --actions em_real --run
```
TYPE: enhancement
KEYWORDS: testing, regression, test framework
SOURCE: internal
DESCRIPTION OF CHANGES:
Problem:
The current regression suite code is complex, requires maintenance of multiple alternate repositories, and demands considerable effort to add a new test, which limits community contribution. Likewise, the complexity of the system reduces the likelihood of independent local testing of changes, leading to a development cycle of one-off commits made only to re-invoke testing and see whether earlier commits fixed the issues.
Solution:
This new proposed regression suite addresses these shortcomings in a number of discrete ways:
As a first pass at demonstrating this solution, this PR implements a simple set of compilation tests using GNU x86 configurations testing serial, sm, dm, and sm+dm selections. The CI/CD portion is done via GitHub workflow actions on a specific trigger event. The values and trigger methods are configurable, but this initial implementation will use the `labeled` trigger, which will initiate tests when `compile-tests` or `all-tests` is added as a label to a pull request.
TESTS CONDUCTED:
RELEASE NOTE:
Introduce a modularized testing framework, housed within the WRF repository, that allows testing locally and natively on HPC systems.