feat: unit test metric tracking #40

Merged
terrykong merged 5 commits into main from unit-test-tracking
Mar 26, 2025

@terrykong (Collaborator) commented Mar 24, 2025

What does this PR do?

Tracking metrics in unit tests

Unit tests may also log metrics to a fixture. The fixture is called tracker and has the following API:

# Track an arbitrary metric (must be json serializable)
tracker.track(metric_name, metric_value)
# Log the maximum memory across the entire cluster. Okay for tests since they are run serially.
tracker.log_max_mem(metric_name)
# Returns the maximum memory. Useful if you are measuring changes in memory.
tracker.get_max_mem()

Including the tracker fixture also tracks the elapsed time for the test implicitly.
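
To make the API above concrete, here is a minimal sketch of how such a tracker could behave. It is not the implementation from this PR: the `Tracker` class, the `tracking` context manager, and the injectable `mem_fn` callable are all hypothetical names, and the real fixture measures memory across the cluster rather than through a callable. The context manager stands in for a yield-style pytest fixture, which is one way the elapsed time could be recorded implicitly.

```python
import contextlib
import json
import time


class Tracker:
    """Hypothetical stand-in for the object the `tracker` fixture provides."""

    def __init__(self, mem_fn):
        # mem_fn is an assumed callable that returns the current maximum memory.
        self._mem_fn = mem_fn
        self.metrics = {}

    def track(self, metric_name, metric_value):
        # Raise early (TypeError) if the value cannot be written to a JSON file.
        json.dumps(metric_value)
        self.metrics[metric_name] = metric_value

    def get_max_mem(self):
        return self._mem_fn()

    def log_max_mem(self, metric_name):
        self.track(metric_name, self.get_max_mem())


@contextlib.contextmanager
def tracking(test_id, results, mem_fn=lambda: 0):
    """Mimic a yield-style fixture: set up, hand over the tracker, record `_elapsed`."""
    tracker = Tracker(mem_fn)
    start = time.time()
    try:
        yield tracker
    finally:
        tracker.metrics["_elapsed"] = time.time() - start
        results[test_id] = tracker.metrics


# Usage: the test id and the constant memory reading are made up for illustration.
results = {}
with tracking("test_example::test_exponentiate", results, mem_fn=lambda: 1024) as tracker:
    tracker.track("result", 2 ** 4)
    tracker.log_max_mem("memory_after_exponentiating")
```

Validating with `json.dumps` inside `track` is a design choice in this sketch: it surfaces a non-serializable metric at the call site instead of when the results file is written.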

Here is an example test:

def test_exponentiate(tracker):
    starting_mem = tracker.get_max_mem()
    base = 2
    exponent = 4
    result = base ** exponent
    tracker.track("result", result)
    tracker.log_max_mem("memory_after_exponentiating")
    change_in_mem = tracker.get_max_mem() - starting_mem
    tracker.track("change_in_mem", change_in_mem)
    assert result == 16

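Running the unit tests writes the tracked metrics to tests/unit/unit_results.json. The sample below shows the file format; its metrics come from test_hf_ray_policy::test_hf_policy_generation rather than from the example test above: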

{
  "exit_status": 0,
  "git_commit": "f1062bd3fd95fc64443e2d9ee4a35fc654ba897e",
  "start_time": "2025-03-24 23:34:12",
  "metrics": {
    "test_hf_ray_policy::test_hf_policy_generation": {
      "avg_prob_mult_error": 1.0000039339065552,
      "mean_lps": -1.5399343967437744,
      "_elapsed": 17.323044061660767
    }
  },
  "gpu_types": [
    "NVIDIA H100 80GB HBM3"
  ],
  "coverage": 24.55897613282601
}
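
As an illustration of how this file could be consumed, the following sketch parses the sample above and checks one metric against an expected value. The helper name `metric_within` and the tolerance of `1e-3` are assumptions made for this example, not part of the PR.

```python
import json

# The sample results file shown above, inlined so the sketch is self-contained.
SAMPLE = """
{
  "exit_status": 0,
  "git_commit": "f1062bd3fd95fc64443e2d9ee4a35fc654ba897e",
  "start_time": "2025-03-24 23:34:12",
  "metrics": {
    "test_hf_ray_policy::test_hf_policy_generation": {
      "avg_prob_mult_error": 1.0000039339065552,
      "mean_lps": -1.5399343967437744,
      "_elapsed": 17.323044061660767
    }
  },
  "gpu_types": ["NVIDIA H100 80GB HBM3"],
  "coverage": 24.55897613282601
}
"""


def metric_within(results, test_id, metric, expected, tol):
    """Return True if the recorded metric is within `tol` of `expected` (hypothetical helper)."""
    value = results["metrics"][test_id][metric]
    return abs(value - expected) <= tol


results = json.loads(SAMPLE)
test_id = "test_hf_ray_policy::test_hf_policy_generation"
ok = metric_within(results, test_id, "avg_prob_mult_error", expected=1.0, tol=1e-3)
```

In practice the file would be read with `json.load` from tests/unit/unit_results.json instead of an inline string.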

:::{tip}
Past unit test results are logged in tests/unit/unit_results/. They are useful for viewing trends over time and across commits.

Here's an example jq command to view trends:

jq -r '[.start_time, .git_commit, .metrics["test_hf_ray_policy::test_hf_policy_generation"].avg_prob_mult_error] | @tsv' tests/unit/unit_results/*

# Example output:
#2025-03-24 23:35:39     778d288bb5d2edfd3eec4d07bb7dffffad5ef21b        1.0000039339065552
#2025-03-24 23:36:37     778d288bb5d2edfd3eec4d07bb7dffffad5ef21b        1.0000039339065552
#2025-03-24 23:37:37     778d288bb5d2edfd3eec4d07bb7dffffad5ef21b        1.0000039339065552
#2025-03-24 23:38:14     778d288bb5d2edfd3eec4d07bb7dffffad5ef21b        1.0000039339065552
#2025-03-24 23:38:50     778d288bb5d2edfd3eec4d07bb7dffffad5ef21b        1.0000039339065552

:::
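
For cases where jq is not available, the same trend can be collected in Python. This is a sketch under the assumption that every file in tests/unit/unit_results/ is a JSON document with the layout shown earlier; the function name `metric_trend` is hypothetical.

```python
import json
from pathlib import Path


def metric_trend(results_dir, test_id, metric):
    """Collect (start_time, git_commit, value) rows for one metric across past runs."""
    rows = []
    for path in Path(results_dir).iterdir():
        if not path.is_file():
            continue
        with path.open() as f:
            run = json.load(f)
        # A run that did not record this test or metric yields None for the value.
        value = run.get("metrics", {}).get(test_id, {}).get(metric)
        rows.append((run.get("start_time"), run.get("git_commit"), value))
    # "YYYY-MM-DD HH:MM:SS" timestamps sort chronologically as plain strings.
    rows.sort(key=lambda row: row[0] or "")
    return rows


def format_tsv(rows):
    """Render rows as tab-separated lines, similar to jq's @tsv output."""
    return "\n".join("\t".join(str(col) for col in row) for row in rows)
```

Calling `print(format_tsv(metric_trend("tests/unit/unit_results", "test_hf_ray_policy::test_hf_policy_generation", "avg_prob_mult_error")))` would print one line per recorded run.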

@github-actions bot added the documentation (Improvements or additions to documentation) and CI (Relating to CI) labels Mar 24, 2025
@terrykong force-pushed the unit-test-tracking branch from d9efb6c to f99d7d6 March 24, 2025 22:58
Signed-off-by: Terry Kong <terryk@nvidia.com>
@terrykong (Collaborator, Author) commented:

Example step summary: https://github.com/NVIDIA/reinforcer/actions/runs/14072895609#summary-39411472888

@terrykong merged commit 084d6fa into main Mar 26, 2025 (15 checks passed)
@terrykong deleted the unit-test-tracking branch March 26, 2025 05:02
@terrykong added the testing (Related to testing) label and removed the documentation (Improvements or additions to documentation) label Mar 26, 2025
@terrykong linked an issue Apr 1, 2025 that may be closed by this pull request
yfw pushed a commit that referenced this pull request Apr 2, 2025
Signed-off-by: Terry Kong <terryk@nvidia.com>
Signed-off-by: Yi-Fu Wu <yifu.wu@gmail.com>
KiddoZhu pushed a commit that referenced this pull request May 6, 2025
Signed-off-by: Terry Kong <terryk@nvidia.com>
Labels: CI (Relating to CI), testing (Related to testing)

Linked issue that may be closed by this pull request: Track GPU memory and runtime and arbitrary metrics in tests

2 participants