Merged
20 changes: 17 additions & 3 deletions .github/workflows/gh-aw-test-improvement.lock.yml

Some generated files are not rendered by default.

20 changes: 17 additions & 3 deletions .github/workflows/gh-aw-test-improver.lock.yml

Some generated files are not rendered by default.

18 changes: 16 additions & 2 deletions .github/workflows/gh-aw-test-improver.md
@@ -126,7 +126,21 @@ Identify under-tested code paths, add focused tests, and remove or consolidate d
- Run the most relevant test command(s). **All tests — new and existing — must pass.** If the full suite is too heavy, run targeted tests.
- If required commands, tests, or coverage cannot be run, call `noop`. Do not open a PR with untested test code.

## Step 5: Quality Gate — Test Value Check
## Step 5: Stability check — run new tests repeatedly

New tests that pass once may still be flaky. Before filing a PR, verify stability by running each new or modified test multiple times.

1. Run each new or modified test **at least 5 times** in sequence and confirm every run passes.
   - Use the test framework's built-in repeat/count flag when available (e.g., `go test -count=5`, `pytest -x --count 5` with `pytest-repeat`, `--repeat 5` in Jest/Vitest).
   - If no built-in mechanism exists, use a simple shell loop: `for i in $(seq 1 5); do <test-command> || exit 1; done`
2. If any run fails intermittently, investigate the root cause before proceeding. Common sources of flakiness:
   - Reliance on timing, sleep, or wall-clock assertions
   - Shared mutable state between test cases
   - Non-deterministic iteration order (e.g., map/set ordering)
   - Dependence on external services or network
3. If the test cannot be made reliably stable, do not include it in the PR. Call `noop` if no stable tests remain.
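
The repeat-until-failure loop from item 1 could be wrapped in a small helper. This is only a sketch: `run_repeatedly` is a hypothetical name and not part of the workflow or of any test framework, and it assumes a POSIX-compatible shell. It runs the given command a fixed number of times and stops at the first non-zero exit, so an intermittent failure is reported with the run number on which it occurred.

```shell
# Hypothetical helper (not part of the workflow): run a command N times,
# stopping at the first failing run.
# Usage: run_repeatedly <runs> <command> [args...]
run_repeatedly() {
  _rr_runs="$1"
  shift
  _rr_i=1
  while [ "$_rr_i" -le "$_rr_runs" ]; do
    # Run the test command exactly as given; any non-zero exit counts as a failure.
    if ! "$@"; then
      echo "run $_rr_i of $_rr_runs failed: $*" >&2
      return 1
    fi
    _rr_i=$((_rr_i + 1))
  done
  echo "all $_rr_runs runs passed: $*"
}
```

For example, `run_repeatedly 5 go test -run TestExample ./pkg/example` would run one targeted Go test five times in sequence (the test name and package path here are placeholders). Where the framework offers its own count flag, that flag is usually preferable because it avoids repeated process start-up.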

## Step 6: Quality Gate — Test Value Check

Before creating the PR, evaluate each new test:

@@ -138,7 +152,7 @@ Before creating the PR, evaluate each new test:

If the tests don't pass this bar, call `noop`. Low-value tests are worse than no tests — they create maintenance burden and false confidence.

## Step 6: Create the PR
## Step 7: Create the PR

1. Commit the changes locally.
2. Call `create_pull_request` with: