CNTRLPLANE-1724:Fix suite parallelism and test output capture #46

gangwgr · 2025-11-13T17:43:41Z

This commit fixes two important issues in the OTE framework:

1. Respect suite's Parallelism field in run-suite command
   - Modified pkg/cmd/cmdrun/runsuite.go to check suite.Parallelism
   - If suite.Parallelism > 0, use it instead of the command-line flag
   - This allows suites to enforce serial execution (Parallelism: 1)
   - Fixes issue where serial/slow suites ran tests in parallel
2. Capture and preserve test output (g.By() calls)
   - Modified pkg/ginkgo/util.go to properly capture test logging
   - Use io.MultiWriter to write to both buffer and stderr
   - Filter out Ginkgo reporter summary lines from JSON output
   - Preserves real-time visibility while capturing programmatic output
   - Fixes issue where test output was empty in JSON results

These changes enable proper serial test execution and detailed logging for test suites using the OTE framework.
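
The precedence rule in the first item (a positive suite.Parallelism wins over the command-line flag) can be sketched as follows. This is a minimal illustration, not the code in pkg/cmd/cmdrun/runsuite.go: the Suite struct and the effectiveParallelism helper are hypothetical names introduced here.

```go
package main

import "fmt"

// Suite is a hypothetical stand-in for the framework's suite type; only the
// Parallelism field named in the description is modeled.
type Suite struct {
	Name        string
	Parallelism int
}

// effectiveParallelism returns the suite's own value when it is set (> 0)
// and falls back to the command-line value otherwise.
func effectiveParallelism(s Suite, flagValue int) int {
	if s.Parallelism > 0 {
		return s.Parallelism
	}
	return flagValue
}

func main() {
	serial := Suite{Name: "conformance/serial", Parallelism: 1}
	parallel := Suite{Name: "conformance/parallel"} // Parallelism left at zero

	fmt.Println(effectiveParallelism(serial, 8))   // suite value wins
	fmt.Println(effectiveParallelism(parallel, 8)) // flag value is used
}
```

With this rule a suite that leaves Parallelism at zero still follows the flag, so only suites that opt in are affected.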


openshift-ci-robot · 2025-11-13T17:43:46Z

@gangwgr: This pull request references CNTRLPLANE-1724 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the sub-task to target the "4.21.0" version, but no target version was set.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

openshift-ci · 2025-11-13T17:43:48Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: gangwgr
Once this PR has been reviewed and has the lgtm label, please assign deads2k for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci · 2025-11-13T17:43:54Z

Hi @gangwgr. Thanks for your PR.

I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

wangke19 · 2025-11-17T04:36:55Z

pkg/ginkgo/util.go

+				originalWriter := ginkgo.GinkgoWriter
+				ginkgo.GinkgoWriter = captureWriter
+				defer func() {
+					ginkgo.GinkgoWriter = originalWriter
+				}()


This modifies the global ginkgo.GinkgoWriter. Remove it entirely; it is not needed. The captureWriter parameter passed to RunSpec is sufficient.

wangke19 · 2025-11-17T04:44:28Z

pkg/ginkgo/util.go

+
+				// Create a buffer to capture g.By() output AND write to stderr for real-time visibility
+				var outputBuffer bytes.Buffer
+				multiWriter := io.MultiWriter(&outputBuffer, os.Stderr)


This can be simplified by using Ginkgo's built-in separation. The current approach adds a 40-line filter function, but Ginkgo already separates test output from reporter output.

// Just capture to buffer for stderr visibility
var outputBuffer bytes.Buffer
multiWriter := io.MultiWriter(&outputBuffer, os.Stderr)
captureWriter := ginkgo.NewWriter(multiWriter)
ginkgo.GetSuite().RunSpec(spec, ..., captureWriter, ...)

// Use Ginkgo's pre-separated output - NO FILTERING NEEDED!
result.Output = summary.CapturedGinkgoWriterOutput // Already clean!
result.Error = summary.CapturedStdOutErr

Benefits:

✅ Removes 40+ lines of filtering code (deletes the filterGinkgoReporterOutput() function)

✅ More robust (uses Ginkgo's API instead of text parsing)

✅ Thread-safe (no global manipulation)

✅ Future-proof (works with any Ginkgo version)

✅ Still maintains real-time stderr visibility

- Remove redundant global GinkgoWriter modification
- Use CapturedGinkgoWriterOutput directly instead of custom filtering
- Remove 40-line filterGinkgoReporterOutput function
- Cleaner code leveraging Ginkgo's native functionality

Ginkgo already separates test output from reporter output in CapturedGinkgoWriterOutput, so the custom filtering is unnecessary. The captureWriter parameter passed to RunSpec is sufficient.

wangke19 · 2025-11-17T06:39:53Z

/lgtm

openshift-ci · 2025-11-17T06:39:56Z

@wangke19: changing LGTM is restricted to collaborators

Details

In response to this:

/lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

gangwgr · 2025-11-17T07:30:13Z

Suite results 
./cmd/cluster-kube-apiserver-operator-tests/cluster-kube-apiserver-operator-tests-ext \
    run-suite openshift/cluster-kube-apiserver-operator/conformance/parallel \
    --junit-path=/tmp/junit.xml
  Running Suite:  - /Users/rgangwar/Downloads/backupoffice/cluster-kube-apiserver-operator
  ========================================================================================
  Random Seed: 1763363798 - will randomize all specs

  Will run 1 of 1 specs
  ------------------------------
  [Jira:kube-apiserver][sig-api-machinery] parallel sanity test should always pass [Suite:openshift/cluster-kube-apiserver-operator/conformance/parallel]
  github.com/openshift/cluster-kube-apiserver-operator/test/e2e/debug.go:9
    STEP: Executing a simple parallel test @ 11/17/25 12:46:38.422
    STEP: Parallel test completed successfully @ 11/17/25 12:46:38.422
  • [0.001 seconds]
  ------------------------------

  Ran 1 of 1 Specs in 0.001 seconds
  SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 0 Skipped
[
  {
    "name": "[Jira:kube-apiserver][sig-api-machinery] parallel sanity test should always pass [Suite:openshift/cluster-kube-apiserver-operator/conformance/parallel]",
    "lifecycle": "blocking",
    "duration": 1,
    "startTime": "2025-11-17 07:16:38.421233 UTC",
    "endTime": "2025-11-17 07:16:38.422781 UTC",
    "result": "passed",
    "output": "  STEP: Executing a simple parallel test @ 11/17/25 12:46:38.422\n  STEP: Parallel test completed successfully @ 11/17/25 12:46:38.422\n"
  }
]

./cmd/cluster-kube-apiserver-operator-tests/cluster-kube-apiserver-operator-tests-ext \
    run-suite openshift/cluster-kube-apiserver-operator/conformance/serial \
    --junit-path=/tmp/junit.xml
  Running Suite:  - /Users/rgangwar/Downloads/backupoffice/cluster-kube-apiserver-operator
  ========================================================================================
  Random Seed: 1763363838 - will randomize all specs

  Will run 1 of 1 specs
  ------------------------------
  [Jira:kube-apiserver][sig-api-machinery] serial sanity test should always pass [Serial][Suite:openshift/cluster-kube-apiserver-operator/conformance/serial]
  github.com/openshift/cluster-kube-apiserver-operator/test/e2e/debug.go:17
    STEP: Executing a simple serial test @ 11/17/25 12:47:18.543
    STEP: Serial test completed successfully @ 11/17/25 12:47:18.543
  • [0.001 seconds]
  ------------------------------

  Ran 1 of 1 Specs in 0.001 seconds
  SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 0 Skipped
[
  {
    "name": "[Jira:kube-apiserver][sig-api-machinery] serial sanity test should always pass [Serial][Suite:openshift/cluster-kube-apiserver-operator/conformance/serial]",
    "lifecycle": "blocking",
    "duration": 1,
    "startTime": "2025-11-17 07:17:18.542010 UTC",
    "endTime": "2025-11-17 07:17:18.543518 UTC",
    "result": "passed",
    "output": "  STEP: Executing a simple serial test @ 11/17/25 12:47:18.543\n  STEP: Serial test completed successfully @ 11/17/25 12:47:18.543\n"
  }
]

./cmd/cluster-kube-apiserver-operator-tests/cluster-kube-apiserver-operator-tests-ext run-suite "openshift/cluster-kube-apiserver-operator/optional/slow"
[
  {
    "name": "[Jira:kube-apiserver][sig-api-machinery] slow suite sanity tests should pass slow test case 1 [Slow][Suite:openshift/cluster-kube-apiserver-operator/optional/slow]",
    "lifecycle": "blocking",
    "duration": 102,
    "startTime": "2025-11-17 07:25:40.572744 UTC",
    "endTime": "2025-11-17 07:25:40.674975 UTC",
    "result": "passed",
    "output": "  STEP: Starting slow test case 1 @ 11/17/25 12:55:40.573\n  STEP: Slow test case 1 completed with sleep simulation @ 11/17/25 12:55:40.674\n"
  },
  {
    "name": "[Jira:kube-apiserver][sig-api-machinery] slow suite sanity tests should pass slow test case 2 with timeout [Slow][Timeout:5m][Suite:openshift/cluster-kube-apiserver-operator/optional/slow]",
    "lifecycle": "blocking",
    "duration": 51,
    "startTime": "2025-11-17 07:25:40.686199 UTC",
    "endTime": "2025-11-17 07:25:40.737800 UTC",
    "result": "passed",
    "output": "  STEP: Starting slow test case 2 with timeout annotation @ 11/17/25 12:55:40.686\n  STEP: Slow test case 2 with timeout completed @ 11/17/25 12:55:40.737\n"
  },
  {
    "name": "[Jira:kube-apiserver][sig-api-machinery] slow suite sanity tests should pass slow serial test case 3 [Slow][Serial][Suite:openshift/cluster-kube-apiserver-operator/optional/slow]",
    "lifecycle": "blocking",
    "duration": 81,
    "startTime": "2025-11-17 07:25:40.748631 UTC",
    "endTime": "2025-11-17 07:25:40.830277 UTC",
    "result": "passed",
    "output": "  STEP: Starting slow serial test case 3 @ 11/17/25 12:55:40.748\n  STEP: Performing sequential operation 1 @ 11/17/25 12:55:40.748\n  STEP: Performing sequential operation 2 @ 11/17/25 12:55:40.83\n  STEP: Slow serial test case 3 completed successfully @ 11/17/25 12:55:40.83\n"
  },
  {
    "name": "[Jira:kube-apiserver][sig-api-machinery] slow suite sanity tests should pass slow test case 4 with multiple steps [Slow][Suite:openshift/cluster-kube-apiserver-operator/optional/slow]",
    "lifecycle": "blocking",
    "duration": 61,
    "startTime": "2025-11-17 07:25:40.841653 UTC",
    "endTime": "2025-11-17 07:25:40.903343 UTC",
    "result": "passed",
    "output": "  STEP: Step 1: Initialize test data @ 11/17/25 12:55:40.841\n  STEP: Step 2: Process test data @ 11/17/25 12:55:40.841\n  STEP: Step 3: Validate test data @ 11/17/25 12:55:40.903\n  STEP: Slow test case 4 completed with all steps @ 11/17/25 12:55:40.903\n"
  },
  {
    "name": "[Jira:kube-apiserver][sig-api-machinery] slow suite sanity tests should pass slow timeout test case 5 [Slow][Timeout:10m][Suite:openshift/cluster-kube-apiserver-operator/optional/slow]",
    "lifecycle": "blocking",
    "duration": 121,
    "startTime": "2025-11-17 07:25:40.913644 UTC",
    "endTime": "2025-11-17 07:25:41.035150 UTC",
    "result": "passed",
    "output": "  STEP: Starting slow timeout test case 5 @ 11/17/25 12:55:40.913\n  STEP: Simulating long-running operation @ 11/17/25 12:55:40.913\n  STEP: Long-running operation completed @ 11/17/25 12:55:41.034\n  STEP: Slow timeout test case 5 finished successfully @ 11/17/25 12:55:41.035\n"
  },
  {
    "name": "[Jira:kube-apiserver][sig-api-machinery] slow suite sanity tests should pass slow serial test case 6 with validation [Slow][Serial][Suite:openshift/cluster-kube-apiserver-operator/optional/slow]",
    "lifecycle": "blocking",
    "duration": 92,
    "startTime": "2025-11-17 07:25:41.047635 UTC",
    "endTime": "2025-11-17 07:25:41.139679 UTC",
    "result": "passed",
    "output": "  STEP: Starting slow serial test case 6 @ 11/17/25 12:55:41.047\n  STEP: Validating system state before test @ 11/17/25 12:55:41.048\n  STEP: Executing critical sequential operation @ 11/17/25 12:55:41.048\n  STEP: Validating system state after test @ 11/17/25 12:55:41.139\n  STEP: Slow serial test case 6 completed with all validations @ 11/17/25 12:55:41.139\n"
  }
]

This commit adds a new serial suite for testing sequential execution and includes comprehensive debug tests to validate the output capture improvements. The changes also remove suite-level parallelism override to ensure consistent behavior across test runs.

rioliu-rh · 2025-11-17T11:03:46Z

Root Cause Investigation - Critical Finding

I tested the current code empirically with tests that actually produce output (using By() calls), and discovered something important:

The Current Code ALREADY WORKS

Test results:

# Test WITH By() calls - output IS captured:
./example-tests run-test "[sig-testing] openshift-tests-extension TEMP: test with By() output"
→ output_length: 226
→ Output contains all STEP messages from By() calls ✅

# Test WITHOUT By() calls - no output (expected):
./example-tests run-test "[sig-testing] openshift-tests-extension ordered should run beforeAll once"
→ output_length: 0
→ No output because test has no By() calls (just assertions) ✅

Both run-test and run-suite capture output correctly with the current code.
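
A check like the output_length comparison above could be automated along these lines. This is a sketch under assumptions: it expects only the JSON array (the shape shown in the suite results earlier in this thread, without the Ginkgo console lines that precede it), models just three of the fields, and uses hypothetical names (testResult, emptyOutputs).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// testResult models only the fields needed here, named after the JSON keys
// visible in the results posted in this thread.
type testResult struct {
	Name   string `json:"name"`
	Result string `json:"result"`
	Output string `json:"output"`
}

// emptyOutputs returns the names of tests whose "output" field is empty
// or missing.
func emptyOutputs(data []byte) ([]string, error) {
	var results []testResult
	if err := json.Unmarshal(data, &results); err != nil {
		return nil, err
	}
	var names []string
	for _, r := range results {
		if r.Output == "" {
			names = append(names, r.Name)
		}
	}
	return names, nil
}

func main() {
	// Illustrative sample data, not real results.
	sample := []byte(`[
	  {"name": "with steps", "result": "passed", "output": "  STEP: one\n"},
	  {"name": "assertions only", "result": "passed"}
	]`)
	names, err := emptyOutputs(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(names)
}
```

An empty output is not by itself a bug: as noted above, a test with no By() calls legitimately produces none.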

Why the PR Might Exist

Possible explanations:

1. The bug was already fixed - Commit 00b85ac added ginkgo.GinkgoLogr = GinkgoLogrFunc(ginkgo.GinkgoWriter), which might have resolved the issue
2. Misdiagnosis - The PR author tested with tests that don't produce output (no By() calls)
3. Different scenario - There's a specific edge case we haven't tested

Impact on PR Review

Given that:

✅ Current code captures output correctly
❌ PR introduces unnecessary complexity (MultiWriter, outputBuffer, reporterConfigCopy)
❌ PR has architectural issues (suite.Parallelism override)

Recommendation:

1. Close or reject this PR
2. If there IS a real bug, the PR author needs to provide a reproducible test case showing empty output with the current code
3. The example tests don't have By() calls, so they can't be used to verify this bug

Would you like me to provide my test code so you can verify the output capture bug doesn't exist in the current codebase?

openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Nov 13, 2025

openshift-ci bot requested review from deads2k and jupierce November 13, 2025 17:43

openshift-ci bot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Nov 13, 2025

wangke19 reviewed Nov 17, 2025

gangwgr force-pushed the fix-parallelism-and-logging branch from 007f65f to 0259757 Compare November 17, 2025 08:44

gangwgr closed this Nov 20, 2025
