@gangwgr gangwgr commented Nov 13, 2025

This commit fixes two important issues in the OTE framework:

  1. Respect suite's Parallelism field in run-suite command

    • Modified pkg/cmd/cmdrun/runsuite.go to check suite.Parallelism
    • If suite.Parallelism > 0, use it instead of command-line flag
    • This allows suites to enforce serial execution (Parallelism: 1)
    • Fixes issue where serial/slow suites ran tests in parallel
  2. Capture and preserve test output (g.By() calls)

    • Modified pkg/ginkgo/util.go to properly capture test logging
    • Use io.MultiWriter to write to both buffer and stderr
    • Filter out Ginkgo reporter summary lines from JSON output
    • Preserves real-time visibility while capturing programmatic output
    • Fixes issue where test output was empty in JSON results

These changes enable proper serial test execution and detailed logging for test suites using the OTE framework.
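
The parallelism rule from item 1 can be sketched as follows. This is a minimal illustration, not the code in pkg/cmd/cmdrun/runsuite.go: the `Suite` struct and the `effectiveParallelism` helper are hypothetical names, and only the `Parallelism` field and the "use it when > 0" rule come from the description above.

```go
package main

import "fmt"

// Suite is a hypothetical stand-in for the framework's suite type;
// only the Parallelism field is taken from the PR description.
type Suite struct {
	Name        string
	Parallelism int
}

// effectiveParallelism returns the suite's own setting when it is
// positive and otherwise falls back to the command-line value.
func effectiveParallelism(suite Suite, flagValue int) int {
	if suite.Parallelism > 0 {
		return suite.Parallelism
	}
	return flagValue
}

func main() {
	serial := Suite{Name: "conformance/serial", Parallelism: 1}
	parallel := Suite{Name: "conformance/parallel"}

	fmt.Println(effectiveParallelism(serial, 10))   // suite value wins
	fmt.Println(effectiveParallelism(parallel, 10)) // flag value is used
}
```

With this rule a suite that sets Parallelism: 1 always runs serially, whatever value the command-line flag carries.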

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Nov 13, 2025
openshift-ci-robot commented Nov 13, 2025

@gangwgr: This pull request references CNTRLPLANE-1724 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the sub-task to target the "4.21.0" version, but no target version was set.


@openshift-ci openshift-ci bot requested review from deads2k and jupierce November 13, 2025 17:43
openshift-ci bot commented Nov 13, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: gangwgr
Once this PR has been reviewed and has the lgtm label, please assign deads2k for approval. For more information see the Code Review Process.


openshift-ci bot commented Nov 13, 2025

Hi @gangwgr. Thanks for your PR.

I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.


@openshift-ci openshift-ci bot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Nov 13, 2025
Comment on lines 98 to 102
originalWriter := ginkgo.GinkgoWriter
ginkgo.GinkgoWriter = captureWriter
defer func() {
	ginkgo.GinkgoWriter = originalWriter
}()


This code modifies the global ginkgo.GinkgoWriter. Remove it entirely; it is not needed. The captureWriter parameter passed to RunSpec is sufficient.
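
To illustrate why an explicit writer parameter is preferable to swapping a package-level variable, here is a small standard-library sketch. `runSpec` and `runAll` are hypothetical names and do not correspond to Ginkgo or OTE functions. Each concurrent run receives its own buffer, so there is no shared state to save and restore.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"sync"
)

// runSpec is a hypothetical stand-in for running one spec; it logs only
// to the writer it is given and touches no package-level state.
func runSpec(name string, w io.Writer) {
	fmt.Fprintf(w, "STEP: running %s\n", name)
}

// runAll runs every spec in its own goroutine with its own buffer and
// returns the captured output keyed by spec name.
func runAll(names []string) map[string]string {
	buffers := make([]bytes.Buffer, len(names))
	var wg sync.WaitGroup
	for i, name := range names {
		wg.Add(1)
		go func(i int, name string) {
			defer wg.Done()
			runSpec(name, &buffers[i])
		}(i, name)
	}
	wg.Wait()

	out := make(map[string]string, len(names))
	for i, name := range names {
		out[name] = buffers[i].String()
	}
	return out
}

func main() {
	for name, output := range runAll([]string{"a", "b", "c"}) {
		fmt.Printf("%s -> %q\n", name, output)
	}
}
```

Had the goroutines instead reassigned one shared writer variable and restored it with defer, their output could interleave or land in the wrong buffer.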

Author


Updated


// Create a buffer to capture g.By() output AND write to stderr for real-time visibility
var outputBuffer bytes.Buffer
multiWriter := io.MultiWriter(&outputBuffer, os.Stderr)


Simplify this by using Ginkgo's built-in separation. The current approach adds a 40-line filter function, but Ginkgo already separates test output from reporter output:

// Capture to a buffer while still mirroring to stderr for real-time visibility
var outputBuffer bytes.Buffer
multiWriter := io.MultiWriter(&outputBuffer, os.Stderr)
captureWriter := ginkgo.NewWriter(multiWriter)

ginkgo.GetSuite().RunSpec(spec, ..., captureWriter, ...)

// Use Ginkgo's pre-separated output; no filtering needed
result.Output = summary.CapturedGinkgoWriterOutput // already clean
result.Error = summary.CapturedStdOutErr


@wangke19 wangke19 Nov 17, 2025


Benefits:

  • ✅ Removes 40+ lines of filtering code (the filterGinkgoReporterOutput() function can be deleted)
  • ✅ More robust (uses Ginkgo's API instead of text parsing)
  • ✅ Thread-safe (no global manipulation)
  • ✅ Future-proof (works with any Ginkgo version)
  • ✅ Still maintains real-time stderr visibility

Author


Updated

   - Remove redundant global GinkgoWriter modification
   - Use CapturedGinkgoWriterOutput directly instead of custom filtering
   - Remove 40-line filterGinkgoReporterOutput function
   - Cleaner code leveraging Ginkgo's native functionality

   Ginkgo already separates test output from reporter output in
   CapturedGinkgoWriterOutput, so the custom filtering is unnecessary.
   The captureWriter parameter passed to RunSpec is sufficient.
@wangke19

/lgtm


openshift-ci bot commented Nov 17, 2025

@wangke19: changing LGTM is restricted to collaborators


Author

gangwgr commented Nov 17, 2025

Suite results 
./cmd/cluster-kube-apiserver-operator-tests/cluster-kube-apiserver-operator-tests-ext \
    run-suite openshift/cluster-kube-apiserver-operator/conformance/parallel \
    --junit-path=/tmp/junit.xml
  Running Suite:  - /Users/rgangwar/Downloads/backupoffice/cluster-kube-apiserver-operator
  ========================================================================================
  Random Seed: 1763363798 - will randomize all specs

  Will run 1 of 1 specs
  ------------------------------
  [Jira:kube-apiserver][sig-api-machinery] parallel sanity test should always pass [Suite:openshift/cluster-kube-apiserver-operator/conformance/parallel]
  github.com/openshift/cluster-kube-apiserver-operator/test/e2e/debug.go:9
    STEP: Executing a simple parallel test @ 11/17/25 12:46:38.422
    STEP: Parallel test completed successfully @ 11/17/25 12:46:38.422
  • [0.001 seconds]
  ------------------------------

  Ran 1 of 1 Specs in 0.001 seconds
  SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 0 Skipped
[
  {
    "name": "[Jira:kube-apiserver][sig-api-machinery] parallel sanity test should always pass [Suite:openshift/cluster-kube-apiserver-operator/conformance/parallel]",
    "lifecycle": "blocking",
    "duration": 1,
    "startTime": "2025-11-17 07:16:38.421233 UTC",
    "endTime": "2025-11-17 07:16:38.422781 UTC",
    "result": "passed",
    "output": "  STEP: Executing a simple parallel test @ 11/17/25 12:46:38.422\n  STEP: Parallel test completed successfully @ 11/17/25 12:46:38.422\n"
  }
]
rgangwar@rgangwar-mac cluster-kube-apiserver-operator % ./cmd/cluster-kube-apiserver-operator-tests/cluster-kube-apiserver-operator-tests-ext \
    run-suite openshift/cluster-kube-apiserver-operator/conformance/serial \
    --junit-path=/tmp/junit.xml
  Running Suite:  - /Users/rgangwar/Downloads/backupoffice/cluster-kube-apiserver-operator
  ========================================================================================
  Random Seed: 1763363838 - will randomize all specs

  Will run 1 of 1 specs
  ------------------------------
  [Jira:kube-apiserver][sig-api-machinery] serial sanity test should always pass [Serial][Suite:openshift/cluster-kube-apiserver-operator/conformance/serial]
  github.com/openshift/cluster-kube-apiserver-operator/test/e2e/debug.go:17
    STEP: Executing a simple serial test @ 11/17/25 12:47:18.543
    STEP: Serial test completed successfully @ 11/17/25 12:47:18.543
  • [0.001 seconds]
  ------------------------------

  Ran 1 of 1 Specs in 0.001 seconds
  SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 0 Skipped
[
  {
    "name": "[Jira:kube-apiserver][sig-api-machinery] serial sanity test should always pass [Serial][Suite:openshift/cluster-kube-apiserver-operator/conformance/serial]",
    "lifecycle": "blocking",
    "duration": 1,
    "startTime": "2025-11-17 07:17:18.542010 UTC",
    "endTime": "2025-11-17 07:17:18.543518 UTC",
    "result": "passed",
    "output": "  STEP: Executing a simple serial test @ 11/17/25 12:47:18.543\n  STEP: Serial test completed successfully @ 11/17/25 12:47:18.543\n"
  }
]
rgangwar@rgangwar-mac cluster-kube-apiserver-operator % ./cmd/cluster-kube-apiserver-operator-tests/cluster-kube-apiserver-operator-tests-ext run-suite "openshift/cluster-kube-apiserver-operator/optional/slow"
[
  {
    "name": "[Jira:kube-apiserver][sig-api-machinery] slow suite sanity tests should pass slow test case 1 [Slow][Suite:openshift/cluster-kube-apiserver-operator/optional/slow]",
    "lifecycle": "blocking",
    "duration": 102,
    "startTime": "2025-11-17 07:25:40.572744 UTC",
    "endTime": "2025-11-17 07:25:40.674975 UTC",
    "result": "passed",
    "output": "  STEP: Starting slow test case 1 @ 11/17/25 12:55:40.573\n  STEP: Slow test case 1 completed with sleep simulation @ 11/17/25 12:55:40.674\n"
  },
  {
    "name": "[Jira:kube-apiserver][sig-api-machinery] slow suite sanity tests should pass slow test case 2 with timeout [Slow][Timeout:5m][Suite:openshift/cluster-kube-apiserver-operator/optional/slow]",
    "lifecycle": "blocking",
    "duration": 51,
    "startTime": "2025-11-17 07:25:40.686199 UTC",
    "endTime": "2025-11-17 07:25:40.737800 UTC",
    "result": "passed",
    "output": "  STEP: Starting slow test case 2 with timeout annotation @ 11/17/25 12:55:40.686\n  STEP: Slow test case 2 with timeout completed @ 11/17/25 12:55:40.737\n"
  },
  {
    "name": "[Jira:kube-apiserver][sig-api-machinery] slow suite sanity tests should pass slow serial test case 3 [Slow][Serial][Suite:openshift/cluster-kube-apiserver-operator/optional/slow]",
    "lifecycle": "blocking",
    "duration": 81,
    "startTime": "2025-11-17 07:25:40.748631 UTC",
    "endTime": "2025-11-17 07:25:40.830277 UTC",
    "result": "passed",
    "output": "  STEP: Starting slow serial test case 3 @ 11/17/25 12:55:40.748\n  STEP: Performing sequential operation 1 @ 11/17/25 12:55:40.748\n  STEP: Performing sequential operation 2 @ 11/17/25 12:55:40.83\n  STEP: Slow serial test case 3 completed successfully @ 11/17/25 12:55:40.83\n"
  },
  {
    "name": "[Jira:kube-apiserver][sig-api-machinery] slow suite sanity tests should pass slow test case 4 with multiple steps [Slow][Suite:openshift/cluster-kube-apiserver-operator/optional/slow]",
    "lifecycle": "blocking",
    "duration": 61,
    "startTime": "2025-11-17 07:25:40.841653 UTC",
    "endTime": "2025-11-17 07:25:40.903343 UTC",
    "result": "passed",
    "output": "  STEP: Step 1: Initialize test data @ 11/17/25 12:55:40.841\n  STEP: Step 2: Process test data @ 11/17/25 12:55:40.841\n  STEP: Step 3: Validate test data @ 11/17/25 12:55:40.903\n  STEP: Slow test case 4 completed with all steps @ 11/17/25 12:55:40.903\n"
  },
  {
    "name": "[Jira:kube-apiserver][sig-api-machinery] slow suite sanity tests should pass slow timeout test case 5 [Slow][Timeout:10m][Suite:openshift/cluster-kube-apiserver-operator/optional/slow]",
    "lifecycle": "blocking",
    "duration": 121,
    "startTime": "2025-11-17 07:25:40.913644 UTC",
    "endTime": "2025-11-17 07:25:41.035150 UTC",
    "result": "passed",
    "output": "  STEP: Starting slow timeout test case 5 @ 11/17/25 12:55:40.913\n  STEP: Simulating long-running operation @ 11/17/25 12:55:40.913\n  STEP: Long-running operation completed @ 11/17/25 12:55:41.034\n  STEP: Slow timeout test case 5 finished successfully @ 11/17/25 12:55:41.035\n"
  },
  {
    "name": "[Jira:kube-apiserver][sig-api-machinery] slow suite sanity tests should pass slow serial test case 6 with validation [Slow][Serial][Suite:openshift/cluster-kube-apiserver-operator/optional/slow]",
    "lifecycle": "blocking",
    "duration": 92,
    "startTime": "2025-11-17 07:25:41.047635 UTC",
    "endTime": "2025-11-17 07:25:41.139679 UTC",
    "result": "passed",
    "output": "  STEP: Starting slow serial test case 6 @ 11/17/25 12:55:41.047\n  STEP: Validating system state before test @ 11/17/25 12:55:41.048\n  STEP: Executing critical sequential operation @ 11/17/25 12:55:41.048\n  STEP: Validating system state after test @ 11/17/25 12:55:41.139\n  STEP: Slow serial test case 6 completed with all validations @ 11/17/25 12:55:41.139\n"
  }
]

This commit adds a new serial suite for testing sequential execution and includes comprehensive debug tests to validate the output capture improvements. The changes also remove the suite-level parallelism override to ensure consistent behavior across test runs.
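
A quick way to check results like the ones above for missing output is to decode the JSON and list entries whose "output" field is empty. The sketch below assumes only the field names visible in the results shown here; `testResult` and `emptyOutputs` are hypothetical names, not part of the OTE framework.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// testResult mirrors a subset of the fields visible in the JSON above.
type testResult struct {
	Name   string `json:"name"`
	Result string `json:"result"`
	Output string `json:"output"`
}

// emptyOutputs returns the names of results whose output field is empty.
func emptyOutputs(data []byte) ([]string, error) {
	var results []testResult
	if err := json.Unmarshal(data, &results); err != nil {
		return nil, err
	}
	var names []string
	for _, r := range results {
		if r.Output == "" {
			names = append(names, r.Name)
		}
	}
	return names, nil
}

func main() {
	// Illustrative input only; not taken from a real run.
	data := []byte(`[
	  {"name": "with steps", "result": "passed", "output": "  STEP: one\n"},
	  {"name": "no steps", "result": "passed", "output": ""}
	]`)
	names, err := emptyOutputs(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(names)
}
```

An empty output is not necessarily a bug: a test that makes no By() calls and writes nothing to the Ginkgo writer is expected to produce none.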
@gangwgr gangwgr force-pushed the fix-parallelism-and-logging branch from 007f65f to 0259757 Compare November 17, 2025 08:44
@rioliu-rh
Contributor

Root Cause Investigation - Critical Finding

I tested the current code empirically with tests that actually produce output (using By() calls), and discovered something important:

The Current Code ALREADY WORKS

Test results:

# Test WITH By() calls - output IS captured:
./example-tests run-test "[sig-testing] openshift-tests-extension TEMP: test with By() output"
→ output_length: 226
→ Output contains all STEP messages from By() calls ✅

# Test WITHOUT By() calls - no output (expected):
./example-tests run-test "[sig-testing] openshift-tests-extension ordered should run beforeAll once"
→ output_length: 0
→ No output because test has no By() calls (just assertions) ✅

Both run-test and run-suite capture output correctly with the current code.

Why the PR Might Exist

Possible explanations:

  1. The bug was already fixed - Commit 00b85ac added ginkgo.GinkgoLogr = GinkgoLogrFunc(ginkgo.GinkgoWriter) which might have resolved the issue
  2. Misdiagnosis - The PR author tested with tests that don't produce output (no By() calls)
  3. Different scenario - There's a specific edge case we haven't tested

Impact on PR Review

Given that:

  • ✅ Current code captures output correctly
  • ❌ PR introduces unnecessary complexity (MultiWriter, outputBuffer, reporterConfigCopy)
  • ❌ PR has architectural issues (suite.Parallelism override)

Recommendation:

  • Close or reject this PR
  • If there IS a real bug, the PR author needs to provide a reproducible test case showing empty output with the current code
  • The example tests don't have By() calls, so they can't be used to verify this bug

Would you like me to provide my test code so you can verify the output capture bug doesn't exist in the current codebase?

@gangwgr gangwgr closed this Nov 20, 2025