OCPBUGS-62964: Synchronize From Upstream Repositories #548
…#2308)

Analyze baseline memory usage patterns and adjust Prometheus alert thresholds to eliminate false positives while maintaining sensitivity to real issues. This is based on memory profiling done against BoxcutterRuntime, which has increased memory load.

**Memory Analysis:**
- Peak RSS: 107.9MB, Peak Heap: 54.74MB during e2e tests
- Memory stabilizes at 106K heap (heap19-21 show 0K growth for 3 snapshots)
- Conclusion: NOT a memory leak, but normal operational behavior

**Memory Breakdown:**
- JSON Deserialization: 24.64MB (45%) - inherent to OLM's dynamic nature
- Informer Lists: 9.87MB (18%) - optimization possible via field selectors
- OpenAPI Schemas: 3.54MB (6%) - already optimized (73% reduction)
- Runtime Overhead: 53.16MB (49%) - normal for Go applications

**Alert Threshold Updates:**
- operator-controller-memory-growth: 100kB/sec → 200kB/sec
- operator-controller-memory-usage: 100MB → 150MB
- catalogd-memory-growth: 100kB/sec → 200kB/sec

**Rationale:**
Baseline profiling showed that 132.4kB/sec episodic growth during informer sync and 107.9MB peak usage are normal. Previous thresholds caused false positive alerts during normal e2e test execution.

**Verification:**
- Baseline test (old thresholds): 2 alerts triggered (false positives)
- Verification test (new thresholds): 0 alerts triggered ✅
- Memory patterns remain consistent (~55MB heap, 79-171MB RSS)
- Transient spikes don't trigger alerts due to "for: 5m" clause

**Recommendation:**
Accept 107.9MB as normal operational behavior for test/development environments. Production deployments may need different thresholds based on workload characteristics (number of resources, reconciliation frequency).

**Non-viable Optimizations:**
- Cannot replace unstructured with typed clients (breaks OLM flexibility)
- Cannot reduce runtime overhead (inherent to Go)
- JSON deserialization is unavoidable for dynamic resource handling

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Signed-off-by: Todd Short <[email protected]>
Co-authored-by: Claude <[email protected]>
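To make the threshold change above concrete, here is a minimal Go sketch that compares the profiled figures (132.4kB/sec growth, 107.9MB peak RSS) against the old and new thresholds. It is not the Prometheus rule itself: the `Sample` type and the `growthRate` and `exceeds` helpers are hypothetical names introduced for illustration, decimal units (1kB = 1000 bytes) are an assumption, and the "for: 5m" hold period is not modeled.

```go
package main

import "fmt"

// Sample is a hypothetical memory reading: seconds since start and RSS in bytes.
type Sample struct {
	Seconds  float64
	RSSBytes float64
}

// growthRate returns the average growth in bytes per second between the first
// and last sample. It returns 0 when there are fewer than two samples or no
// time has elapsed.
func growthRate(samples []Sample) float64 {
	if len(samples) < 2 {
		return 0
	}
	first, last := samples[0], samples[len(samples)-1]
	elapsed := last.Seconds - first.Seconds
	if elapsed <= 0 {
		return 0
	}
	return (last.RSSBytes - first.RSSBytes) / elapsed
}

// exceeds reports whether value is strictly above threshold.
func exceeds(value, threshold float64) bool {
	return value > threshold
}

func main() {
	// Assumption: decimal units, as a simplification for this sketch.
	const kB, MB = 1000.0, 1000.0 * 1000.0

	growth := 132.4 * kB // episodic growth seen during informer sync
	peak := 107.9 * MB   // peak RSS seen during e2e tests

	fmt.Println("growth over old/new threshold:", exceeds(growth, 100*kB), exceeds(growth, 200*kB))
	fmt.Println("usage over old/new threshold:", exceeds(peak, 100*MB), exceeds(peak, 150*MB))
}
```

Both profiled values sit above the old thresholds and below the new ones, which matches the reported drop from 2 alerts to 0.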
Bumps [github.com/containerd/containerd](https://github.com/containerd/containerd) from 1.7.28 to 1.7.29.
- [Release notes](https://github.com/containerd/containerd/releases)
- [Changelog](https://github.com/containerd/containerd/blob/main/RELEASES.md)
- [Commits](containerd/containerd@v1.7.28...v1.7.29)

---
updated-dependencies:
- dependency-name: github.com/containerd/containerd
  dependency-version: 1.7.29
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…t ) (#2306)" (#2312) This reverts commit b470947.
|
@openshift-bot: This pull request explicitly references no jira issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@openshift-bot: GitHub didn't allow me to request PR reviews from the following users: openshift/openshift-team-operator-framework. Note that only openshift members and repo collaborators can review this PR, and authors cannot review their own PRs. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
c982efc to 37275f1
|
/verified bypass |
|
@jianzhangbjz: The In response to this:
- Use Error level for all error conditions (validation, collision, engine failures)
- Move verbose success reports to V(1) debug mode
- Include full diagnostic reports when errors occur

Co-authored-by: Todd Short <[email protected]>
Introduces automated memory and CPU profiling for e2e tests with:
- Automatic port-forwarding to Kubernetes deployment pprof endpoints
- Configurable periodic heap and CPU profile collection with differential timing
- Analysis report generation with growth metrics and top allocators
- Makefile targets: start-profiling, stop-profiling, analyze-profiles

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Signed-off-by: Todd Short <[email protected]>
Co-authored-by: Claude <[email protected]>
|
/close |
|
@tmshort: Closed this PR. In response to this:
|
/repoen |
|
/reopen |
|
@tmshort: Reopened this PR. In response to this:
|
@openshift-bot: This pull request explicitly references no jira issue. In response to this:
|
/lgtm |
|
My mistake, it seems to be missing #544 |
|
/hold due to missing #544 |
…uess and waiting for k8s cleanups Co-Author: [email protected]
…nts ( Follow-Up of: 714977c )
… uninstall Assisted-by: Cursor
… format Fix k8s.io/kubernetes replace version from v1.30.1-0... to v0.0.0-... format to resolve bumper tool verification failures. Add hack/ocp-replace.sh script to manage OCP fork replaces properly. Assisted-by: Cursor
76972cb to 5de73b8
|
New changes are detected. LGTM label has been removed. |
|
/hold cancel /lgtm @tmshort now we can move forward :-) |
|
@openshift-bot: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
Now, we just need this verified... |
|
/label qe-approved |
|
@bandrade: This PR has been marked as verified by In response to this:
|
/retitle: OCPBUGS-62964: Synchronize From Upstream Repositories |
|
@openshift-bot: Jira Issue OCPBUGS-62964 is in an unrecognized state (ON_QA) and will not be moved to the MODIFIED state. In response to this:
|
/retitle OCPBUGS-62964: Synchronize From Upstream Repositories |
The downstream repository has been updated with the following upstream commits:

The vendor/ directory has been updated and the following commits were carried:

This pull request is expected to merge without any human intervention. If tests are failing here, changes must land upstream to fix any issues so that future downstreaming efforts succeed.
/cc @openshift/openshift-team-operator-framework