@openshift-cherrypick-robot

This is an automated cherry-pick of #864

/assign Miciah

Miciah added 3 commits April 24, 2023 16:16
Reformat ensureInternalIngressControllerService to follow the standard
pattern for "ensure" functions.

* pkg/operator/controller/ingress/controller.go (ensureIngressController):
Expect a Boolean return value from ensureInternalIngressControllerService,
indicating whether the service exists.
* pkg/operator/controller/ingress/internal_service.go
(ensureInternalIngressControllerService): Add a Boolean return value.  Use
a switch statement, and add a to-do comment for handling updates.
(currentInternalIngressControllerService): Add a Boolean return value.
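The "ensure" pattern described in this commit message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the operator's code: the `store` type, the `ensure` function, and the service name are hypothetical, and an in-memory map stands in for the cluster API. The point is the shape: compare "want" against "have" in a switch statement and return a Boolean saying whether the object exists afterwards.

```go
package main

import "fmt"

// store is a hypothetical in-memory stand-in for the cluster API.
type store map[string]string

// ensure reconciles s[name] with the desired value. It reports whether the
// object exists after reconciliation, along with its current value.
// The error is always nil here; a real client call could fail.
func ensure(s store, name string, want bool, desired string) (bool, string, error) {
	current, have := s[name]
	switch {
	case !want && !have:
		// Nothing wanted, nothing present.
		return false, "", nil
	case !want && have:
		// Present but no longer wanted: delete it.
		delete(s, name)
		return false, "", nil
	case want && !have:
		// Wanted but missing: create it.
		s[name] = desired
		return true, desired, nil
	default:
		// Wanted and present: update only if it differs.
		if current != desired {
			s[name] = desired
		}
		return true, s[name], nil
	}
}

func main() {
	s := store{}
	have, value, err := ensure(s, "example-service", true, "v1")
	fmt.Println(have, value, err)
}
```

Returning the Boolean lets the caller distinguish "the object does not exist" from "an error occurred" without inspecting a nil pointer.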

Add logic to ensureInternalIngressControllerService to update the service
when an update is required.

* pkg/operator/controller/ingress/internal_service.go
(ensureInternalIngressControllerService): Use the new updateInternalService
method to update the service as needed.
(updateInternalService): Check whether the given ClusterIP service needs to
be updated, and update it if so, using the new internalServiceChanged
function.
(managedInternalServiceAnnotations): New variable with the set of
annotation keys for annotations that the operator manages for its internal
router services.
(internalServiceChanged): New function.  Check whether the current internal
service needs to be updated, and update it if it does, using the new
managedInternalServiceAnnotations variable.
* pkg/operator/controller/ingress/internal_service_test.go: New file.
(Test_desiredInternalIngressControllerService): New test.  Verify that
desiredInternalIngressControllerService returns the expected service.
(Test_internalServiceChanged): New test.  Verify that
internalServiceChanged correctly detects changes and performs updates.
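A change-detection function of the kind this commit message describes might look like the sketch below. Everything here is an assumption made for illustration: the `Service` and `Port` types are pared-down stand-ins for the Kubernetes types, `serviceChanged` is a hypothetical name, and the annotation key in `managedAnnotations` is invented. The sketch compares only the fields the operator would manage and returns an updated copy, leaving annotations set by other parties untouched.

```go
package main

import (
	"fmt"
	"reflect"
)

// Port and Service are hypothetical, simplified stand-ins for the
// Kubernetes ServicePort and Service types.
type Port struct {
	Name       string
	Port       int
	TargetPort string
}

type Service struct {
	Annotations map[string]string
	Ports       []Port
}

// managedAnnotations lists annotation keys the operator owns.
// The key below is invented for this example.
var managedAnnotations = []string{"example.io/managed-annotation"}

// serviceChanged reports whether current differs from expected in any
// managed field. If it does, it returns an updated copy of current.
func serviceChanged(current, expected *Service) (bool, *Service) {
	changed := false
	updated := &Service{
		Annotations: map[string]string{},
		Ports:       append([]Port(nil), current.Ports...),
	}
	for k, v := range current.Annotations {
		updated.Annotations[k] = v
	}

	// Only managed annotations are compared; others are preserved as-is.
	for _, key := range managedAnnotations {
		cur, haveCur := current.Annotations[key]
		exp, haveExp := expected.Annotations[key]
		if haveCur == haveExp && cur == exp {
			continue
		}
		changed = true
		if haveExp {
			updated.Annotations[key] = exp
		} else {
			delete(updated.Annotations, key)
		}
	}

	if !reflect.DeepEqual(current.Ports, expected.Ports) {
		changed = true
		updated.Ports = append([]Port(nil), expected.Ports...)
	}

	if !changed {
		return false, nil
	}
	return true, updated
}

func main() {
	current := &Service{Ports: []Port{{Name: "metrics", Port: 1936, TargetPort: "1936"}}}
	expected := &Service{Ports: []Port{{Name: "metrics", Port: 1936, TargetPort: "metrics"}}}
	changed, updated := serviceChanged(current, expected)
	fmt.Println(changed, updated.Ports[0].TargetPort)
}
```

Working on a copy rather than mutating `current` keeps the function easy to test with table-driven cases, which appears to be the approach taken by Test_internalServiceChanged.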

Use the "metrics" port name instead of port number 1936 for the router
internal service's metrics port's target.

Before this commit, the router's internal service's metrics port always
targeted port 1936 on the router pod.  However, the router pod's metrics
port can be customized, and so it is not necessarily port 1936.  As a
consequence, the service's metrics port didn't work when the router pod's
metrics port was customized.  The router pod's metrics port always has the
name "metrics", so this commit changes the service to reference the port by
name to avoid breaking when the port number changes.

This commit fixes OCPBUGS-4573.

https://issues.redhat.com/browse/OCPBUGS-4573

Follow-up to commit af653f9.

* assets/router/service-internal.yaml: Use the "metrics" port name instead
of port 1936 for the metrics port's target.
* pkg/manifests/bindata.go: Regenerate.
* pkg/operator/controller/ingress/internal_service_test.go
(Test_desiredInternalIngressControllerService): Expect the metrics port to
reference the "metrics" target port by name.
(Test_internalServiceChanged): Add a test case for changing the "metrics"
port's target port from an integer to a string.
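To illustrate why targeting the port by name avoids the breakage described above, here is a small model of target-port resolution. It is a sketch under stated assumptions: `ContainerPort` and `resolveTargetPort` are hypothetical names, the target is modeled as a plain string rather than the Kubernetes int-or-string type, and 11936 is an arbitrary example of a customized metrics port. A numeric target is used as-is, whereas a named target is looked up among the pod's named ports.

```go
package main

import (
	"fmt"
	"strconv"
)

// ContainerPort is a hypothetical, simplified model of a named pod port.
type ContainerPort struct {
	Name string
	Port int
}

// resolveTargetPort models how a service's target port maps to a pod port.
// A numeric target is returned unchanged; a name is resolved against the
// pod's named ports.
func resolveTargetPort(target string, podPorts []ContainerPort) (int, error) {
	if n, err := strconv.Atoi(target); err == nil {
		return n, nil
	}
	for _, p := range podPorts {
		if p.Name == target {
			return p.Port, nil
		}
	}
	return 0, fmt.Errorf("no container port named %q", target)
}

func main() {
	// A router pod whose metrics port was customized to 11936 (example value).
	podPorts := []ContainerPort{
		{Name: "http", Port: 80},
		{Name: "https", Port: 443},
		{Name: "metrics", Port: 11936},
	}

	byNumber, _ := resolveTargetPort("1936", podPorts)
	byName, _ := resolveTargetPort("metrics", podPorts)

	// The numeric target still points at 1936, where nothing is listening;
	// the named target follows the customized port.
	fmt.Println(byNumber, byName) // prints "1936 11936"
}
```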
@openshift-ci-robot

@openshift-cherrypick-robot: Jira Issue OCPBUGS-4573 has been cloned as Jira Issue OCPBUGS-12464. Will retitle bug to link to clone.
/retitle [release-4.12] OCPBUGS-12464: Target metrics port by name in internal service

Details

In response to this:

This is an automated cherry-pick of #864

/assign Miciah

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot changed the title from "[release-4.12] OCPBUGS-4573: Target metrics port by name in internal service" to "[release-4.12] OCPBUGS-12464: Target metrics port by name in internal service" Apr 24, 2023
@openshift-ci-robot openshift-ci-robot added the following labels Apr 24, 2023:
  • jira/severity-important (Referenced Jira bug's severity is important for the branch this PR is targeting.)
  • jira/valid-reference (Indicates that this PR references a valid Jira ticket of any type.)
  • jira/valid-bug (Indicates that a referenced Jira bug is valid for the branch this PR is targeting.)
  • bugzilla/valid-bug (Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting.)
@openshift-ci-robot

@openshift-cherrypick-robot: This pull request references Jira Issue OCPBUGS-12464, which is valid. The bug has been moved to the POST state.

6 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.12.z) matches configured target version for branch (4.12.z)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)
  • dependent bug Jira Issue OCPBUGS-4573 is in the state Closed (Done), which is one of the valid states (VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE))
  • dependent Jira Issue OCPBUGS-4573 targets the "4.13.0" version, which is one of the valid target versions: 4.13.0
  • bug has dependents

Requesting review from QA contact:
/cc @lihongan

The bug has been updated to refer to the pull request using the external bug tracker.

@openshift-ci

openshift-ci bot commented Apr 24, 2023

@openshift-cherrypick-robot: No Bugzilla bug is referenced in the title of this pull request.
To reference a bug, add 'Bug XXX:' to the title of this pull request and request another bug refresh with /bugzilla refresh.

@Miciah

Miciah commented Apr 25, 2023

e2e-gcp-ovn-serial failed because the "events should not repeat pathologically" test failed due to missing node annotations; this failure is being tracked as OCPBUGS-10841.
/test e2e-gcp-ovn-serial

e2e-aws-ovn-single-node failed because of multiple test failures:

  • [sig-node] Kubelet when scheduling a busybox command in a pod should print the output to logs
{  fail [k8s.io/kubernetes@v1.25.0/test/e2e/common/node/kubelet.go:79]: Timed out after 60.003s.
Expected
    <string>: time=\"2023-04-24T17:03:56Z\" level=warning msg=\"skipping device /dev/char/10:200 for systemd: stat /sys/dev/char/10:200: no such file or directory\"\nHello World\n
to equal
    <string>: Hello World\n
Ginkgo exit error 1: exit with code 1}
  • [sig-node] Container Runtime blackbox test on terminated container should report termination message from log output if TerminationMessagePolicy FallbackToLogsOnError is set
{  fail [k8s.io/kubernetes@v1.25.0/test/e2e/common/node/runtime.go:167]: Expected     <string>: time=\"2023-04-24T17:07:35Z\" level=warning msg=\"skipping device /dev/char/10:200 for systemd: stat /sys/dev/char/10:200: no such file or directory\"\nDONE to equal     <string>: DONE Ginkgo exit error 1: exit with code 1}
  • [sig-node] Pods should support retrieving logs from the container over websockets
{  fail [github.com/onsi/ginkgo/v2@v2.1.5-0.20220909190140-b488ab12695a/internal/suite.go:612]: Apr 24 17:12:57.845: Unexpected websocket logs:
time="2023-04-24T17:12:55Z" level=warning msg="skipping device /dev/char/10:200 for systemd: stat /sys/dev/char/10:200: no such file or directory"
container is alive

Ginkgo exit error 1: exit with code 1}
  • [sig-node] Kubelet when scheduling a read only busybox container should not write to root filesystem
{  fail [k8s.io/kubernetes@v1.25.0/test/e2e/common/node/kubelet.go:214]: Timed out after 60.002s.
Expected
    <string>: "time="..."
to equal       |
    <string>: "/bin/s..."
Ginkgo exit error 1: exit with code 1}

/test e2e-aws-ovn-single-node

e2e-azure-ovn failed because of the same test failures as e2e-aws-ovn-single-node.
/test e2e-azure-ovn

e2e-aws-ovn failed because of the same test failures as e2e-aws-ovn-single-node.
/test e2e-aws-ovn

@Miciah

Miciah commented Apr 25, 2023

e2e-aws-ovn-single-node failed because the same four tests failed. I found that these tests are failing for many other PRs, so I filed OCPBUGS-12746 to track those failures.
/test e2e-aws-ovn-single-node

e2e-azure-ovn failed with the same four failures.
/test e2e-azure-ovn

e2e-gcp-ovn-serial failed because of OCPBUGS-10841.
/test e2e-gcp-ovn-serial

e2e-aws-ovn failed because AWS had insufficient m6a.xlarge capacity in the requested Availability Zone (us-east-1d):

level=error msg=2023-04-25T18:09:20.239Z [ERROR] provider.terraform-provider-aws: Response contains error diagnostic: tf_resource_type=aws_instance tf_rpc=ApplyResourceChange @caller=/go/src/github.com/openshift/installer/terraform/providers/aws/vendor/github.com/hashicorp/terraform-plugin-go/tfprotov5/internal/diag/diagnostics.go:55 @module=sdk.proto diagnostic_detail= diagnostic_summary="creating EC2 Instance: InsufficientInstanceCapacity: We currently do not have sufficient m6a.xlarge capacity in the Availability Zone you requested (us-east-1d). Our system will be working on provisioning additional capacity. You can currently get m6a.xlarge capacity by not specifying an Availability Zone in your request or choosing us-east-1a, us-east-1b, us-east-1c, us-east-1f.
level=error msg=	status code: 500, request id: e882f978-8ebe-423a-974b-615b4df9ab3f" tf_provider_addr=registry.terraform.io/hashicorp/aws diagnostic_severity=ERROR tf_proto_version=5.3 tf_req_id=5fb778e6-cb3c-4920-0118-0b1c6ed37714 timestamp=2023-04-25T18:09:20.239Z
level=error
level=error msg=Error: creating EC2 Instance: InsufficientInstanceCapacity: We currently do not have sufficient m6a.xlarge capacity in the Availability Zone you requested (us-east-1d). Our system will be working on provisioning additional capacity. You can currently get m6a.xlarge capacity by not specifying an Availability Zone in your request or choosing us-east-1a, us-east-1b, us-east-1c, us-east-1f.
level=error msg=	status code: 500, request id: e882f978-8ebe-423a-974b-615b4df9ab3f 

/test e2e-aws-ovn

This backport is a clean cherry-pick, and it has been approved by PM.
/approve
/lgtm

The changes are fairly well isolated, include good test coverage, and do not appear to pose significant risk.
/label backport-risk-assessed

@openshift-ci openshift-ci bot added the backport-risk-assessed label (Indicates a PR to a release branch has been evaluated and considered safe to accept.) Apr 25, 2023
@openshift-ci openshift-ci bot added the lgtm label (Indicates that a PR is ready to be merged.) Apr 25, 2023
@openshift-ci

openshift-ci bot commented Apr 25, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Miciah

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files.) Apr 25, 2023
@ShudiLi

ShudiLi commented Apr 27, 2023

/label cherry-pick-approved
thanks

@openshift-ci openshift-ci bot added the cherry-pick-approved label (Indicates a cherry-pick PR into a release branch has been approved by the release branch manager.) Apr 27, 2023
@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD 6f3e22f and 2 for PR HEAD 4dc3f0a in total

@Miciah

Miciah commented Apr 28, 2023

e2e-aws-ovn-single-node and e2e-azure-ovn both failed because of the same test failures as before; waiting for OCPBUGS-12746 and OCPBUGS-12688 to be verified.

e2e-gcp-ovn-serial failed because of OCPBUGS-10841, "CI fails on 'events should not repeat pathologically' because of missing node annotations". I have updated the bug report to state that it also affects 4.12 (the original bug report indicated that only 4.14 and 4.13 were affected).
/test e2e-gcp-ovn-serial

@sreber84

/retest

@openshift-ci

openshift-ci bot commented May 15, 2023

@openshift-cherrypick-robot: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-aws-ovn-single-node | 4dc3f0a | link | false | /test e2e-aws-ovn-single-node |

@openshift-merge-robot openshift-merge-robot merged commit 21c38d5 into openshift:release-4.12 May 15, 2023
@openshift-ci-robot

@openshift-cherrypick-robot: Jira Issue OCPBUGS-12464: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-12464 has been moved to the MODIFIED state.

@openshift-merge-robot

Fix included in accepted release 4.12.0-0.nightly-2023-05-15-222942
