Skip to content

Conversation

@nhosoi
Copy link
Contributor

@nhosoi nhosoi commented Aug 16, 2019

In the Cluster Logging instance deletion, resources under Cluster Logging
were deleted independently, which could leave a short time period when the
fluentd or rsyslog daemonset was still running but its logcollector service
account was removed. That made the logs processed in the time miss kubernetes
information including the namespace name.

In this patch, the following changes are made:

  • Setting foregroundDeletion finalizer to collector service account
  • Making the ownerReference of fluentd and rsyslog daemonset the collector service account
  • Setting blockOwnerDeletion to true in the ownerReference
  • The foregroundDeletion informs whether the collector service account is being deleted
    or not via reconciler. If it is being deleted, letting the collector service account
    deletion wait until the other objects are all removed, then, delete the finalizer and
    delete the service account.

@nhosoi nhosoi added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 16, 2019
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 16, 2019
@openshift-ci-robot
Copy link

@nhosoi: This pull request references a valid Bugzilla bug. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to this:

Bug 1715370: Some orphaned logs sent to .operations.* index by mistake.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Aug 16, 2019
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The UID in the logcollector service account is set when it is created. So, I store it to use it in ownerReference later here. If there is an api to retrieve it dynamically, we don't have to store the UID this way... But so far I could not find it.

sample logcollector service account

apiVersion: v1
imagePullSecrets:
- name: logcollector-dockercfg-rgwsp
kind: ServiceAccount
metadata:
  creationTimestamp: "2019-08-15T22:12:46Z"
  finalizers:
  - foregroundDeletion
  name: logcollector
  namespace: openshift-logging
  ownerReferences:
  - apiVersion: logging.openshift.io/v1
    controller: true
    kind: ClusterLogging
    name: instance
    uid: cec9ff31-bfa9-11e9-b67d-029422b5c914
  resourceVersion: "468364"
  selfLink: /api/v1/namespaces/openshift-logging/serviceaccounts/logcollector
  uid: cfa30bf4-bfa9-11e9-a7b2-0a1781cc6f5c

@nhosoi
Copy link
Contributor Author

nhosoi commented Aug 16, 2019

/test e2e-aws

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

     if delfinalizer {

it's already a bool

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Fixed.

Copy link
Contributor

@richm richm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

other than a minor nit I don't see any obvious errors
have you tested this with the origin-aggregated-logging CI tests?

In the Cluster Logging instance deletion, resources under Cluster Logging
were deleted independently, which could leave a short time period when the
fluentd or rsyslog daemonset was still running but its logcollector service
account was removed.  That made the logs processed in the time miss kubernetes
information including the namespace name.

In this patch, the following changes are made:
- Setting foregroundDeletion finalizer to collector service account
- Making the ownerReference of fluentd and rsyslog daemonset the collector service account
- Setting blockOwnerDeletion to true in the ownerReference
- The foregroundDeletion informs whether the collector service account is being deleted
  or not via reconciler. If it is being deleted, letting the collector service account
  deletion wait until the other objects are all removed, then, delete the finalizer and
  delete the service account.
@nhosoi
Copy link
Contributor Author

nhosoi commented Aug 19, 2019

have you tested this with the origin-aggregated-logging CI tests?

Yes! (^o^)/

+ echo finished hack/test-logging.sh at 2019-08-18 19:22:53-07:00
finished hack/test-logging.sh at 2019-08-18 19:22:53-07:00
+ '[' PASS = PASS ']'

- configmaps
- secrets
- serviceaccounts
- serviceaccounts/finalizers
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jcantrill @ewolinetz do we need to add this rbac anywhere else? Is this CSV the one used by customers?

@richm
Copy link
Contributor

richm commented Aug 20, 2019

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Aug 20, 2019
@openshift-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: nhosoi, richm

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit b8c17f2 into openshift:master Aug 20, 2019
@openshift-ci-robot
Copy link

@nhosoi: All pull requests linked via external trackers have merged. Bugzilla bug 1715370 has been moved to the MODIFIED state.

Details

In response to this:

Bug 1715370: Some orphaned logs sent to .operations.* index by mistake.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

pmoogi-redhat pushed a commit to pmoogi-redhat/cluster-logging-operator that referenced this pull request Apr 26, 2022
Bug 1715370: Some orphaned logs sent to .operations.* index by mistake.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants