Bug 1756920: fluentd pods do not process kubernetes events #1766

Merged
openshift-merge-robot merged 1 commit into openshift:master from richm:bz1756920-eventrouter-support-4x on Oct 16, 2019

richm (Contributor) commented Oct 7, 2019

https://bugzilla.redhat.com/show_bug.cgi?id=1756920
This tries to keep the functionality the same in 4.x as it was
in 3.x. In 3.x, you had to:

  • deploy the eventrouter with a pod name like "logging-eventrouter-*"
  • deploy the eventrouter in the default namespace
  • set TRANSFORM_EVENTS=true
  • set MERGE_JSON_LOG=true (which was the default in 3.x)

In 4.x, the pod name changes to "eventrouter-*" to follow the convention
of our other logging pods. The eventrouter defaults to running
in "openshift-logging" - it doesn't really matter, as long as it is
running in an "infra" namespace. This change also enables
MERGE_JSON_LOG=true for eventrouter pods.
The biggest problem is that you have to set the cluster to unmanaged
in order to set TRANSFORM_EVENTS=true. We could work around that
by making TRANSFORM_EVENTS=true the default value. That would
cause every kubernetes record to be checked to see if it looks like
an event, and to be processed as such if it matches. I'm not sure what
the performance implications would be.
Also ports eventrouter test to 4.x
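
To make the trade-off above concrete, here is a minimal Python sketch (not the fluentd plugin's actual Ruby code) of what checking every kubernetes record would involve: merge the JSON found in the `log` or `MESSAGE` field, then test whether the result looks like an event. The `verb`/`event` shape of the eventrouter payload, the `kubernetes.event` output layout, and all function names are assumptions made for illustration.

```python
import json

# Mirrors the JSON_FIELDS default quoted in this PR's diff ('log,MESSAGE').
JSON_FIELDS = ("log", "MESSAGE")


def merge_json_log(record):
    """Sketch of MERGE_JSON_LOG: merge the first JSON object found in a log field."""
    for field in JSON_FIELDS:
        raw = record.get(field)
        if isinstance(raw, str) and raw.lstrip().startswith("{"):
            try:
                parsed = json.loads(raw)
            except ValueError:
                return record  # not valid JSON, leave the record untouched
            if isinstance(parsed, dict):
                merged = dict(record)
                merged.update(parsed)
                return merged
            return record
    return record


def looks_like_event(record):
    """Assumed heuristic: an event record has a 'verb' and a dict-valued 'event'."""
    return "verb" in record and isinstance(record.get("event"), dict)


def transform_event(record):
    """Sketch of TRANSFORM_EVENTS: move the event under kubernetes.event (hypothetical layout)."""
    if not looks_like_event(record):
        return record  # the common case: an ordinary container log line
    out = {k: v for k, v in record.items() if k not in ("event", "verb")}
    event = dict(record["event"])
    event["verb"] = record["verb"]
    kubernetes = dict(out.get("kubernetes", {}))
    kubernetes["event"] = event
    out["kubernetes"] = kubernetes
    if "message" in event:
        out["message"] = event["message"]
    return out
```

In this sketch the per-record cost for non-event logs is one key lookup and one type check after the merge step, which is the part that would run for every kubernetes record if the transform were on by default. It says nothing about the real plugin's cost, which would need to be measured.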

openshift-ci-robot commented

@richm: An error was encountered searching for bug 1756920 on the Bugzilla server at https://bugzilla.redhat.com:

did not get one bug, but 0: {[]}
Please contact an administrator to resolve this issue, then request a bug refresh with /bugzilla refresh.

openshift-ci-robot added the size/L label Oct 7, 2019
openshift-ci-robot added the approved label Oct 7, 2019
json_fields "#{ENV['JSON_FIELDS'] || 'log,MESSAGE'}"
</filter>

<filter kubernetes.var.log.containers.eventrouter-** kubernetes.var.log.containers.cluster-logging-eventrouter-**>
richm (Contributor Author) commented

The cluster-logging-eventrouter pattern is there to work around a "bug" in the current documentation (https://docs.openshift.com/container-platform/4.1/logging/efk-logging-eventrouter.html); the name really should be eventrouter.
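
As a rough illustration of which container log tags the filter above would select, the following Python sketch treats a trailing `**` as a plain prefix match. That is a simplification of fluentd's tag glob rules and should be read as an assumption; the example tags and the function name are hypothetical.

```python
# The two tag patterns from the <filter> line in the diff above.
PATTERNS = (
    "kubernetes.var.log.containers.eventrouter-**",
    "kubernetes.var.log.containers.cluster-logging-eventrouter-**",
)


def matches_eventrouter_filter(tag, patterns=PATTERNS):
    """Return True if the tag matches any pattern (trailing '**' approximated as a prefix)."""
    for pattern in patterns:
        if pattern.endswith("**"):
            if tag.startswith(pattern[:-2]):
                return True
        elif tag == pattern:
            return True
    return False


# Hypothetical tags built from pod names in the styles discussed in this PR.
new_style = "kubernetes.var.log.containers.eventrouter-5d8f_openshift-logging_eventrouter-abc.log"
doc_style = "kubernetes.var.log.containers.cluster-logging-eventrouter-1-x_default_eventrouter-abc.log"
old_style = "kubernetes.var.log.containers.logging-eventrouter-1-x_default_eventrouter-abc.log"

print(matches_eventrouter_filter(new_style))
print(matches_eventrouter_filter(doc_style))
print(matches_eventrouter_filter(old_style))
```

Under this approximation the 4.x name and the name from the 4.1 documentation both match, while a 3.x-style "logging-eventrouter-*" pod does not.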

richm force-pushed the bz1756920-eventrouter-support-4x branch 2 times, most recently from 3dbf294 to ebfc2fd on October 7, 2019 22:25
jcantrill (Contributor) commented

I wonder if we should wait until openshift/cluster-logging-operator#231 merges or, if it's urgent enough, use this PR against a 4.2 release branch.

richm (Contributor Author) commented Oct 8, 2019

/retest

nhosoi (Contributor) commented Oct 8, 2019

/lgtm

openshift-ci-robot added the lgtm label Oct 8, 2019
richm added the area/documentation, component/fluentd, kind/bug, and release/4.3 labels Oct 8, 2019
nhosoi (Contributor) commented Oct 8, 2019

/retest

nhosoi (Contributor) commented Oct 8, 2019

A version error?

+ cp -r /tmp/artifacts/eo/manifests/4.2 /tmp/tmp.60j5j3akEd
cp: cannot stat '/tmp/artifacts/eo/manifests/4.2': No such file or directory

Maybe most of the 4.2 references are okay, except the last one?

build-images.sh:      if [ -n "${GOPATH:-}" -a -f ${GOPATH:-}/src/github.com/openshift/release/ci-operator/infra/openshift/release-controller/repos/ocp-4.2-default.repo ]; then
build-images.sh:        cp ${GOPATH:-}/src/github.com/openshift/release/ci-operator/infra/openshift/release-controller/repos/ocp-4.2-default.repo repos
build-images.sh:        cp release/ci-operator/infra/openshift/release-controller/repos/ocp-4.2-default.repo repos
build-images.sh:  elif [ ! -f $INTERNAL_REPO_DIR/ops-mirror.pem ] || [ ! -f $INTERNAL_REPO_DIR/ocp-4.2-default.repo -a ! -f $INTERNAL_REPO_DIR/ocp-4.1-default.repo -a ! -f $INTERNAL_REPO_DIR/ocp-4.0-default.repo ] ; then
build-images.sh:    echo ERROR: $INTERNAL_REPO_DIR missing one of ops-mirror.pem or ocp-4.2-default.repo and ocp-4.1-default.repo and ocp-4.0-default.repo
deploy-logging.sh:MASTER_VERSION=${MASTER_VERSION:-4.2}

richm (Contributor Author) commented Oct 8, 2019

> A version error?

Nothing is going to pass until #1765 merges, which is dependent on openshift/elasticsearch-operator#191

openshift-bot commented

/retest

Please review the full test history for this PR and help us cut down flakes.

richm added the do-not-merge/hold label Oct 8, 2019
richm force-pushed the bz1756920-eventrouter-support-4x branch from a28e9b8 to 25037a7 on October 9, 2019 15:19
openshift-ci-robot removed the lgtm label Oct 9, 2019
richm removed the do-not-merge/hold label Oct 9, 2019
richm (Contributor Author) commented Oct 9, 2019

please re-review

richm force-pushed the bz1756920-eventrouter-support-4x branch from 25037a7 to 7f6cdb4 on October 9, 2019 17:43
richm (Contributor Author) commented Oct 14, 2019

/retest

richm force-pushed the bz1756920-eventrouter-support-4x branch 2 times, most recently from ce45207 to 0a87591 on October 14, 2019 23:36
richm (Contributor Author) commented Oct 15, 2019

now passing - tests fixed - please review

richm (Contributor Author) commented Oct 15, 2019

/retest

richm (Contributor Author) commented Oct 15, 2019

problems in CI framework - not sure if related to 3.11 problems
/retest

https://bugzilla.redhat.com/show_bug.cgi?id=1756920
This tries to keep the functionality the same in 4.x as it was
in 3.x.  In 3.x, you had to:

- deploy the eventrouter with a pod name like "logging-eventrouter-*"
- deploy the eventrouter in the default namespace
- set `TRANSFORM_EVENTS=true`
- set `MERGE_JSON_LOG=true` (which was the default in 3.x)

In 4.x, the pod name changes to "eventrouter-*" to follow the convention
of our other logging pods.  The eventrouter defaults to running
in "openshift-logging" - it doesn't really matter, as long as it is
running in an "infra" namespace.  This change also enables
`MERGE_JSON_LOG=true` for eventrouter pods.
The biggest problem is that you have to set the cluster to unmanaged
in order to set `TRANSFORM_EVENTS=true`.  We could work around that
by making `TRANSFORM_EVENTS=true` the default value.  That would
cause every kubernetes record to be checked to see if it looks like
an event, and to be processed as such if it matches.  I'm not sure what
the performance implications would be.
Also ports eventrouter test to 4.x
Also adds a test for eventrouter Info type support

update eventrouter to pick up fix for Bug 1701495
richm force-pushed the bz1756920-eventrouter-support-4x branch from 0a87591 to c017949 on October 15, 2019 15:28
richm (Contributor Author) commented Oct 15, 2019

flake - vpc limit reached
/retest

richm (Contributor Author) commented Oct 15, 2019

multi-tenancy flake
/retest

richm (Contributor Author) commented Oct 15, 2019

test-zzz-correct-index-names flake? ErrImagePull?
/retest

richm (Contributor Author) commented Oct 15, 2019

ok - finally passed, no flakes - please review

nhosoi (Contributor) commented Oct 16, 2019

/lgtm

openshift-ci-robot added the lgtm label Oct 16, 2019
openshift-ci-robot commented

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: nhosoi, richm

openshift-merge-robot merged commit b54a318 into openshift:master Oct 16, 2019
openshift-ci-robot commented

@richm: An error was encountered searching for bug 1756920 on the Bugzilla server at https://bugzilla.redhat.com:

did not get one bug, but 0: {[]}
Please contact an administrator to resolve this issue, then request a bug refresh with /bugzilla refresh.

richm deleted the bz1756920-eventrouter-support-4x branch October 16, 2019 13:25
richm (Contributor Author) commented Oct 16, 2019

/cherrypick release-4.2

openshift-cherrypick-robot commented

@richm: new pull request created: #1768

Labels: approved, area/documentation, component/fluentd, kind/bug, lgtm, release/4.3, size/L