Add Performance Profile Controller must-gather #341
marioferh wants to merge 4 commits into openshift:master
…penshift-4.12-ose-must-gather Updating ose-must-gather images to be consistent with ART
Signed-off-by: Mario Fernandez <mariofer@redhat.com>
Signed-off-by: Mario Fernandez <mariofer@redhat.com>
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull request has been approved by: marioferh.
f99984e to
cdb4c64
sferich888 left a comment:
This looks great; however there is quite a bit I would like to change or have you consider.
    metadata:
      name: perf-node-gather-pods-reader
    subjects:
    - kind: ServiceAccount
Isn't this something we provide in its own file?
I don't understand the comment :S
      namespace: perf-node-gather
      apiGroup: ""
    roleRef:
      kind: ClusterRole
Isn't this something we are providing in its own file?
    sed -i -e "s#MUST_GATHER_IMAGE#$MUST_GATHER_IMAGE#" $DAEMONSET_MANIFEST

    oc create -f $NAMESPACE_MANIFEST
Can we simplify all of the oc create commands by using oc apply?
We need to do it with oc create to apply policies after the SA creation:
oc adm policy add-scc-to-user privileged -n $NAMESPACE -z perf-node-gather
    sed -i -e "s#MUST_GATHER_IMAGE#$MUST_GATHER_IMAGE#" $DAEMONSET_MANIFEST

    oc create -f $NAMESPACE_MANIFEST
Can we reuse the existing must-gather namespace instead of creating a new namespace?
If all the collection artifacts live in the 'must-gather' namespace, a cleanup operation runs when the oc command that invokes must-gather finishes (removing all of the pods and other artifacts that we create).
By spawning new namespaces, we could leave lingering artifacts behind.
I am facing this issue if I try to reuse same namespace:
Warning FailedCreate 33s (x15 over 115s) daemonset-controller Error creating: pods "perf-node-gather-daemonset-" is forbidden: autoscaling.openshift.io/ManagementCPUsOverride the pod namespace "openshift-must-gather-f2s49" does not allow the workload type management
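If a separate namespace does turn out to be unavoidable because of the error above, one way to reduce the risk of lingering artifacts is to register cleanup before anything is created. The following is only a sketch under assumptions: `cleanup_perf_node_gather` and `install_cleanup_trap` are hypothetical helper names, and the manifest variables simply mirror the ones used in this PR's script. It does not address the reviewers' preference for reusing the must-gather namespace.

```shell
# Sketch only: helper names are hypothetical; NAMESPACE and the *_MANIFEST
# variables are assumed to be set as in the PR's gather script.

cleanup_perf_node_gather() {
    # --ignore-not-found keeps cleanup idempotent if creation failed part-way.
    oc delete -f "$DAEMONSET_MANIFEST" --ignore-not-found
    # Cluster-scoped objects are not removed with the namespace, so delete
    # them explicitly.
    oc delete -f "$CLUSTER_ROLE_BINDING_MANIFEST" --ignore-not-found
    oc delete -f "$CLUSTER_ROLE_MANIFEST" --ignore-not-found
    # Deleting the namespace removes the remaining namespaced objects.
    oc delete namespace "$NAMESPACE" --ignore-not-found
}

install_cleanup_trap() {
    # Run cleanup whenever the calling shell exits, on success or failure.
    trap cleanup_perf_node_gather EXIT
}
```

Calling `install_cleanup_trap` before the first `oc create` would make the cleanup run even when a later step fails.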
    done
    wait "${ADM_PIDS[@]}"

    oc delete -f $DAEMONSET_MANIFEST
I don't think we need to do this if we rely on the invoking oc command to clean things up for us.
cdb4c64 to
e69b638
marioferh left a comment:
The new change builds the sysinfo binary inside this repo, so it no longer depends on an external repo as before.
7f46e5b to
9533b74
      name: release
      namespace: openshift
    - tag: rhel-8-release-golang-1.18-openshift-4.12
    + tag: rhel-8-release-golang-1.19-openshift-4.12
rhel-8-release-golang-1.19-openshift-4.13 would be the correct one here and everywhere else too.
Dockerfile.rhel7
Outdated
    - FROM registry.ci.openshift.org/ocp/builder:rhel-8-golang-1.18-openshift-4.12 AS builder
    + FROM registry.ci.openshift.org/ocp/builder:rhel-8-golang-1.19-openshift-4.12 AS builder

      RUN dnf install -y git make go
Why are you doing this? All of those tools are already part of the builder image.
Dockerfile.rhel7
Outdated
    COPY ${NODE_GATHER_MANIFESTS_DIR} /etc/performance-profile-node-gather

    # the binary follows the must-gather convention of helper tools called gather_$SOMETHING
    COPY build/_output/bin/gather-sysinfo /usr/bin/gather_sysinfo
I don't quite understand the lines above, what are they doing?
After generating the binary with make, we need to copy it into the image.
Makefile
Outdated
    dist-gather-sysinfo: build-output-dir
        @if [ ! -x $(TOOLS_BIN_DIR)/gather-sysinfo ]; then\
            echo "Building gather-sysinfo helper";\
            env CGO_ENABLED=0 GOOS=$(TARGET_GOOS) GOARCH=$(TARGET_GOARCH) go build -ldflags="-s -w" -mod=vendor -o $(TOOLS_BIN_DIR)/gather-sysinfo ./pkg/tools/gather-sysinfo/;\
Why are you building it? From the email thread it looked like you already have a package ready that you just install.
I do not think we have ever said that. We are currently building it as part of the separate must gather image we ship, but there never was an RPM as far as I know.
    read desired ready <<< $line
    IFS=$'\n'

    if [[ "$desired" != "0" ]] && [[ "$ready" == "$desired" ]]
I second @sferich888 here, it's more reasonable to have some wait period before failing.
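The wait period suggested here could be sketched as a polling loop with a deadline. This is a minimal illustration, not the code from the PR: `wait_for_daemonset_ready` is a hypothetical helper, and the default timeout (300s) and poll interval (5s) are assumed values. It keeps the same readiness condition as the snippet above (some pods desired, all of them ready).

```shell
# Sketch only: wait_for_daemonset_ready is a hypothetical helper; the default
# timeout (300s) and poll interval (5s) are assumptions, not values from the PR.
wait_for_daemonset_ready() {
    local namespace="$1" name="$2" timeout="${3:-300}" interval="${4:-5}"
    local deadline status desired ready
    deadline=$(( $(date +%s) + timeout ))

    while :; do
        # Fetch "desired ready" from the DaemonSet status in a single call.
        status=$(oc get daemonset "$name" -n "$namespace" \
            -o jsonpath='{.status.desiredNumberScheduled} {.status.numberReady}') || status=""
        desired=${status%% *}
        ready=${status##* }

        # Same condition as the PR: some pods are desired and all are ready.
        if [ -n "$desired" ] && [ "$desired" != "0" ] && [ "$ready" = "$desired" ]; then
            return 0
        fi

        # Give up once the deadline has passed instead of failing immediately.
        if [ "$(date +%s)" -ge "$deadline" ]; then
            echo "DaemonSet $name not ready after ${timeout}s (status: '${status}')" >&2
            return 1
        fi
        sleep "$interval"
    done
}
```

A call such as `wait_for_daemonset_ready "$NAMESPACE" perf-node-gather-daemonset 300 5 || exit 1` would then replace the single readiness check.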
    oc adm policy add-scc-to-user privileged -n $NAMESPACE -z perf-node-gather
    oc create -f $CLUSTER_ROLE_MANIFEST
    oc create -f $CLUSTER_ROLE_BINDING_MANIFEST
    oc create -f $DAEMONSET_MANIFEST
You can't create any of these, b/c must-gather CLI won't be able to clean it up. You should rather re-use the namespace and the privileged SA created for must-gather pod.
    go 1.17

    require github.com/openshift/build-machinery-go v0.0.0-20210423112049-9415d7ebd33e
    require (
As mentioned before, I don't think we want to handle it here; it should be built outside of this repo and only installed here.
That would require an RPM in the package repos. The tool is specific to the must-gather data collection, though. We also want the tool to always be in sync with the MG and OCP versions. Building it here is actually much easier and more reliable.
Based on what I was told, it is already built as an RPM, so that shouldn't be a problem.
9533b74 to
6cbcd89
@marioferh: The following tests failed.
openshift/oc#1316: oc PR to add required annotations to must-gather annotations
Issues go stale after 90d of inactivity.
/lifecycle stale
PR needs rebase.
/close
@sferich888: Closed this PR.
Description
Launch Performance Profile Controller data collecting in OCP must-gather
Goal
Remove a standalone PAO must-gather image and make it part of the core OCP must-gather.
Before, the client ran must-gather together with the PPC must-gather image:
oc adm must-gather \
  --image-stream=openshift/must-gather \
  --image=[...]/performance-addon-operator-must-gather-rhel8:$VERSION
Now:
oc adm must-gather --image=openshift/must-gather:$VERSION /usr/bin/gather_performance_profile_controller
Type of change
Needed changes
Change the temporary ARG REPO=github.com/marioferh/gather-sysinfo to the actual repo before merge.
PAO (Performance Addon Operator) is now PPC (Performance Profile Controller) after this change: openshift/cluster-node-tuning-operator#322