describe e2e observer pods #425
Conversation
stevekuznetsov left a comment
Do we expect some or any of the following:
- multiple observers per cluster
- multiple instances of one observer against a single test cluster
- the container image running the observer pod to be built from the repository under test
> ### Changes to CI
> This is a sketch of a possible path.
> 1. Allow a non-dptp-developer to produce a pod template that mounts a `secret/<job>-kubeconfig` that will later contain a kubeconfig.
Why does it need to be a Pod template? Our current interface for developers looks like a (container image, commands) tuple. Is that sufficient? What do we need users to be able to configure?
> Why does it need to be a Pod template? Our current interface for developers looks like a (container image, commands) tuple. Is that sufficient? What do we need users to be able to configure?
I'm not sure. But I do know that podtemplates are standard kube structs, with their existing documentation and clear semantics. Resource requirements come to mind. Env vars probably.
I think it's actually easier for me to provide a podtemplate since my development is likely to be done using a pod.
Most if not all of those points of variance are already handled by existing types in the CI system. I don't want to expose e.g. CAP_SYS_ADMIN or host mount ...
This is important. s/pod template/step reference/g
> ### Changes to CI
> This is a sketch of a possible path.
> 1. Allow a non-dptp-developer to produce a pod template that mounts a `secret/<job>-kubeconfig` that will later contain a kubeconfig.
> 2. When the CI pod starts today (the one with setup containers and stuff), also create an instance of each pod template,
This is too specific and not useful. It is not important for the parallel observer process to start at the time that the job starts.
Is the important requirement here that the process be running and ready to notice $KUBECONFIG as soon as it is present?
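For illustration, a minimal sketch of what such an observer entrypoint could look like, assuming the kubeconfig eventually shows up at a mounted path; the path, the `ARTIFACT_DIR` variable, and the `resourcewatcher` invocation are hypothetical, not an agreed interface:

```bash
#!/bin/bash
# Hypothetical observer entrypoint: start before the cluster exists, then wait
# for the kubeconfig the CI flow will eventually provide.
set -euo pipefail

KUBECONFIG_PATH="${KUBECONFIG_PATH:-/var/run/secrets/ci/kubeconfig}"  # assumed secret mount

# The kubeconfig may appear minutes later, or never (failed install), so poll
# instead of failing fast.
until [[ -s "${KUBECONFIG_PATH}" ]]; do
  sleep 10
done

export KUBECONFIG="${KUBECONFIG_PATH}"
# Hand off to the actual observer binary; the flag name is illustrative.
exec /usr/bin/resourcewatcher --artifact-dir="${ARTIFACT_DIR:-/tmp/artifacts}"
```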
> let's say `pod/<job>-resourcewatcher` and an empty `secret/<job>-kubeconfig`.
> 3. As soon as a kubeconfig is available (these go into a known location in setup container today), write that
> `kubeconfig` into every `secret/<job>-kubeconfig` (you probably want to label them).
> 4. When it is time for collection, the existing pod (I think it's teardown container), writes a new data entry at `.data['teardown']` into every
What is important about this mechanism?
> What is important about this mechanism?
It's a simple notification that meets the minimum bar of "easy to provide and easy to observe in an e2e-observer pod". I don't really care how it's done, but this way is easy.
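Purely for illustration, the kind of "easy to provide, easy to observe" signal being described could be as small as this; the job name, the `teardown` key, and the mount path are assumptions taken from the sketch above, not an agreed interface:

```bash
# Teardown side: record a timestamp under .data['teardown'] in the observer's
# secret. Secret data values must be base64-encoded.
JOB="my-job"   # stands in for the real job name
oc patch secret "${JOB}-kubeconfig" --type merge \
  -p "{\"data\":{\"teardown\":\"$(date -u +%Y-%m-%dT%H:%M:%SZ | base64 -w0)\"}}"

# Observer side: the kubelet refreshes mounted secret volumes, so the new key
# eventually appears as a file; poll for it, then start final collection.
until [[ -f /var/run/secrets/ci/teardown ]]; do
  sleep 15
done
```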
Why do the monitors need to know the teardown time? Won't it be enough for them to record "cluster seems to have disappeared" and rely on test-suite failures to get folks to look at jobs where the cluster exploded before teardown?
> Why do the monitors need to know the teardown time? Won't it be enough for them to record "cluster seems to have disappeared" and rely on test-suite failures to get folks to look at jobs where the cluster exploded before teardown?
Collection can be expensive. You may want to defer it until the end. Imagine using this for must-gather so we can actually get it on all CI runs. Amazingly, we have failed to achieve even that collection in a consistent way so far.
> Imagine using this for must-gather so we can actually get it on all CI runs.
When clusters die hard enough for must-gather to fail, I doubt ci-operator will be able to reliably guess the right window in which to trigger the toppling-cluster must-gather. I think log-streaming and other continuous monitoring are probably the best we can deliver in those cases.
This is outstanding -- if it's important that we signal. I'd prefer sending SIGTERM rather than messing with files.
Suggest:
4. When the test workflow enters the post-step phase, SIGTERM is sent to each observer process.
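A rough sketch of the observer side of that contract, assuming ci-operator sends SIGTERM at the start of the post-step phase and afterwards gathers whatever is in an artifacts directory; the paths, the grace-period behavior, and the use of `oc get clusteroperators` are illustrative only:

```bash
#!/bin/bash
set -euo pipefail
ARTIFACTS="${ARTIFACT_DIR:-/tmp/artifacts}"   # assumed artifacts mount

finish() {
  # Any final, potentially expensive collection must fit inside whatever grace
  # period the CI system allows after SIGTERM.
  oc get clusteroperators -o yaml > "${ARTIFACTS}/clusteroperators-final.yaml" || true
  exit 0
}
trap finish TERM

# Record intermediate operator status until told to stop.
while true; do
  oc get clusteroperators -o yaml > "${ARTIFACTS}/clusteroperators-$(date +%s).yaml" || true
  # Sleep in the background and wait on it so the TERM trap fires promptly.
  sleep 30 &
  wait "$!"
done
```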
> 4. When it is time for collection, the existing pod (I think it's teardown container), writes a new data entry at `.data['teardown']` into every
> `secret/<job>-kubeconfig`.
> The value should be a timestamp.
> 5. Ten minutes after `.data['teardown']` is written, the teardown container rsyncs a well known directory,
Why is this necessary? ci-operator has oversight on the control flow and handles cancellation and failure gracefully, with timeouts and grace periods as necessary.
What do you need from the feature?
- A signal sent to the observer when the teardown phase of a job begins
- A (configurable) grace period, starting at that signal, after which artifacts are gathered
> - A (configurable) grace period, starting at that signal, after which artifacts are gathered
I wouldn't let someone configure it.
Why is this not just "ci-operator TERMs the monitor containers once the workflow wraps up entirely, or after the test steps complete (which would also work as an 'I'll reap this cluster soon, you can stop watching' signal)", or some such? And then it uploads anything from the monitor container's mounted artifacts volume (just like the existing step logic).
Suggest:
5. Observer pods have their artifacts gathered after a grace period, or after the process exits, whichever comes first.
> are placed in some reasonable spot.
> This could be optimized with a file write in the other direction, but the naive approach doesn't require it.
> 6. All `pod/<job>-<e2e-observer-name>` are deleted.
This is not germane to this proposal
> This is not germane to this proposal
It's very germane to the example and concrete bits. I'm happy to not use this sketch if you wish to provide a different one, but for a concrete example of how this could be built, this is pretty straightforward.
This does not fit into the general model of lifecycle in the CI Operator, and it's not germane in that it does not matter to a functional and correct implementation of an observer Pod.
Outstanding, please remove.
I will clarify in the doc, but some answers here:
- yes
- no
- I think no, but I'm open to it if you think it makes your life easier.
> 2. Your pod must be able to tolerate a kubeconfig that doesn't work. The kubeconfig may point to a cluster that never comes up.
> Or that hasn't come up yet. Your pod must not suck on it.
> 3. If your pod fails, it will not fail the e2e job. If you need to fail an e2e job reliably, you need something else.
> 4. Your pod must terminate when asked.
I think the most important requirement, which you mention above but should probably repeat here, is non-goal 2: that observer containers must not use the admin kubeconfig they receive to reach into the cluster under test and break things. This should be a read-only monitor, or at most, a "be very careful to only touch safe things" monitor.
The remaining conditions (termination, etc.) can be enforced by ci-operator, so monitors that don't handle them gracefully aren't that big a deal.
> ### Modify the openshift-tests command
> This is easy for some developers, BUT as e2e monitor shows us, it doesn't provide enough information.
> We need information from before the test command is run to find many of our problems.
> Missing logs and missing intermediate resource states (intermediate operator status) are the most egregious so far.
The vanilla e2e suite already records intermediate operator status in the `Monitor cluster while tests execute` test case. It does not seem like that big a reach to me to ask the installer to also monitor and log operator status, instead of its current "just collect this at the end if the install failed" behavior. And that would also be useful information to have in the log for installs that fail in the wild, which will not benefit from CI-specific monitoring suites.
Log preservation is unlikely to be covered by either the installer or the `Monitor cluster while tests execute` test case. I think that's a valid use-case for this enhancement. But being able to configure all clusters, CI and otherwise, to be able to stream logs to an external logging aggregator would be nice too, and would cover both CI and real-world use-cases.
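As a naive illustration of continuous log capture (this is not the loki-based tooling referenced elsewhere in this PR, and it only catches pods that exist when it starts, which is exactly the gap a real aggregator closes):

```bash
#!/bin/bash
# Follow pod logs into the artifacts directory so they survive even if the
# cluster later becomes unreachable.
set -uo pipefail
ARTIFACTS="${ARTIFACT_DIR:-/tmp/artifacts}"   # assumed artifacts mount

for ns in $(oc get namespaces -o name | sed 's|^namespace/||' | grep '^openshift-'); do
  mkdir -p "${ARTIFACTS}/logs/${ns}"
  for pod in $(oc -n "${ns}" get pods -o name | sed 's|^pod/||'); do
    # One background stream per pod; --all-containers follows every container.
    oc -n "${ns}" logs -f --all-containers=true "${pod}" \
      > "${ARTIFACTS}/logs/${ns}/${pod}.log" 2>&1 &
  done
done
wait
```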
> 2. allow a cluster observer to impact the running test. This is an observer author failure.
> 3. provide ANY dimension specific data. If an observer needs this, they need to integrate differently.
> 4. allow multiple instances of a single cluster observer to run against one CI cluster
> 5. allow providing the cluster observer from a repo being built. The images are provided separately.
This is trivial to allow; we can either remove it or move it to goals.
(like, the mechanism we use to ensure that the Pod can only start when the image is imported is identical to that which is used if we're waiting for it to be built)
> ## Proposal
> The overall goal:
> 1. have an e2e-observer process is expected to be running before a kubeconfig exists
nit: s/is/that is/
> 2. Before a kubeconfig is present, create an instance of each binary from #1 and an empty `secret/<job>-kubeconfig`.
> 3. As soon as a kubeconfig is available (these go into a known location in setup container today), write that
> `kubeconfig` into every `secret/<job>-kubeconfig` (you probably want to label them).
> 4. When it is time for collection, the existing pod (I think it's teardown container), issues a sig-term to the process or perhaps
we spoke about relaxing this to SIGTERM
stevekuznetsov left a comment
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: deads2k, stevekuznetsov
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
These pods will run alongside our existing CI e2e pods (this avoids risk to job state) and provide a way to contribute test-, platform-, configuration-, and version-independent observability to our CI runs.
The interaction with the existing e2e flow is designed to be minimal, to reduce risk and effort.
This (or something like it) will allow collecting data the TRT has determined is missing on upgrade and install failures. Most notably, logs for every pod that ever existed (loki) and a recording of all intermediate state for operator status (resourcewatcher). Both tools have been built, but we've been unable to get them enabled in the right place for every test, platform, configuration, and version tuple.
/assign @stevekuznetsov
/cc @mfojtik @sttts @soltysh