
Conversation

@deads2k (Contributor) commented Aug 7, 2020

These pods will run alongside our existing CI e2e pods (this avoids risk to job state) and provide a way to contribute test, platform, configuration, and version independent observability to our CI runs.

The interaction with the existing e2e flow is designed to be minimal, to reduce risk and effort.

This (or something like it) will allow collecting data the TRT has determined is missing on upgrade and install failures. Most notably, logs for every pod that ever existed (loki) and recording for all intermediate state for operator status (resourcewatcher). Both tools have been built, but we've been unable to get them enabled in the right place for every test, platform, configuration, and version tuple.

/assign @stevekuznetsov
/cc @mfojtik @sttts @soltysh

@openshift-ci-robot added the `approved` label (Indicates a PR has been approved by an approver from all required OWNERS files.) Aug 7, 2020
@stevekuznetsov left a comment

Do we expect some or any of the following:

  • multiple observers per cluster
  • multiple instances of one observer against a single test cluster
  • the container image running the observer pod to be built from the repository under test


### Changes to CI
This is a sketch of a possible path.
1. Allow a non-dptp-developer to produce a pod template that mounts a `secret/<job>-kubeconfig` that will later contain a kubeconfig.


Why does it need to be a Pod template? Our current interface for developers looks like a (container image, commands) tuple. Is that sufficient? What do we need users to be able to configure?

Contributor Author

> Why does it need to be a Pod template? Our current interface for developers looks like a (container image, commands) tuple. Is that sufficient? What do we need users to be able to configure?

I'm not sure. But I do know that podtemplates are standard kube structs, with their existing documentation and clear semantics. Resource requirements come to mind. Env vars probably.

I think it's actually easier for me to provide a podtemplate since my development is likely to be done using a pod.
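As a rough sketch of what that could look like (the image, label, and mount path below are placeholders, not values defined anywhere in this proposal), the standard `corev1` types already carry everything mentioned so far:

```go
// Hypothetical observer pod template built from stock Kubernetes API structs.
// Nothing here is prescribed by the proposal; it only shows that resource
// requirements, env vars, and the kubeconfig secret mount come for free.
package main

import (
	"encoding/json"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func observerPodTemplate(job string) corev1.PodTemplateSpec {
	return corev1.PodTemplateSpec{
		ObjectMeta: metav1.ObjectMeta{
			Labels: map[string]string{"ci-observer": "resourcewatcher"}, // placeholder label
		},
		Spec: corev1.PodSpec{
			RestartPolicy: corev1.RestartPolicyNever,
			Volumes: []corev1.Volume{{
				Name: "kubeconfig",
				VolumeSource: corev1.VolumeSource{
					// The secret starts out empty and is filled in by CI later.
					Secret: &corev1.SecretVolumeSource{SecretName: job + "-kubeconfig"},
				},
			}},
			Containers: []corev1.Container{{
				Name:         "observer",
				Image:        "quay.io/example/resourcewatcher:latest", // placeholder image
				Env:          []corev1.EnvVar{{Name: "KUBECONFIG", Value: "/var/run/kubeconfig/kubeconfig"}},
				VolumeMounts: []corev1.VolumeMount{{Name: "kubeconfig", MountPath: "/var/run/kubeconfig"}},
				Resources: corev1.ResourceRequirements{
					Requests: corev1.ResourceList{
						corev1.ResourceCPU:    resource.MustParse("100m"),
						corev1.ResourceMemory: resource.MustParse("200Mi"),
					},
				},
			}},
		},
	}
}

func main() {
	out, _ := json.MarshalIndent(observerPodTemplate("e2e-aws"), "", "  ")
	fmt.Println(string(out))
}
```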


Most if not all of those points of variance are already handled by existing types in the CI system. I don't want to expose e.g. CAP_SYS_ADMIN or host mount ...


This is important. s/pod template/step reference/g

### Changes to CI
This is a sketch of a possible path.
1. Allow a non-dptp-developer to produce a pod template that mounts a `secret/<job>-kubeconfig` that will later contain a kubeconfig.
2. When the CI pod starts today (the one with setup containers and stuff), also create an instance of each pod template,


This is too specific and not useful. It is not important for the parallel observer process to start at the time that the job starts.


Is the important requirement here that the process be running and ready to notice $KUBECONFIG being present when it's possible?

let's say `pod/<job>-resourcewatcher` and an empty `secret/<job>-kubeconfig`.
3. As soon as a kubeconfig is available (these go into a known location in setup container today), write that
`kubeconfig` into every `secret/<job>-kubeconfig` (you probably want to label them).
4. When it is time for collection, the existing pod (I think it's teardown container), writes a new data entry at `.data['teardown']` into every


What is important about this mechanism?

Contributor Author

> What is important about this mechanism?

It's a simple notification that meets the minimum bar of "easy to provide and easy to observe in an e2e-observer pod". I don't really care how it's done, but this way is easy.
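Concretely, "easy to observe" could be as little as the loop below: secret keys show up as files in the mounted volume, so the observer just waits for a `teardown` file to appear (the mount path and poll interval are made up for the sketch):

```go
// Hypothetical observer-side wait for the teardown notification written into
// the mounted secret/<job>-kubeconfig.
package main

import (
	"context"
	"fmt"
	"os"
	"path/filepath"
	"time"
)

func waitForTeardown(ctx context.Context, secretMountDir string) (string, error) {
	for {
		// The kubelet projects the secret's "teardown" key as a file once it is written.
		if b, err := os.ReadFile(filepath.Join(secretMountDir, "teardown")); err == nil {
			return string(b), nil // the timestamp written by the teardown container
		}
		select {
		case <-ctx.Done():
			return "", ctx.Err()
		case <-time.After(10 * time.Second):
		}
	}
}

func main() {
	ts, err := waitForTeardown(context.Background(), "/var/run/kubeconfig") // placeholder mount path
	if err != nil {
		panic(err)
	}
	fmt.Println("teardown signalled at", ts, "- starting final collection")
}
```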

Member

why do the monitors need to know the teardown time? Won't it be enough for them to record "cluster seems to have disappeared" and rely on test-suite failures to get folks to look at jobs where the cluster exploded before teardown?

Contributor Author

> why do the monitors need to know the teardown time? Won't it be enough for them to record "cluster seems to have disappeared" and rely on test-suite failures to get folks to look at jobs where the cluster exploded before teardown?

collection can be expensive. You may want to defer until the end. Imagine using this for must-gather so we can actually get it on all CI runs. Amazingly, we have failed to achieve even that collection in a consistent way so far.

Member

> Imagine using this for must-gather so we can actually get it on all CI runs.

When clusters die hard enough for must-gather to fail, I doubt ci-operator will be able to reliably guess the right window in which to trigger a must-gather against the toppling cluster. I think log streaming and other continuous monitoring are probably the best we can deliver in those cases.


This is still outstanding. If it's important that we signal, I'd prefer sending SIGTERM rather than messing with files.
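For comparison, the observer side of a SIGTERM-based contract is roughly this much code (the artifacts path and the `observe`/`flushArtifacts` helpers are hypothetical, not anything defined here):

```go
// Hypothetical observer skeleton for a SIGTERM-based contract: collect until
// ci-operator signals that the post steps have started, then flush and exit.
package main

import (
	"context"
	"log"
	"os/signal"
	"syscall"
)

func observe(ctx context.Context) {
	// ... watch pods, operator status, etc., until the context is cancelled ...
	<-ctx.Done()
}

func flushArtifacts(dir string) {
	// ... write everything collected so far under dir for CI to upload ...
	log.Println("artifacts flushed to", dir)
}

func main() {
	// ctx is cancelled when SIGTERM arrives.
	ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGTERM)
	defer stop()

	go observe(ctx)

	<-ctx.Done() // SIGTERM arrived: teardown is starting
	log.Println("received SIGTERM, flushing artifacts")
	flushArtifacts("/tmp/artifacts") // placeholder artifacts directory
}
```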


Suggest:

4. When the test workflow enters the post-step phase, SIGTERM is sent to each observer process.

4. When it is time for collection, the existing pod (I think it's teardown container), writes a new data entry at `.data['teardown']` into every
`secret/<job>-kubeconfig`.
The value should be a timestamp.
5. Ten minutes after `.data['teardown']` is written, the teardown container rsyncs a well known directory,


Why is this necessary? ci-operator has oversight on the control flow and handles cancellation and failure gracefully, with timeouts and grace periods as necessary.

What do you need from the feature?

  • A signal sent to the observer when the teardown phase of a job begins
  • A (configurable) grace period, starting at that signal, after which artifacts are gathered

Contributor Author

  • A (configurable) grace period, starting at that signal, after which artifacts are gathered

I wouldn't let someone configure it.
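In other words, the timing on the ci-operator side could reduce to something like the sketch below, with a fixed (non-configurable) grace period; the helper functions and the ten-minute value are stand-ins, not anything the proposal specifies:

```go
// Hypothetical ci-operator-side timing: signal the observer, then gather its
// artifacts after a fixed grace period or as soon as it exits, whichever
// comes first.
package main

import (
	"log"
	"time"
)

const gracePeriod = 10 * time.Minute // fixed by the platform, not per-observer

func signalObserver()  { log.Println("SIGTERM sent to observer") }
func gatherArtifacts() { log.Println("artifacts gathered") }

// collectFromObserver is given a channel that closes when the observer exits.
func collectFromObserver(observerDone <-chan struct{}) {
	signalObserver()
	select {
	case <-observerDone: // the observer finished on its own
	case <-time.After(gracePeriod): // or we stop waiting for it
	}
	gatherArtifacts()
}

func main() {
	done := make(chan struct{})
	go func() { time.Sleep(2 * time.Second); close(done) }() // pretend the observer exits quickly
	collectFromObserver(done)
}
```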

Member

why is this not just: ci-operator TERMs the monitor containers once the workflow wraps up entirely, or after the test steps complete (which would also work as an "I'll reap this cluster soon, you can stop watching" signal), or some such, and then uploads anything from the monitor container's mounted artifacts volume (just like the existing step logic)?


Suggest:

5. Observer pods have their artifacts gathered after a grace period, or after the process exits, whichever comes first.

are placed in some reasonable spot.

This could be optimized with a file write in the other direction, but the naive approach doesn't require it.
6. All `pod/<job>-<e2e-observer-name>` are deleted.


This is not germane to this proposal

Contributor Author

> This is not germane to this proposal

it's very germane to the example and concrete bits. I'm happy to not use this sketch if you wish to provide a different one, but as a concrete example of how this could be built, this is pretty straightforward.


This does not fit into the general model of lifecycle in the CI Operator, and it's not germane in that it does not matter to a functional and correct implementation of an observer Pod.


Outstanding, please remove.

@deads2k (Contributor, Author) commented Aug 7, 2020

I will clarify in the doc, but some answers here:

  • multiple observers per cluster

yes

  • multiple instances of one observer against a single test cluster

no

  • the container image running the observer pod to be built from the repository under test

I think no, but I'm open to it if you think it makes your life easier.

2. Your pod must be able to tolerate a kubeconfig that doesn't work. The kubeconfig may point to a cluster that never comes up,
or one that hasn't come up yet. Your pod must not choke on it (a retry loop like the sketch after this list is enough).
3. If your pod fails, it will not fail the e2e job. If you need to fail an e2e job reliably, you need something else.
4. Your pod must terminate when asked.
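The retry loop mentioned in item 2 could be as simple as the sketch below; the paths and intervals are illustrative, and the point is only that an unreachable or not-yet-populated kubeconfig is a reason to sleep and try again rather than to crash:

```go
// Hypothetical "tolerate a broken kubeconfig" loop: keep retrying until the
// cluster answers a discovery call, or give up quietly when the context ends.
package main

import (
	"context"
	"fmt"
	"time"

	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/clientcmd"
)

func waitForCluster(ctx context.Context, kubeconfigPath string) (*rest.Config, error) {
	for {
		cfg, err := clientcmd.BuildConfigFromFlags("", kubeconfigPath)
		if err == nil {
			if client, cerr := kubernetes.NewForConfig(cfg); cerr == nil {
				if _, verr := client.Discovery().ServerVersion(); verr == nil {
					return cfg, nil // the cluster is finally reachable
				}
			}
		}
		select {
		case <-ctx.Done():
			return nil, ctx.Err() // the cluster never came up; exit without failing the job
		case <-time.After(30 * time.Second):
		}
	}
}

func main() {
	cfg, err := waitForCluster(context.Background(), "/var/run/kubeconfig/kubeconfig") // placeholder path
	if err != nil {
		fmt.Println("never got a working cluster:", err)
		return
	}
	fmt.Println("cluster reachable at", cfg.Host, "- starting to observe")
}
```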
@wking (Member) Aug 7, 2020

I think the most important requirement, which you mention above but should probably repeat here, is non-goal 2: that observer containers must not use the admin kubeconfig they receive to reach into the cluster under test and break things. This should be a read-only monitor, or at most, a "be very careful to only touch safe things" monitor.

The remaining conditions (termination, etc.) can be enforced by ci-operator, so monitors that don't handle them gracefully aren't that big a deal.

### Modify the openshift-tests command
This is easy for some developers, BUT as e2e monitor shows us, it doesn't provide enough information.
We need information from before the test command is run to find many of our problems.
Missing logs and missing intermediate resource states (intermediate operator status) are the most egregious so far.
@wking (Member) Aug 7, 2020

The vanilla e2e suite already records intermediate operator status in the "Monitor cluster while tests execute" test case. It does not seem like that big a reach to me to ask the installer to also monitor and log operator status, instead of its current "just collect this at the end if the install failed" behavior. And that would also be useful information to have in the log for installs that fail in the wild, which will not benefit from CI-specific monitoring suites.

Log preservation is unlikely to be covered by either the installer or the "Monitor cluster while tests execute" test case. I think that's a valid use-case for this enhancement. But being able to configure all clusters, CI and otherwise, to stream logs to an external logging aggregator would be nice too, and would cover both CI and real-world use-cases.
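For a sense of scale, the kind of intermediate operator-status recording being discussed is roughly the watch below; this is a sketch, not resourcewatcher's or the e2e monitor's actual code, and the kubeconfig path is a placeholder:

```go
// Hypothetical intermediate-state recorder: watch ClusterOperators with a
// dynamic client and log every transition. A real observer would persist
// these events to its artifacts directory instead of stdout.
package main

import (
	"context"
	"log"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", "/var/run/kubeconfig/kubeconfig") // placeholder path
	if err != nil {
		log.Fatal(err)
	}
	client := dynamic.NewForConfigOrDie(cfg)

	clusterOperators := schema.GroupVersionResource{Group: "config.openshift.io", Version: "v1", Resource: "clusteroperators"}
	w, err := client.Resource(clusterOperators).Watch(context.Background(), metav1.ListOptions{})
	if err != nil {
		log.Fatal(err)
	}
	defer w.Stop()

	for event := range w.ResultChan() {
		if u, ok := event.Object.(*unstructured.Unstructured); ok {
			log.Printf("%s ClusterOperator %s", event.Type, u.GetName())
		}
	}
}
```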

2. allow a cluster observer to impact the running test. This is an observer author failure.
3. provide ANY dimension specific data. If an observer needs this, they need to integrate differently.
4. allow multiple instances of a single cluster observer to run against one CI cluster
5. allow providing the cluster observer from a repo being built. The images are provided separately.


This is trivial to allow, we can either remove it or move it to goals


(like, the mechanism we use to ensure that the Pod can only start when the image is imported is identical to that which is used if we're waiting for it to be built)

## Proposal

The overall goal:
1. have an e2e-observer process is expected to be running before a kubeconfig exists


nit: s/is/that is/



2. Before a kubeconfig is present, create an instance of each binary from #1 and an empty `secret/<job>-kubeconfig`.
3. As soon as a kubeconfig is available (these go into a known location in setup container today), write that
`kubeconfig` into every `secret/<job>-kubeconfig` (you probably want to label them).
4. When it is time for collection, the existing pod (I think it's teardown container), issues a sig-term to the process or perhaps


we spoke about relaxing this to SIGTERM

@stevekuznetsov left a comment

/lgtm
/approve

@openshift-ci-robot added the `lgtm` label (Indicates that a PR is ready to be merged.) Oct 28, 2020
@openshift-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deads2k, stevekuznetsov

The full list of commands accepted by this bot can be found here.

The pull request process is described here.

Needs approval from an approver in each of these files:
  • OWNERS [deads2k,stevekuznetsov]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot merged commit 81fd2ea into openshift:master Oct 28, 2020