-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
individual diagnostic parameters #16589
individual diagnostic parameters #16589
Conversation
Seems odd that I'm in pkg/diagnostics/OWNERS but not pkg/oc/admin/diagnostics/OWNERS |
/unassign |
@smarterclayton just wanted your feedback on the UI for this:
A bit awkward to have to put the diagnostic name in twice but I wanted to be able to distinguish between "run this diagnostic with a parameter set" and "run all the diagnostics, with a parameter set on one." |
@deads2k it's a big PR, but maybe a summary will be enough to tell me if I'm doing it all wrong. Diagnostic initialization for diagnostics that take non-flag parameters have a two-step process. As before, the diagnostic struct is created with the flag contents and clients and such. Then non-flag parameters are set via the diagnostic's SetParameters() method which has an opportunity to complete the diagnostic struct or report an error. Then it runs as before. This seemed like the best solution for generic parameter logic allowing individual implementation of the actual parameters. Thoughts? |
/assign fabianofranz @openshift/cli-review If I'm reading this correctly, this is establishing a separate way to pass parameters to subcommands. |
@fabianofranz I think the ui proposed by this PR is the most approachable (from a user standpoint) for now, considering the use-case the command is trying to solve. Would still like to wait for your feedback / thoughts on this |
|
@fabianofranz bump. Planning to demo this tomorrow, would be nice if I had some feedback on the direction. |
@fabianofranz poke again. |
Specifying diagnostics name for every param is bit ugly (seems a lot of duplication) and passing many params is painful. I think passing diagnostic checks along with their tunable options on CLI is not scalable. Another approach could be:
This is like giving the user a basic template for pick and choose if they don't want to run all diagnostic checks or want to specify optional parameters.
|
Thanks, @pravisankar, that's a pretty good idea. Passing a lot of params on the command line would definitely be painful. On the other hand, for the more likely use case of modifying just one or two parameters (certainly the goal is to be useful with no parameters), having to know about and generate and edit this config file is also relatively painful. So, would like to hear what others prefer. |
@sosiouxme Trying one more approach:
Allowing only these 2 patterns will remove the ambiguity for the params. User may need to run multiple diag commands in case of custom options but I think that's a good compromise considering the simplicity and scalability of this pattern. |
I could see it ending up that way. It would certainly be more consistent with how everything else works. I'm a bit reluctant though to give up the ability to run all the diagnostics at once with just a tweak or two, and without having to keep up with the expanding list of names. My dilemma is I want it to be trivial for the casual user and also completely customizable for production use. I was thinking of combining your config file idea with this CLI implementation. So when the user gets to where they have four or five parameters to specify and that's getting tedious, they can transition to generating and using a config file. |
@sosiouxme open a separate PR adding yourself as a reviewer and I'll approve. |
At first sight I'd say we need to convert every diagnostic name into a subcommand and make every possible parameter in a diagnostic a proper flag of that subcommand, like suggested by @pravisankar. You can still run the top-level cmd to run all diagnostics and subcommands can inherit generic flags from the top-level one, and although very verbose, you can always have a one-liner that runs all of them once, like Although a possible solution, the concern I have about having a config file is that it's not the representation of any existing resource, right? For example, it's not the representation of a given resource in JSON or YAML format, the same way used to interact with the API, or accepted by But I totally understand the "run all diagnostics tweaking just a couple" use case. Let me think a bit if I can find another solution. |
@sosiouxme How many diagnostic names and parameter names we have? Do we expect them to change often? I ask because, on the other hand, turning them into subcommands and flags automatically make them "tied" to the CLI and subject to the deprecation policy when changes are needed. |
@fabianofranz |
@fabianofranz FWIW I was planning on moving / removing a number of them and deprecating them here. But after that, yes, will be mostly additive, with the occasional retiring of any that become irrelevant. |
@fabianofranz any further thoughts? Need to move forward in some direction for the next check I'm writing :) |
820a237
to
256a8c6
Compare
Sorry about the delay, I researched a few other CLI patterns but didn't find anything that addresses this use case nicely enough. I'd suggest that we move on with turning diagnostics into subcommands and parameters into flags. It fits better with what users already know and will allow us to grow consistently. Add a One idea came to mind though: have a |
@fabianofranz thanks, I think that makes sense. Here's a thought for how to manage the "all" cases.
|
256a8c6
to
c7cf6c8
Compare
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: sosiouxme Assign the PR to them by writing The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing |
@fabianofranz @juanvallejo One further thought. Currently you could do:
... to run multiple named diagnostics and no others. Making them subcommands will break that - do we have any obligation to address that somehow? |
c7cf6c8
to
c92f5df
Compare
If --cluster-context is specified and the context is present, use it as the cluster-admin. The logic was backward and this gave an error before.
c92f5df
to
d869499
Compare
Adds the ability to specify parameters for individual diagnostics on the command line (without proliferating flags). Addresses openshift#14640
d869499
to
4050666
Compare
@sosiouxme: The following tests failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
Starting fresh with #17773 |
Automatic merge from submit-queue. diagnostics: individual parameters Updated version of #16589 based on feedback. This addresses #14640 by making individual diagnostics into subcommands that can have their own flags. Existing top-level flags for `NetworkCheck` are removed and the config envvar for `EtcdWriteVolume` is deprecated in favor of a flag. All individual flags are available underneath the `all` subcommand. This required rather more refactoring as the flags had to be known in order to define the command, not just at runtime. Usages are given below: ``` $ oc adm diagnostics --help This utility helps troubleshoot and diagnose known problems for an OpenShift cluster and/or local host. The base command runs a standard set of diagnostics: oc adm diagnostics [...] An individual diagnostic may be run as a subcommand which may have flags for specifying options specific to that diagnostic. Finally, the "all" subcommand runs all available diagnostics (including heavyweight ones skipped in the standard set) and provides all individual diagnostic flags. Usage: oc adm diagnostics [options] Available Commands: aggregatedlogging Check aggregated logging integration for proper configuration all Diagnose common cluster problems [...] unitstatus Check status for related systemd units Options: --cluster-context='': Client context to use for cluster administrator --config='': Path to the config file to use for CLI requests. --context='': The name of the kubeconfig context to use -l, --diaglevel=1: Level of diagnostic output: 4: Error, 3: Warn, 2: Notice, 1: Info, 0: Debug --host=false: If true, look for systemd and journald units even without master/node config --loglevel=0: Set the level of log output (0-10) --logspec='': Set per module logging with file|pattern=LEVEL,... --master-config='': Path to master config file (implies --host) --node-config='': Path to node config file (implies --host) --prevent-modification=false: If true, may be set to prevent diagnostics making any changes via the API ``` (Note `all` is now intermingled with the individual subcommands.) ``` $ oc adm diagnostics all --help This utility helps troubleshoot and diagnose known problems for an OpenShift cluster and/or local host. This subcommand exists to run all available diagnostics: oc adm diagnostics all Available diagnostics vary based on client config and local OpenShift host config. All flags from the base command work similarly here, but all possible flags for individual diagnostics are also available. Usage: oc adm diagnostics all [options] Options: --cluster-context='': Client context to use for cluster administrator --config='': Path to the config file to use for CLI requests. --context='': The name of the kubeconfig context to use -l, --diaglevel=1: Level of diagnostic output: 4: Error, 3: Warn, 2: Notice, 1: Info, 0: Debug --diagnosticpod-images='openshift/origin-${component}:${version}': Image template to use in creating a pod --diagnosticpod-latest-images=false: If true, when expanding the image template, use latest version, not release version --etcdwritevolume-duration='1m': How long to perform the write test --host=false: If true, look for systemd and journald units even without master/node config --loglevel=0: Set the level of log output (0-10) --logspec='': Set per module logging with file|pattern=LEVEL,... --master-config='': Path to master config file (implies --host) --networkcheck-logdir='/tmp/openshift/': Path to store diagnostic results in case of errors --networkcheck-pod-image='openshift/origin:v3.9.0-alpha.0': Image to use for diagnostic pod --networkcheck-test-pod-image='openshift/origin-deployer:v3.9.0-alpha.0': Image to use for diagnostic test pod --networkcheck-test-pod-port=8080: Serving port on the diagnostic test pod --networkcheck-test-pod-protocol='TCP': Protocol used to connect to diagnostic test pod --node-config='': Path to node config file (implies --host) --prevent-modification=false: If true, may be set to prevent diagnostics making any changes via the API ``` ``` $ oc adm diagnostics EtcdWriteVolume --help Runs the EtcdWriteVolume diagnostic. Check the volume of writes against etcd over a time period and classify them by operation and key Aliases: etcdwritevolume, EtcdWriteVolume Usage: oc adm diagnostics etcdwritevolume [options] Options: -l, --diaglevel=1: Level of diagnostic output: 4: Error, 3: Warn, 2: Notice, 1: Info, 0: Debug --duration='1m': How long to perform the write test --host=false: If true, look for systemd and journald units even without master/node config --loglevel=0: Set the level of log output (0-10) --logspec='': Set per module logging with file|pattern=LEVEL,... --master-config='': Path to master config file (implies --host) --node-config='': Path to node config file (implies --host) ``` ``` $ oc adm diagnostics NetworkCheck --help Runs the NetworkCheck diagnostic. Create a pod on all schedulable nodes and run network diagnostics from the application standpoint Aliases: networkcheck, NetworkCheck Usage: oc adm diagnostics networkcheck [options] Options: --cluster-context='': Client context to use for cluster administrator --config='': Path to the config file to use for CLI requests. --context='': The name of the kubeconfig context to use -l, --diaglevel=1: Level of diagnostic output: 4: Error, 3: Warn, 2: Notice, 1: Info, 0: Debug --logdir='/tmp/openshift/': Path to store diagnostic results in case of errors --loglevel=0: Set the level of log output (0-10) --logspec='': Set per module logging with file|pattern=LEVEL,... --pod-image='openshift/origin:v3.9.0-alpha.0': Image to use for diagnostic pod --prevent-modification=false: If true, may be set to prevent diagnostics making any changes via the API --test-pod-image='openshift/origin-deployer:v3.9.0-alpha.0': Image to use for diagnostic test pod --test-pod-port=8080: Serving port on the diagnostic test pod --test-pod-protocol='TCP': Protocol used to connect to diagnostic test pod ```
This addresses #14640 by giving individual diagnostics the ability to receive parameters without defining any more flags (need this anyway for the app-create diagnostic I'm working on). It also deprecates the existing flags for NetworkCheck as well as the config envvar for EtcdWriteVolume in favor of the new parameter scheme.