release-22.2: roachtest: set default cluster settings, report timeout failures correctly, roachprod: capture ssh logs#96903
Merged
smg260 merged 5 commits into cockroachdb:release-22.2 on Mar 8, 2023
Thanks for opening a backport. Please check the backport criteria before merging:
If some of the basic criteria cannot be satisfied, ensure that the exceptional criteria are satisfied within.
Add a brief release justification to the body of your PR to justify this backport. Some other things to consider:
When a cluster is started with the `--skip-init` option, the caller can run `roachprod init` at any time to initialize the cluster. Unfortunately, the code used to initialize the cluster was duplicated: one copy existed in the `start` path, and another in the `init` path. Since the latter is used far less frequently, it had a bug that went unnoticed: it hardcoded the first node index as `0`, when node indices start at 1. This commit fixes the issue by updating the constant and sharing code between `init` and `start`.

Fixes cockroachdb#88226.

Release note: None
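The idea behind this fix can be sketched as a single initialization routine that both entry points call, so a 1-based first-node constant lives in one place. This is a minimal illustration, not the roachprod code: the `cluster` type, `firstNodeIdx`, `Start`, `Init`, and `initializeCluster` below are hypothetical names chosen for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// firstNodeIdx is the index of the first node. Node indices are 1-based,
// so a hardcoded 0 would point at a node that does not exist.
const firstNodeIdx = 1

// cluster is a hypothetical stand-in for a cluster handle.
type cluster struct {
	numNodes    int
	initialized bool
	initNode    int // node on which initialization was run
}

// initializeCluster is the single shared routine used by both Start and Init.
func (c *cluster) initializeCluster() error {
	if c.numNodes < firstNodeIdx {
		return errors.New("cluster has no nodes")
	}
	if c.initialized {
		return errors.New("cluster is already initialized")
	}
	c.initNode = firstNodeIdx
	c.initialized = true
	return nil
}

// Start starts the cluster and initializes it unless skipInit is set.
func (c *cluster) Start(skipInit bool) error {
	if skipInit {
		return nil
	}
	return c.initializeCluster()
}

// Init initializes a cluster that was started with skipInit.
func (c *cluster) Init() error {
	return c.initializeCluster()
}

func main() {
	c := &cluster{numNodes: 3}
	if err := c.Start(true); err != nil {
		panic(err)
	}
	if err := c.Init(); err != nil {
		panic(err)
	}
	fmt.Println("initialized on node", c.initNode)
}
```

Because both paths go through the same helper, a mistake in the rarely used path can no longer diverge from the commonly used one.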
In cockroachdb#88514, the cluster start logic was refactored to reuse the same code across `init` and `start`, fixing a bug in the former. However, the refactoring overlooked the fact that we previously always set the default cluster settings when there's more than one node in the cluster. This fixes that by setting the default cluster settings in that case; one particularly important cluster setting is the license key, necessary for some roachtests.

Fixes cockroachdb#88660.
Fixes cockroachdb#88665.
Fixes cockroachdb#88666.
Fixes cockroachdb#88710.

Release note: None
Currently we do not capture SSH logs when a command fails, even though they can be useful for debugging issues, transient or otherwise. This commit enables logging via the ssh switch `-vvv` and a specified log filename; the log is stored under an `ssh/` directory in the test log root. The debug file is deleted when the command exits successfully (exit code zero), but preserved on non-zero exits for further inspection.

Additionally:
- The name of the log is consistent with the corresponding run log and encodes a node number and timestamp.
- SSH sessions must now be initialised with the command itself, to reinforce their single-use nature.
- Debug-friendly command names can optionally be specified to influence the names of the run/ssh logs.
- Retry options can be omitted from any call to `ParallelE` to disable retries.

Release note: None

Epic: CRDB-21386
Wait commands are issued every 500ms and return a non-zero exit code until the nodes have started, which results in a large number of ssh debug logs during cluster creation. Also adopts functional options.

Release note: None

Epic: none
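For readers unfamiliar with the functional options pattern mentioned here, the following is a minimal sketch of how a per-command setting such as debug-log capture could be switched off for polling commands. It assumes that capture is on by default and is disabled for wait commands; the type and option names (`runOptions`, `WithoutSSHDebug`, `WithDisplayName`) are hypothetical and are not taken from the roachprod code.

```go
package main

import "fmt"

// runOptions holds per-command settings. All names here are hypothetical.
type runOptions struct {
	captureSSHDebug bool
	displayName     string
}

// Option mutates a runOptions value; callers pass zero or more of them.
type Option func(*runOptions)

// WithoutSSHDebug disables debug-log capture, e.g. for frequent wait polls.
func WithoutSSHDebug() Option {
	return func(o *runOptions) { o.captureSSHDebug = false }
}

// WithDisplayName sets a debug-friendly name for the command.
func WithDisplayName(name string) Option {
	return func(o *runOptions) { o.displayName = name }
}

// newRunOptions applies the given options on top of the defaults.
func newRunOptions(opts ...Option) runOptions {
	o := runOptions{captureSSHDebug: true} // capture is assumed on by default
	for _, opt := range opts {
		opt(&o)
	}
	return o
}

func main() {
	regular := newRunOptions(WithDisplayName("start-node"))
	wait := newRunOptions(WithDisplayName("wait"), WithoutSSHDebug())
	fmt.Printf("%+v\n%+v\n", regular, wait)
}
```

The pattern lets new settings be added later without changing the signature of every call site, which is useful when only a few callers need non-default behaviour.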
Timeout failures are recorded at the actual timeout, with subsequent failures treated as secondary. `addFailure` accepts a depth parameter and no longer includes context cancellation, which is handled separately.

Epic: none

Fixes: cockroachdb#91237

Release note: None
e621749 to 54871f4
herkolategan (Collaborator) approved these changes on Mar 8, 2023
Reviewed 2 of 2 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained
Backport:
roachprod init." (roachprod: fix `roachprod init`. #88514)

Please see individual PRs for details.
/cc @cockroachdb/release
Epic: None
Release note: None
Release justification: test only change