-
Notifications
You must be signed in to change notification settings - Fork 40.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dump GKE windows test logs via diagnostics tool #83517
Conversation
Hi @YangLu1031. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/cc @pjh |
@@ -229,31 +229,23 @@ function save-logs-windows() { | |||
echo "Not saving logs for ${node}, Windows log dumping requires gcloud support" | |||
return | |||
fi | |||
|
|||
export-windows-docker-event-log "${node}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't remember exactly what this function did but I guess it made the Docker event log available in a file somewhere. Do we still need to call this, or does the diagnostics command do the same thing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we cannot call it, as it runs powershell cmd over SSH, and I left the function definition there in case we change back to SSH approach later. Diagnostics cmd collected all the event logs including docker event in evtx files, but not an explicit docker.log file. We could do the same in diagnostics cmd, if needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can one read the evtx files? Is it possible to open them on a Linux workstation?
If we need to transform them somehow perhaps we should do this after unzipping the archive file at L246.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, there're some tools like python-evtx, evtViewer etc. But still not user-friendly, looks we need to improve diagnostics to dump plain text logs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. If there's something easy to do then feel free to add it to this PR, otherwise we can follow-up with changes here or in the diagnostics tool itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tried using python-evtx tool on sample evtx file, looks like it converts evtx to xml, still not human-readable. evtViewer seems not working. Thinking to change it in the diagnostics tool.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that sounds like probably the best approach. Consider filing a FR issue in https://github.com/GoogleCloudPlatform/compute-image-tools/ to track this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done thru PR: GoogleCloudPlatform/compute-image-tools#979
/ok-to-test |
@pjh Can you take another look? Thanks :) |
/retest |
Pretty much LGTM but we still have test failures due to the service account issue; discussed in person, waiting for an update on that. |
Waiting for some guidance on kubernetes/test-infra#14670 before deciding what to do here. |
@pjh Added a switch if it's GKE cluster, using diagnostics tool, otherwise using SSH to dump windows logs. Please take a look. |
@pjh please take another look :) |
@@ -238,6 +238,33 @@ function save-windows-logs-via-diagnostics-tool() { | |||
fi | |||
} | |||
|
|||
# Saves log files from SSH | |||
function save-windows-logs-via-ssh() { | |||
export-windows-docker-event-log "${node}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does node need to be passed as an arg?
Please do a test run with the updated code if possible to make sure there are no errors :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same thing for dest_dir
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, tested, it works fine. Looks like bash have dynamic scoping, the inner function have visibility to local variables of outer function.
But should pass as an arg for clarity and reliability. Will fix it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Bash is the worst.
e1ceab6
to
efede24
Compare
/lgtm Please squash the commits into one. |
/priority important-soon |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: YangLu1031, zmerlynn The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/test pull-kubernetes-dependencies managing the queue around the go1.13 bump, see https://groups.google.com/forum/#!topic/kubernetes-dev/Yyka3G2ebXE |
/retest |
What type of PR is this?
/kind cleanup
What this PR does / why we need it:
SSH on GKE windows node hasn't been enabled yet, so not able to get the archived logs after prow test. We have to use stackdriver to check logs which is way chunkier.
Use diagnostics tool dumping windows logs to mitigate the pain.
Special notes for your reviewer:
The diagnostics tool repo: https://github.com/GoogleCloudPlatform/compute-image-tools/tree/master/cli_tools/diagnostics
Does this PR introduce a user-facing change?: