Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transfer of file doesn't detect corruption. #2584

Closed
buckleyGI opened this issue Mar 4, 2024 · 5 comments
Closed

Transfer of file doesn't detect corruption. #2584

buckleyGI opened this issue Mar 4, 2024 · 5 comments
Labels
enhancement New feature or request InProgress Marks an issue has being worked on

Comments

@buckleyGI
Copy link

buckleyGI commented Mar 4, 2024

k9s v 0.31.8
k8s v 1.26.6

To be clear, I understand that this isn't a problem with k9s but k9s masks it and k9s is here to make us succeed right :)

It turns out that the corruption of a sqlite Grafana database was not at the source but during transmission.
When you transfer a file from a pod to your system you typically use "kubectl cp". The corruption occurs with files approaching 100 Meg.
Never saw it before with smaller files but its a documented problem kubernetes/kubernetes#60140 (comment)
Solution: add a --retries 999
I hope this will save you 2 days as the k9s didn't give any feedback the corruptions was happening :)

The kubectl cp however does give back, for me it was "unexpected EOF".
When transferring large files I see it happening more often than less so it is very real and not a 1 off.

My ask is

  • to add the retries switch to the already existing dialog and give it a default value of 999
  • if possible give feedback of the result of kubectl cp. This maybe a bit harder as I think k9s does a fire and forget of the command (async without result)
@derailed derailed added the enhancement New feature or request label Mar 4, 2024
@derailed
Copy link
Owner

derailed commented Mar 4, 2024

@buckleyGI Good catch! Thank you for the research and heads up!

@derailed derailed added the InProgress Marks an issue has being worked on label Mar 4, 2024
derailed added a commit that referenced this issue Mar 5, 2024
derailed added a commit that referenced this issue Mar 5, 2024
@buckleyGI
Copy link
Author

buckleyGI commented Mar 5, 2024

Wow, thanks for considering this as an enhancement and actually fixing it a few hours later.
I was already on the fence of sponsoring you but now I am compelled :)

As for the --retries 999 switch, reading the PR (I don't know GO) you seem to

  1. have added the retries to the GUI as I thought but my guess is that the --retries switch doesn't reach the kubectl cp command?
    As the problem is intermittent I did quite a few tests and from the raw CL I get a 100% success rate if I do something like this
    kubectl cp grafana-deployment-8484768854-28vds:/var/lib/grafana/grafana.db /home/buckley/tmp/20240305112851.db113529 --retries 999
    And it also fails from time to time without the --retries.
    With the latest drop of k9s v0.32.1 arrived corrupt all of the times (which is unlucky). Are you sure --retries gets passed?

  2. reading the go code it seems that you catch the output from the kubectl cp command to present it later to the user. Again, go noob here.
    However, for the moment I see something flashing by that prevents me from interpreting it.
    Was the intention to alert the user if something went wrong? This is a nice to have as it's maybe an bigger code change as --retries 999 should effectively make the command reliable.

nobbs added a commit to nobbs/k9s that referenced this issue Mar 5, 2024
The newly introduced retries field in the transfer dialog was not
passed to the transfer command. This commit fixes that.

Follow-up of derailed#2584
@derailed
Copy link
Owner

derailed commented Mar 6, 2024

@buckleyGI Thank you! That's the exact pb. Got pulled off and thought I had finished this feature ;(
Will update on the next drop

@derailed derailed reopened this Mar 6, 2024
derailed added a commit that referenced this issue Mar 6, 2024
@nobbs
Copy link
Contributor

nobbs commented Mar 6, 2024

@derailed your latest fix does set the argument for --retries, but always to 999 regardless of what is provided as the user input.

image
image

I did fix it in #2596 but that probably went unnoticed :D

@buckleyGI
Copy link
Author

Thanks @derailed v0.32.3 does prevent the corruption with the (hardcoded?) retries of 999

Curious if there is a way that the end user can see the result of the command? Was there a way to catch the corruption before this feature in the TUI?

derailed pushed a commit that referenced this issue Mar 6, 2024
The newly introduced retries field in the transfer dialog was not
passed to the transfer command. This commit fixes that.

Follow-up of #2584
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request InProgress Marks an issue has being worked on
Projects
None yet
Development

No branches or pull requests

3 participants