502 Bad Gateway in Hub after upgrading chart (0.9.0-alpha.1 -> 0.9.0-n4xx) #1863
Comments
Hi @meneal, this isn't sufficient information for me to draw a conclusion from, so you'll need to do some legwork.
If this doesn't work
@consideRatio Thank you for the response!
No message on this, unfortunately.
The pods all get into the Running state.
We use an ingress, as seen in the attached YAML file. We also use auth from GitHub Enterprise.
Provided in this gist: https://gist.github.com/meneal/a8b8c21cd87dbb1ef7fd3e04a39db041
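For illustration, here is a minimal sketch of an ingress pointing at the chart's proxy-public service; the hostname, namespace, and exact shape are hypothetical, the real configuration is in the gist:

```yaml
# Hypothetical sketch of an Ingress routing mydomain.com/hub to the chart's
# proxy-public service (names here are placeholders; the real YAML is in the gist).
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: jupyterhub
  namespace: jupyterhub
spec:
  rules:
    - host: mydomain.com
      http:
        paths:
          - path: /hub
            backend:
              serviceName: proxy-public
              servicePort: 80
```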
You've got an ingress configuration, but you've set
Oh, no! This is quite embarrassing! 😄 I was trying a bunch of different things and failed to switch this back. Fixing this makes 0.9.1 work, but then when I try to apply the upgrade to:
@consideRatio I was certainly crossing my fingers, but the netpol change suggested did not fix the 502s.
I'm still wondering about the issues with the user-scheduler, since that is the only place I'm actually finding any kind of signal that anything is wrong. In addition to the error mentioned above, I'm seeing this suggestion in the user-server logs:

I don't really want to do this unless it is expected that this role and rolebinding exist. FWIW, the role itself doesn't exist in our cluster. I hope I'm not barking up the wrong tree with this, but I'm just having trouble finding diagnostics other than the 502 itself.
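If it helps, here is a quick way to check whether that role and rolebinding actually exist; the namespace name is a placeholder for ours:

```sh
# List roles and rolebindings in the JupyterHub namespace to see whether the
# objects the log message refers to exist (namespace name is a placeholder).
kubectl get role,rolebinding --namespace jupyterhub
```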
I'm 100% sure that this warning is not related. If it were related, the symptom would have been that the user pod fails to go from Pending to Running.
@meneal perhaps you have time for a debugging session over video chat with me right now? If so, it would be nice to do this before I cut 0.10.0-beta.1. Here is a video link, I'll hang around here hoping to catch you: https://meet.google.com/wns-pfcf-sqm =)
Ignoring the workaround for previous development releases, I'd like to know clearly if

It is a bit problematic that you have pinned the versions of the images, because you can get out of sync in a way that is hard to understand. For example, you have

So, the question comes down to: does it still fail with

For debugging purposes, I want those to be the defaults, because otherwise I need to review all changes in all sections at once to rule out that they cause the observed issue.
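As a sketch of what I mean by going back to defaults: if the tags are pinned under keys like hub.image.tag (an assumption about how the pinning was done, not taken from your file), dropping the pin lets the chart's bundled version apply:

```yaml
# Hypothetical values.yaml excerpt: the key names are assumptions, not copied
# from the actual file. Removing (or commenting out) a pinned tag makes the
# chart fall back to its bundled default image version.
hub:
  image:
    name: jupyterhub/k8s-hub
    # tag: 0.9.0        # drop the pin so the chart's default tag is used
```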
My debugging process would be to verify the status of all pods, then try to observe where the traffic stops in the 502 error: is it between the ingress controller and the proxy pod, for example? When that is confirmed, I'd check whether I can access the proxy pod from inside the jupyterhub namespace, then whether I can access the proxy pod from the namespace where the ingress controller pod resides. I'd do things like:
kubectl run -it local-busybox --image=busybox -- wget http://proxy-public.jupyterhubnamespacename.svc.cluster.local
kubectl run -it remote-busybox -n myothernamespace --image=busybox -- wget http://proxy-public.jupyterhubnamespacename.svc.cluster.local
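The same checks with comments, assuming the release lives in a namespace called jupyterhub and the ingress controller in ingress-nginx (both hypothetical names):

```sh
# 1. From inside the JupyterHub namespace: is the proxy-public service reachable at all?
kubectl run -it --rm local-busybox --namespace jupyterhub --image=busybox -- \
  wget -O- http://proxy-public.jupyterhub.svc.cluster.local

# 2. From the ingress controller's namespace: does a network policy block the traffic?
kubectl run -it --rm remote-busybox --namespace ingress-nginx --image=busybox -- \
  wget -O- http://proxy-public.jupyterhub.svc.cluster.local
```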
Hmmm, I also think this is a bit fishy... You set the ingress to accept traffic on the path /hub, but JupyterHub isn't configured to run under a path with hub.baseUrl=/hub, which it should be if the incoming traffic arrives at JupyterHub via mydomain.com/hub. Note though that with this config, you will have mydomain.com/hub/hub/home and mydomain.com/hub/user/erik/ etc.
I'll assume your issue is a configuration mistake rather than a bug now and go ahead and cut a beta release.
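Concretely, the chart setting I mean is hub.baseUrl; a minimal sketch matching an ingress path of /hub:

```yaml
# Make JupyterHub serve under /hub so it matches an ingress path of /hub.
# Note the resulting URLs: mydomain.com/hub/hub/home, mydomain.com/hub/user/erik/, etc.
hub:
  baseUrl: /hub
```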
Hi there @meneal 👋! I closed this issue because it was labelled as a support question. Please help us organize discussion by posting this on the http://discourse.jupyter.org/ forum. Our goal is to sustain a positive experience for both users and developers. We use GitHub issues for specific discussions related to changing a repository's content, and let the forum be where we can more generally help and inspire each other. Thank you for being an active member of our community! ❤️
@consideRatio thank you so much for your help on this! Removing this line made everything work as expected. I have to say that I really appreciate this community!
Thank you for the encouraging feedback @meneal, I really appreciate your positive spirit :)
Bug description
Our chart has not been upgraded in quite some time, and we need the recently added features for handling imagePullSecrets, so I tried upgrading and ran into an odd situation. We were all the way back on 0.9.0-alpha.1.060.6698eb9 and upgraded to 0.9.0-n409.hce116620.
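For context, the plain Kubernetes mechanism behind that feature is a docker-registry secret referenced from the pod spec via imagePullSecrets; the registry, credentials, and secret name below are placeholders, and the exact chart option that wires this up is not shown here:

```sh
# Plain-Kubernetes sketch of an image pull secret (registry, credentials and
# secret name are placeholders). Pods then reference it via imagePullSecrets.
kubectl create secret docker-registry my-registry-creds \
  --docker-server=registry.example.com \
  --docker-username=ci-bot \
  --docker-password="${REGISTRY_PASSWORD}" \
  --namespace "${K8S_NAMESPACE}"
```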
Expected behaviour
When the chart upgrade was completed, I expected that I would be able to visit the JupyterHub URL, log in through GitHub, and get a newly spawned pod.
Actual behaviour
When visiting the jupyterhub URL I get an error from nginx saying "502 Bad Gateway".
Diagnostic
When looking through pod logs, I don't see anything awry except in the user-scheduler pod, where I've noticed the following failure:

This might be completely unrelated, but I'm not finding much else.
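Something like the following would pull the user-scheduler logs (the chart runs it as a Deployment; the namespace name is a placeholder):

```sh
# Fetch logs from the user-scheduler Deployment in our JupyterHub namespace
# (namespace name here is a placeholder).
kubectl logs deploy/user-scheduler --namespace jupyterhub
```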
How to reproduce
Using our values.yaml file, run:
helm upgrade ${HELM_RELEASE_NAME} https://jupyterhub.github.io/helm-chart/jupyterhub-0.9.0-n409.hce116620.tgz -f values-jupyterhub.yaml --install --set proxy.secretToken=${PROXY_TOKEN} --namespace ${K8S_NAMESPACE}
Your personal set up
Kubernetes version: 1.18.8_1527
Helm version: version.BuildInfo{Version:"v3.3.4", GitCommit:"a61ce5633af99708171414353ed49547cf05013d", GitTreeState:"clean", GoVersion:"go1.14.9"}
Chart version: 0.9.0-n409.hce116620
Our values.yaml may be of assistance here, as well as any of the image versions. Please let me know if this is inappropriate as a bug. Thanks!