-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cache deployer fails if the cluster signer is not set #4505
Comments
/area backend |
I wonder what would be the best way to deal with this issue. |
@Ark-kun, could this have to do with the cluster being Kubernetes 1.19 and its changes in regards to |
Maybe the problem is related to the version mismatch between kubectl version in the container and the Kubernetes server version.
|
@Ark-kun It doesn't seem like this issue has been resolved. I just deployed Kubeflow 1.2 on Kubernetes 1.19.4 and the cache-server and cache-deployer-deployment are still stuck with errors. I have spotted 2 Certificate Signing Requests, both identical with one in namespace
Cache-deployer-deployment logs:
|
I think the issue is caused by the fact that |
It would seem that it might also be because |
As one would expect, it was the fact that the |
/reopen |
@Bobgy: Reopened this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
To support the webhook set up process stabler, we should seriously consider #4695 |
I would also suggest using cert-manager, as it seems the other applications are using that as well. Also, for my specific situation with Canonical's CDK, it is a manual multi-step process to copy the ca.key from the EasyRSA node to the master nodes due to the permissions on the file. |
…rver (kubeflow#4525) * Cache deployer - Using the same kubectl version as the server Fixes kubeflow#4505 * Changed the PATH precedence * Unquoted the jq output * Fixed the curl options
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
Hi @davidspek : I am also getting the same error:
And I have no way of setting the Is there an example of what the cert-manager approach entails? I'm trying to deploy kubeflow v1.3-branch with kustomize. |
Getting hit by this very same behaviour. Any workaround ? I'm still too noobish to hack the cert-manager and other resources... |
I highly recommend checking out v2 caching now, it does not depend on any privilege. https://www.kubeflow.org/docs/components/pipelines/caching-v2/ |
Hi @Bobgy , Thanks |
@cavepopo no worries. You'll need to either upgrade your existing install or make a new install.
|
Dunno whether this is solved yet. The problems might be in backend/src/cache/deployer/webhook-create-signed-cert.sh line 118, where a It might need to be replaced with
The generated apiVersion: certificates.k8s.io/v1
kind: CertificateSigningRequest
metadata:
name: ${csrName}
spec:
groups:
- system:authenticated
request: $(cat ${tmpdir}/server.csr | base64 | tr -d '\n')
signerName: kubernetes.io/kube-apiserver-client
usages:
- digital signature
- key encipherment
- client auth with suitable replacements in the |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
is this still an issue? I have EKS 1.26 and deploy from master branch 2.x and still get no CSR certificate and cache-deployer crash in loop |
Looks like it's not an issue anymore. I'll close it but feel free to reopen if the issue persists. /close |
@rimolive: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
What steps did you take:
[A clear and concise description of what the bug is.]
When deploying kubeflow using kfctl_istio_dex.v1.1.0.yaml on a Charmed Kubernetes 1.19 cluster the cache-server and cache-deployer-deployment pods get stuck in PodInitializing and CrashLoopBackOff respectively. The cache-server pod shows the error
MountVolume.SetUp failed for volume "webhook-tls-certs" : secret "webhook-server-tls" not found
. Redploying either or both of the pods does not fix the issue. The cache-deployer-deployment pod gives the following logs:The cache-server.kubeflow csr is stuck in a Pending condition. However, manually running
kubectl certificate approve cache-server.kubeflow
does work.The following pull requests seem to be related:
openshift/oc#501
openshift/installer#3943
Environment:
Charmed Kubernetes 1.19 running on Ubuntu 20.04.1.
How did you deploy Kubeflow Pipelines (KFP)?
full Kubeflow deployment
/kind bug
/area backend
The text was updated successfully, but these errors were encountered: