x509: certificate relies on legacy Common Name field #1395

Closed
tenzen-y opened this issue Nov 18, 2020 · 13 comments

@tenzen-y
Member

/kind bug

What steps did you take and what happened:

  1. I downloaded release 0.10.0.
  2. I installed Katib using the following command.
    bash scripts/v1beta1/deploy.sh
  3. I started the sample experiment by running the following command.
$ kubectl apply -f examples/v1beta1/tpe-example.yaml
Error from server (InternalError): error when creating "/Users/tenzen/katib-0.10.0/examples/v1beta1/tpe-example.yaml": Internal error occurred: failed calling webhook "mutating.experiment.katib.kubeflow.org": Post "https://katib-controller.kubeflow.svc:443/mutate-experiments?timeout=30s": x509: certificate relies on legacy Common Name field, use SANs or temporarily enable Common Name matching with GODEBUG=x509ignoreCN=0

What did you expect to happen:
I expected the experiment to start.

Anything else you would like to add:
I found a related entry in the Kubernetes 1.19.0 release notes:
https://kubernetes.io/docs/setup/release/notes/

Kubernetes is now built with golang 1.15.0-rc.1.
The deprecated, legacy behavior of treating the CommonName field on X.509 serving certificates as a host name when no Subject Alternative Names are present is now disabled by default. It can be temporarily re-enabled by adding the value x509ignoreCN=0 to the GODEBUG environment variable. (#93264, @justaugustus) [SIG API Machinery, Auth, CLI, Cloud Provider, Cluster Lifecycle, Instrumentation, Network, Node, Release, Scalability, Storage and Testing]

Environment:

  • Kubeflow version (kfctl version): None
  • Minikube version (minikube version): None
  • Kubernetes version: (use kubectl version): 1.19.4
  • OS (e.g. from /etc/os-release): ubuntu16.04(4.15.0-72-generic)
@andreyvelich
Member

Hi @tenzen-y, thank you for creating this.

I think this problem happens because K8s 1.19 is using Go 1.15.
Which cluster are you using: GCP, AWS, or on-prem?
Also check this: golang/go#39568 (comment).

/cc @gaocegege @johnugeorge

@tenzen-y
Member Author

@andreyvelich

Which cluster are you using: GCP, AWS, or on-prem?

I'm using on-prem.

I think this problem happens because K8s 1.19 is using Go 1.15.
Also check this: golang/go#39568 (comment).

Does that mean that Katib is using certificates that do not contain SANs?

Thanks.
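
One way to check this is to inspect the certificate directly. The sketch below is not from the Katib docs. It assumes openssl is installed and, for the commented cluster usage, that the Secret stores the serving certificate under the key cert.pem (as later comments in this thread indicate). The helper name has_san is made up for illustration.

```shell
#!/bin/sh
# has_san PEM_FILE
# Succeeds (exit 0) if the certificate carries a Subject Alternative Name
# extension, fails otherwise. "has_san" is a hypothetical helper name.
has_san() {
    openssl x509 -in "$1" -noout -text | grep -q "Subject Alternative Name"
}

# Example usage against a live cluster (assumes the Secret key is cert.pem):
#   kubectl get secret katib-controller -n kubeflow \
#     -o jsonpath='{.data.cert\.pem}' | base64 -d > /tmp/katib-cert.pem
#   has_san /tmp/katib-cert.pem && echo "SAN present" || echo "CN only"
```

If the helper reports "CN only", the certificate would be rejected by clients built with Go 1.15 or later unless GODEBUG=x509ignoreCN=0 is set.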

@imilos

imilos commented Nov 22, 2020

Same problem here. I'm using on-prem MicroK8s 1.19, which was probably built with Go 1.15. Is there any workaround?

Thanks in advance!

@zuiurs

zuiurs commented Nov 26, 2020

Same problem.

This may be due to an older version of the vendored controller-runtime library.

As a workaround, I created a Secret named katib-controller in the kubeflow namespace with a self-signed certificate embedded (it has the keys ca-cert.pem, ca-key.pem, cert.pem, and key.pem). The Webhook Bootstrap certificate generation process reuses an existing Secret if one exists.

@andreyvelich
Member

andreyvelich commented Nov 26, 2020

Sorry for the late reply, @tenzen-y.

Thank you @zuiurs, that makes sense. We use controller-runtime version 0.1.9, which is pretty old. With the default bootstrapping, the certificate for the webhook server is stored in the katib-controller Secret in the kubeflow namespace, like the one you created.

Alternatively, you can use your local file system to generate the cert; check here.

In the meantime, we are working on removing auto-generation of the webhooks from the Katib controller; see kubeflow/manifests#1379. cc @knkski

  • Kubebuilder 2.0 doesn't support dynamic webhooks.
  • It will solve problems when our users want to use Katib without webhooks.

@tenzen-y
Member Author

@zuiurs
Thank you for the workaround suggestions!
We were able to start the experiment using a Secret created from certificates generated by cfssl.

@andreyvelich
Now I understand the cause of the error. And thank you for working on a solution to the problem.

For those who have encountered the same problem, here are the steps I took.

  1. Create the certificates with cfssl. (Specify katib-controller.$NAMESPACE.svc in hosts.)
  2. Remove the Secret named katib-controller.
  3. Create the Secret using the following command.
$ kubectl create secret -n $NAMESPACE generic katib-controller --from-file=ca-cert.pem --from-file=ca-key.pem --from-file=cert.pem --from-file=key.pem
  4. Delete the katib-controller pod so that it mounts the newly created Secret.
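
For step 1, note that cfssl reads the CSR JSON file literally, so a placeholder such as $NAMESPACE has to be replaced with the real namespace before the file is used. A minimal sketch of generating that file from the shell follows. The function name write_server_csr and the output file name are illustrative assumptions, and the key settings mirror the ones used elsewhere in this thread.

```shell
#!/bin/sh
# write_server_csr NAMESPACE OUTFILE
# Writes a cfssl CSR file whose CN and single hosts entry are the DNS name
# of the webhook Service. "write_server_csr" is a hypothetical helper name.
write_server_csr() {
    ns=$1
    out=$2
    host="katib-controller.${ns}.svc"
    # The heredoc delimiter is unquoted on purpose so ${host} is expanded.
    cat > "$out" <<EOF
{
    "CN": "${host}",
    "hosts": ["${host}"],
    "key": {"algo": "rsa", "size": 2048}
}
EOF
}

# Example usage:
#   write_server_csr "$NAMESPACE" server-csr.json
```

Building the host name once and reusing it keeps the CN and the hosts entry identical and avoids stray whitespace in the hosts value.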

@shantanuVerma7

Thanks, @tenzen-y, @zuiurs and @andreyvelich.
I was stuck on this for a long time, and creating a self-signed certificate using cfssl helped me too.
To save others time, here is the quick process for creating certificates using cfssl:

  1. Download the cfssl executables, save them to /usr/local/bin, and make them executable.
  2. We will need four *.pem files: the CA certificate, the CA private key, the server (host) certificate, and the server private key.
  3. Create a CA signing request. Run cfssl print-defaults csr > ca-csr.json and edit the required fields in the JSON file (Common Name etc.).
    Generate the CA key and certificate with cfssl gencert -initca ca-csr.json | cfssljson -bare ca - (this gives ca.pem and ca-key.pem).
  4. Create ca-config.json, which holds the signing and profile details, with cfssl print-defaults config > ca-config.json.
  5. Create a file for generating the server certificate (server-csr.json in the command below), which typically has the format
    { "CN": "katib-controller.$NAMESPACE.svc", "hosts": [ "katib-controller.$NAMESPACE.svc" ], "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "US", "ST": "CA", "L": "San Francisco" } ] }
    (replace $NAMESPACE with the actual namespace, and make sure the hosts value has no trailing space).
    Now create the server certificate using
    cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=web-servers server-csr.json | cfssljson -bare server
    (this gives server.pem and server-key.pem).

Then rename ca.pem -> ca-cert.pem, server.pem -> cert.pem, and server-key.pem -> key.pem.

Finally, delete the existing katib-controller Secret and create a new one using
$ kubectl create secret -n $NAMESPACE generic katib-controller --from-file=ca-cert.pem --from-file=ca-key.pem --from-file=cert.pem --from-file=key.pem
Delete the pod so that the new Secret is mounted.
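
The renaming step is easy to get wrong, so it may help to script it. The sketch below copies the cfssl outputs to the four file names the Secret expects and fails if any of them is missing. The function name stage_secret_files and the directory layout are assumptions made for illustration.

```shell
#!/bin/sh
# stage_secret_files SRC_DIR DEST_DIR
# Copies the cfssl outputs into DEST_DIR under the file names used by the
# katib-controller Secret. Returns non-zero if any source file is missing.
# "stage_secret_files" is a hypothetical helper name.
stage_secret_files() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    cp "$src/ca.pem"         "$dest/ca-cert.pem" &&
    cp "$src/ca-key.pem"     "$dest/ca-key.pem"  &&
    cp "$src/server.pem"     "$dest/cert.pem"    &&
    cp "$src/server-key.pem" "$dest/key.pem"
}

# Example usage:
#   stage_secret_files . secret-files && cd secret-files
#   kubectl create secret -n "$NAMESPACE" generic katib-controller \
#     --from-file=ca-cert.pem --from-file=ca-key.pem \
#     --from-file=cert.pem --from-file=key.pem
```

Copying instead of renaming leaves the original cfssl outputs in place, which is convenient if the certificates need to be re-signed later.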

@rushins

rushins commented Jan 21, 2021

Any ideas on how to create the certificates you mentioned? I hit the same issue today on Rancher 2.5 with Kubernetes 1.19.6.

@rushins

rushins commented Jan 21, 2021

@shantanuVerma7: I tried your steps 1-5 but got the following error at step 5:

open -ca=ca.pem: no such file or directory
Failed to parse input: unexpected end of JSON input

Any ideas? I can see that the file exists in my current folder.
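
A likely cause, judging only from the error text (this is an assumption, not something confirmed in the thread): the cfssl gencert command was run as a single line that still contained the backslashes meant for line continuation. In the shell, a backslash followed by a space escapes the space, so the flag arrives as an argument with a leading space and is then treated as a file name rather than a flag. The sketch below shows the difference. show_args is a hypothetical helper that only prints its arguments.

```shell
#!/bin/sh
# Print each argument in brackets so stray whitespace becomes visible.
# "show_args" is a hypothetical helper, used here instead of cfssl.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# Flattened form: "\ " escapes the space, so each flag starts with a space.
flattened=$(show_args gencert \ -ca=ca.pem \ -ca-key=ca-key.pem)

# Proper form: each backslash is the last character on its line.
proper=$(show_args gencert \
    -ca=ca.pem \
    -ca-key=ca-key.pem)

printf 'flattened:\n%s\n' "$flattened"
printf 'proper:\n%s\n' "$proper"
```

Either removing the backslashes from the single-line form or keeping each backslash at the very end of its line avoids the problem.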

@trog-levrai

I'm still getting the same issue. I tried the following (for the kubeflow namespace):

# Generating CA
openssl genrsa -out ca-key.pem 2048
openssl req -x509 -new -nodes -key ca-key.pem -sha256 -days 1024 -out ca-cert.pem

# Generating server certificate
openssl genrsa -out key.pem 2048
openssl req -new -sha256 -key key.pem -subj "/C=US/ST=CA/O=MyOrg, Inc./CN=katib-controller.kubeflow.svc" -out csr.pem
openssl x509 -req -in csr.pem -CA ca-cert.pem -CAkey ca-key.pem -CAcreateserial -out cert.pem -days 500 -sha256

# Updating katib-controller secret
kubectl delete secret -n kubeflow katib-controller
kubectl create secret -n kubeflow generic katib-controller --from-file=ca-cert.pem --from-file=ca-key.pem --from-file=cert.pem --from-file=key.pem

# Check the katib-controller pod name and then
kubectl delete pod katib-controller-7fcc95676b-2nskf -n kubeflow

After that I simply get a connection refused...

@andreyvelich
Member

Hi @trog-levrai.

After that I simply get a connection refused...

Where did you get it ?

@trog-levrai

trog-levrai commented Feb 24, 2021

Alright, I figured things out for my case. If you're trying to use openssl, then you have to find a way to properly include the SAN field in the certificate. cfssl makes it easy, but the example above wasn't working for me with cfssl version 1.5.0.

So here's what worked for me with the kubeflow namespace (inspired by kelseyhightower/kubernetes-the-hard-way#457 (comment)).

Write the following files:

  • ca-csr.json
{
    "CN": "katib-controller.kubeflow.svc",
    "hosts": [
        "katib-controller.kubeflow.svc"
    ],
    "key": {
        "algo": "ecdsa",
        "size": 256
    },
    "names": [
        {
            "C": "US",
            "ST": "CA",
            "L": "San Francisco"
        }
    ]
}
  • server-csr.json (note that there is no whitespace in the hosts value)
{
    "CN":"katib-controller.kubeflow.svc",
    "hosts":[
        "katib-controller.kubeflow.svc"
    ],
    "key":{
        "algo":"rsa",
        "size":2048
    },
    "names":[
        {
            "C":"US",
            "ST":"CA",
            "L":"San Francisco"
        }
    ]
}
  • ca-config.json
{
    "signing": {
        "default": {
            "expiry": "168h"
        },
        "profiles": {
            "www": {
                "expiry": "8760h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "server auth"
                ]
            },
            "client": {
                "expiry": "8760h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "client auth"
                ]
            },
            "kubernetes": {
                "expiry": "876000h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "server auth",
                    "client auth"
                ]
            }
        }
    }
}

From there you can run (note the kubernetes profile used):

# Generating SSL certs
cfssl gencert -initca ca-csr.json | cfssljson -bare ca -
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes server-csr.json | cfssljson -bare server

# Needed renaming
mv ca.pem ca-cert.pem
mv server.pem cert.pem
mv server-key.pem key.pem

# K8S secret refresh
kubectl delete secret -n kubeflow katib-controller
kubectl create secret -n kubeflow generic katib-controller --from-file=ca-cert.pem --from-file=ca-key.pem --from-file=cert.pem --from-file=key.pem

# Check the katib-controller pod name and then
kubectl delete pod katib-controller-POD_ID -n kubeflow
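
For readers who would rather stay with openssl, one common way to get a SAN into the signed certificate is to pass a subjectAltName extension through -extfile when signing the CSR. This is a sketch under that assumption and was not confirmed by anyone in this thread. The function name sign_with_san is hypothetical, and the validity period mirrors the earlier openssl attempt.

```shell
#!/bin/sh
# sign_with_san CA_CERT CA_KEY CSR OUT_CERT DNS_NAME
# Signs CSR with the given CA and adds DNS_NAME as a Subject Alternative Name.
# "sign_with_san" is a hypothetical helper name.
sign_with_san() {
    ext=$(mktemp) || return 1
    # Extension file with a single SAN entry for the webhook Service name.
    printf 'subjectAltName=DNS:%s\n' "$5" > "$ext"
    openssl x509 -req -in "$3" -CA "$1" -CAkey "$2" -CAcreateserial \
        -out "$4" -days 500 -sha256 -extfile "$ext"
    status=$?
    rm -f "$ext"
    return "$status"
}

# Example usage with the file names from the earlier openssl attempt:
#   sign_with_san ca-cert.pem ca-key.pem csr.pem cert.pem \
#     katib-controller.kubeflow.svc
```

Without -extfile, openssl x509 -req does not add a SAN on its own, which would explain why a certificate signed that way is rejected for relying on the Common Name.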

@viksharma1987

viksharma1987 commented Apr 28, 2021

Hi @trog-levrai, using your suggestion I was able to generate the certificates and store them in the Secret. All the steps you suggested worked like a charm, but after that I started getting a connection refused error; details below:

"details":{"causes":[{"message":"failed calling webhook \"mutating.experiment.katib.kubeflow.org\": Post \"https://katib-controller.kubeflow.svc:443/mutate-experiments?timeout=30s\": dial tcp <someip>:443: connect: connection refused"}]},"code":500

Can you help me fix that?
