proxy_protocol mode breaks HTTP01 challenge Check stage #466

bbetter173 · 2018-04-13T12:29:49Z

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

What happened:

When running ingress-nginx with use-proxy-protocol: true, the check stage of cert-manager fails as it (appears to) communicate with the ingress controller using plain HTTP requests.

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

1. Deploy ingress-nginx
2. Configure an upstream load balancer that supports proxy protocol and enable it.
3. Set ConfigMap option use-proxy-protocol: true, and proxy-real-ip-cidr: x.x.x.x (Use the real load balancer IP) for the nginx controller
4. Deploy cert-manager
5. Request a certificate using HTTP01 confirmation.

Anything else we need to know?:

Environment:

Kubernetes version: Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.0", GitCommit:"fc32d2f3698e36b93322a3465f63a14e9f0eaead", GitTreeState:"clean", BuildDate:"2018-03-26T16:55:54Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"} Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.0", GitCommit:"fc32d2f3698e36b93322a3465f63a14e9f0eaead", GitTreeState:"clean", BuildDate:"2018-03-26T16:44:10Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
Cloud provider or hardware configuration: 1 master, 3 nodes, vSphere
Install tools:
Log files:

nginx-ingress-controller:

2018/04/13 12:27:55 [error] 1837#1837: *10321 broken header: "GET /.well-known/acme-challenge/9oQ5DbRUHNpnIsqvlvFUcb-km2OgpckyaXXEQh9cQQk HTTP/1.1
Host: therealhost.example.com
User-Agent: Go-http-client/1.1
Accept-Encoding: gzip

" while reading PROXY protocol, client: 10.129.2.0, server: 0.0.0.0:80

cert-manager:
E0413 12:25:57.580259 1 controller.go:196] certificates controller: Re-queuing item "kube-system/therealhost.example.com" due to error processing: error waiting for key to be available for domain "therealhost.example.com": context deadline exceeded


bbetter173 · 2018-04-13T12:31:41Z

If the purpose of the check process is to check external availability of the acme challenge, perhaps making the request to the external ingress IP would be the best idea?

munnerz · 2018-04-13T21:21:52Z

I'm unsure what the bug is here - according to the ACME spec, cert-manager should always communicate with a challenge endpoint using HTTP. HTTPS is specifically not supported and should not be used here.

cert-manager already attempts to make a request to the external address for the ingress controller (i.e. by making a request to the same IP that Letsencrypt will use when performing a challenge validation).

Can you describe your DNS/ingress further? It sounds like you have some kind of split horizon DNS set up that may be tripping up cert-manager?

whereisaaron · 2018-04-13T22:12:51Z

Make sure the DNS record for the domain you are getting a certificate for points to the external IP for the load balancer.
Try nginx-ingress 0.12.x which had a feature to never redirect ACME requests to HTTPS.

bbetter173 · 2018-04-13T23:30:24Z

The challenge is being requested over HTTP, but the nginx-ingress controller is expecting requests to be made using the proxy protocol - which my load balancer is configured to do.

When the go-http client makes a request directly to the nginx-ingress controller (i.e. not using the load balancer's external IP) the proxy_protocol isn't used, causing it to fail.

If I manually run curl -v 'http://therealhost.example.com/.well-known/acme-challenge/9oQ5DbRUHNpnIsqvlvFUcb-km2OgpckyaXXEQh9cQQk' the response works fine, as the initial HTTP request is terminated by the load balancer, and the backend request to nginx-ingress is made using the proxy_protocol.

I am currently using the quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.12.0 controller, but this is not related to HTTPS, or redirection in any manner.

bbetter173 · 2018-04-13T23:39:21Z

@munnerz - There is no split horizon DNS, the only issue is that the nginx-ingress controller is expecting requests using the proxy_protocol method, and the cert-manager controller is making requests to it using plain HTTP.
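To illustrate the mismatch: a listener with proxy protocol enabled expects the first line of every connection to be a PROXY header, and treats anything else as a broken header. Below is a minimal Python sketch of that check, assuming the human-readable v1 format (the binary v2 format and the PROXY UNKNOWN form are ignored here). The function name and the addresses are made up for illustration, and this is not nginx's actual parser.

```python
# Sketch only: how a PROXY-protocol listener might classify the first line
# of a new connection. Covers the text-based v1 format with TCP4/TCP6.

def parse_proxy_v1(first_line: bytes):
    """Return (src_ip, dst_ip, src_port, dst_port), or None if not a v1 header."""
    if not first_line.startswith(b"PROXY ") or not first_line.endswith(b"\r\n"):
        return None
    parts = first_line[:-2].decode("ascii").split(" ")
    if len(parts) != 6 or parts[1] not in ("TCP4", "TCP6"):
        return None
    _, _, src_ip, dst_ip, src_port, dst_port = parts
    return src_ip, dst_ip, int(src_port), int(dst_port)


# What a load balancer would send ahead of the HTTP request (made-up addresses):
via_lb = b"PROXY TCP4 203.0.113.7 10.129.2.15 51234 80\r\n"
# What a direct, in-cluster client such as the self check sends first:
direct = b"GET /.well-known/acme-challenge/token HTTP/1.1\r\n"

print(parse_proxy_v1(via_lb))  # ('203.0.113.7', '10.129.2.15', 51234, 80)
print(parse_proxy_v1(direct))  # None, i.e. a "broken header"
```

The second case corresponds to the "broken header ... while reading PROXY protocol" line in the nginx log above: the listener receives the GET line where it expected the PROXY line.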

I would suggest adding a flag that toggles cert-manager to make all requests to the nginx-ingress controller via the load balancer instead of hitting it directly.

bbetter173 · 2018-04-14T05:50:11Z

More troubleshooting indicates there is some strange behaviour in my cluster, using a similar setup in minikube works fine.

The error message I'm seeing when I set logging to verbose is: I0414 05:41:23.649175 1 http.go:410] ACME HTTP01 self check failed for domain "therealhost.example.com", waiting 5s: Get http://therealhost.example.com/.well-known/acme-challenge/9oQ5DbRUHNpnIsqvlvFUcb-km2OgpckyaXXEQh9cQQk: EOF

From within the cert-manager instance, the DNS host resolves correctly:

kubectl exec -it --namespace=shared-services cert-manager-cert-manager-5c7bfd7dc4-kpffn sh
/ # nslookup therealhost.example.com
nslookup: can't resolve '(null)': Name does not resolve

Name:      therealhost.example.com
Address 1: 123.123.123.123

But if I install curl within the cert-manager instance I get odd behaviour:

/ # curl 'http://therealhost.example.com/.well-known/acme-challenge/9oQ5DbRUHNpnIsqvlvFUcb-km2OgpckyaXXEQh9cQQk'  -v
*   Trying 103.75.202.143...
* TCP_NODELAY set
* Connected to therealhost.example.com (123.123.123.123) port 80 (#0)
> GET /.well-known/acme-challenge/9oQ5DbRUHNpnIsqvlvFUcb-km2OgpckyaXXEQh9cQQk HTTP/1.1
> Host: therealhost.example.com
> User-Agent: curl/7.59.0
> Accept: */*
>
* Empty reply from server
* Connection #0 to host therealhost.example.com left intact
curl: (52) Empty reply from server

Whereas if I run the same command from any other host I get this:

curl 'http://therealhost.example.com/.well-known/acme-challenge/9oQ5DbRUHNpnIsqvlvFUcb-km2OgpckyaXXEQh9cQQk'  -v
* About to connect() to therealhost.example.com port 80 (#0)
*   Trying 103.75.202.143...
* Connected to therealhost.example.com (123.123.123.123) port 80 (#0)
> GET /.well-known/acme-challenge/9oQ5DbRUHNpnIsqvlvFUcb-km2OgpckyaXXEQh9cQQk HTTP/1.1
> User-Agent: curl/7.29.0
> Host: therealhost.example.com
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nginx/1.13.9
< Date: Sat, 14 Apr 2018 05:49:08 GMT
< Content-Type: text/plain; charset=utf-8
< Content-Length: 87
< Connection: keep-alive
<
* Connection #0 to host therealhost.example.com left intact
9oQ5DbRUHNpnIsqvlvFUcb-km2OgpckyaXXEQh9cQQk.zrZMCbhC-Lh8qDFGdhEMA4BvJkeBPAzkThQKl4U_sOE

metallhopf · 2019-03-23T13:45:39Z

@HeWhoWas i ran into the same issue today, did you find a solution?

from within cert-manager pod, DNS lookups resolve the public loadbalancer IP.

but it still seems that cert-manager will contact the node-ip directly to solve the acme challenge, hence does not go through the public loadbalancer

bbetter173 · 2019-03-24T11:25:34Z

@metallhopf Our solution was to use DNS-01 validation, bypassing these issues altogether. I believe at the time this ticket was opened, that feature was only just being introduced and wasn't an option in a release version.

Sorry I can't be more help, but for what it's worth we've found DNS validation to be simpler and more robust than HTTP.

whereisaaron · 2019-03-24T17:34:23Z

> I would suggest adding a flag that toggles cert-manager to make all requests to the nginx-ingress controller via the load balancer instead of hitting it directly.

That's what CM always tries to do. It hits the domain name of the certificate using HTTP. It expects that domain name will resolve to the external IP. And that's the point, to test the challenge as e.g. Let's Encrypt would, from the outside. There is no value or point in CM making extra steps to find out and use the internal node IP instead, that wouldn't be checking anything useful.

If the traffic from the CM or CM challenge Pods are not going external, probably you have some special cluster DNS, network DNS, or perhaps hairpin routing getting in the way.

metallhopf · 2019-03-24T18:39:11Z

> If the traffic from the CM or CM challenge Pods are not going external, probably you have some special cluster DNS, network DNS, or perhaps hairpin routing getting in the way.

you are correct, i was able to confirm this. the public loadbalancer ip is resolved correctly for the http request.

but traffic is routed internally directly from the cert-manager pod to nginx-ingress-controller (does not go through the loadbalancer).

cluster is hosted on digitalocean, will edit this post if i find a solution.

altoning · 2019-03-26T00:34:18Z

> > If the traffic from the CM or CM challenge Pods are not going external, probably you have some special cluster DNS, network DNS, or perhaps hairpin routing getting in the way.
>
> you are correct, i was able to confirm this. the public loadbalancer ip is resolved correctly for the http request.
>
> but traffic is routed internally directly from the cert-manager pod to nginx-ingress-controller (does not go through the loadbalancer).
>
> cluster is hosted on digitalocean, will edit this post if i find a solution.

got the same issue with proxy protocol at digitalocean. my only workaround is to temporarily disable proxy protocol on the load balancer (and nginx ingress config map) allowing the certificate to be issued.

i'm hoping for a better solution to avoid interruption in service during certificate renewal.

johnl · 2019-04-10T16:07:47Z

I get the same on a cluster at Brightbox, which also uses hairpinning by default and an external load balancer that sends PROXY protocol. The hairpinning causes other issues too admittedly, but it's obviously a common setting for Kubernetes clusters.

> If the traffic from the CM or CM challenge Pods are not going external, probably you have some special cluster DNS, network DNS, or perhaps hairpin routing getting in the way.

Since it's fairly common (or at least, so I believe), perhaps an option just to disable testing the challenge? The challenge will work externally, from Let's Encrypt, just not internally.

Routhinator · 2019-04-13T16:02:27Z

Just hit this with a DO managed Kube cluster and a DO LB in proxy-protocol mode. Seems like the nginx-ingress LB is broken with DO's proxy protocol:

kubernetes/ingress-nginx#3996

From the Ingress

 " while reading PROXY protocol, client: 10.244.0.1, server: 0.0.0.0:80
2019/04/13 15:28:02 [error] 1384#1384: *248667 broken header: "GET /.well-known/acme-challenge/oF_X6SITHseBK1hdpEiZKbVCkmjvIiTHlPO46XXsSJM HTTP/1.1
Host: example.com
User-Agent: Go-http-client/1.1
Accept-Encoding: gzip
Connection: close

Unfortunately the DNS01 challenge is broken for DigitalOcean in 0.7.0 (and based on my testing in 0.6.0 as well) so HTTP01 is a must for DO.

bjg2 · 2019-04-22T18:08:48Z

I faced this one as well; why is it closed? The cert-manager HTTP01 challenge does not currently work with a DO load balancer in proxy protocol mode, which is the only load balancer mode that makes sense (as it forwards the request IP). I guess I will try DNS01 until this is resolved, but HTTP01 should work. Can we reopen this one?

altoning · 2019-04-22T18:48:34Z

> I faced this one as well, why is this one closed? Cert manager http01 challenge is not working with DO load balancer with proxy protocol at the moment, which is the only load balancer that makes sense (as it forwards request IP). I guess I will try to use DNS01 until resolved, but how HTTP01 should be working, can we open this one?

Unfortunately, DigitalOcean doesn't support DNS01. Maybe unless you're using DigitalOcean's DNS service - TBD.

bjg2 · 2019-04-23T08:09:18Z

@altoning Not sure what you are trying to say; I managed to get the DigitalOcean DNS01 challenge working with cert-manager 0.7.0 and a proxy protocol load balancer. That being said, this one still needs attention, as DNS01 is just a temp solution. I don't wanna have a DigitalOcean API key in k8s: it is tied to a specific DigitalOcean user and stops working when the user is removed from the project (that is by design and should be so, I just don't want to tie cert-manager validation to that).

dottodot · 2019-05-02T06:46:34Z

@HeWhoWas Please can you reopen this issue as it hasn't been resolved.

johnl · 2019-05-02T17:15:41Z

I believe this is due to a design flaw in Kubernetes. When using LoadBalancer services that use an external service (like Brightbox or DO mentioned here), kube-proxy intercepts the outgoing requests to the load balancer external IP at the network level to keep them within the Kubernetes cluster but doesn't understand that some LoadBalancers can do more than just standard TCP balancing. So this will break internal connections to external load balancers that do more, such as proxy-support or even SSL offloading.

We've now fixed this at Brightbox by not telling kube-proxy about the external IP addresses of the LoadBalancers, so it doesn't intercept them. I think DO are going to fix it the same way, and AWS have done this all along. See kubernetes/kubernetes#66607 for more details.

So this isn't a cert-manager problem.

bbetter173 · 2019-05-02T21:34:08Z

Reopening due to all the conversation and at @dottodot's request. If this turns out to not be a cert-manager issue, happy to have it closed.

sorenmat · 2019-05-06T19:04:57Z

Not sure this will help anyone, but I managed to fix this :)
I removed the proxy-protocol from the nginx configmap. You can still put it on individual ingress resources though.

whereisaaron · 2019-05-07T22:17:31Z

I think @johnl's reference to kubernetes/kubernetes#66607 is the cause. By luck or design, it works on AWS because the AWS k8s cloud provider code only adds the external host name not the external IP address.

The good news is a patch to Kubernetes in kubernetes/kubernetes#77523 should eventually fix this for everyone.

I don't think it is a cert-manager problem, but it is worth keeping this open to track the fix and warn other cert-manager users of this issue.
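As a rough illustration of the behaviour described in kubernetes/kubernetes#66607, here is a toy Python model. It is not kube-proxy code; the function name, the status entries and the addresses are invented. It assumes, as described in the comments above, that in-cluster traffic is short-circuited only when the load balancer's IP address is published in the Service status, and not when only a hostname is published.

```python
# Toy model (not kube-proxy code): decide where in-cluster traffic to a
# destination IP ends up, given the load balancer entries published in the
# Service status.

def next_hop(dest_ip: str, lb_ingress: list) -> str:
    # Only entries that carry an "ip" field give the node something to intercept.
    published_ips = {entry["ip"] for entry in lb_ingress if "ip" in entry}
    if dest_ip in published_ips:
        return "ingress pod directly (no PROXY header added)"
    return "external load balancer (PROXY header added)"


LB_IP = "203.0.113.10"  # made-up address

# Status lists the IP: the self check never reaches the load balancer.
print(next_hop(LB_IP, [{"ip": LB_IP}]))
# Status lists only a hostname, as described for AWS: nothing is intercepted.
print(next_hop(LB_IP, [{"hostname": "lb.example.com"}]))
```

In the first case the request arrives at nginx without a PROXY line, which is the failure reported in this issue.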

Philio · 2019-07-18T10:20:52Z

@sorenmat is it possible you could share how you configure it on a per ingress basis?

sorenmat · 2019-07-18T10:40:36Z

sure @Philio

  annotations:
    kubernetes.io/ingress.class: "nginx"
    use-proxy-protocol: "true"

Philio · 2019-07-18T11:56:56Z

@sorenmat Thanks for the quick reply, unfortunately though it didn't work, no proxy protocol configuration at all was added to nginx.conf. Looking at the template it is using only global configuration (although I may have interpreted it wrong) for proxy protocol. Perhaps it's version specific, which version did you get it to work with?

MichaelOrtho · 2019-07-19T03:39:10Z

Same issue here. I need proxy-protocol because of client IPs so it is not a solution to disable it. If it is on, cert-manager is not working because of that pre-check.

There are two solutions:

1. updating cert-manager code to retry test with proxy-protocol header if simple get failed, or
2. allow cert-manager config to disable pre-check

I would prefer solution No.1 because there is a reason we are checking that endpoint before asking LetsEncrypt to do the same. This check is valuable as it prevents quota issues.

This is important to be resolved as people will need proxy protocol and it is bad if cert-manager is unable to work in that case. I think any of two solutions would let us go around this problem.

I see that pre-check code is at:
https://github.com/jetstack/cert-manager/blob/70bc3e845bffac5acc10934911648a42a3a05ed1/pkg/issuer/acme/http/http.go#L184

With curl, I was able to connect with this line. It returned body.

http_proxy=http://my.website.com:80 curl -v http://my.website.com/.well-known/acme-challenge/mwbIQcwaB9LL6wwIGRjQuRfL8cl5lFfGXocuQ3Y_fqs --haproxy-protocol

I see a recent commit 099abed and PR #1850 by @kinolaev that may be related to what we are talking about.

Is this change adding PROXY header to checks or just using proxy?
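For reference, a minimal sketch of what solution No.1 could look like. It is written in Python rather than cert-manager's Go, and it is not cert-manager's code: all function names are hypothetical, and the source address and port placed in the PROXY line are chosen by the caller. It assumes the PROXY protocol v1 text format, which is a single line sent before the HTTP request.

```python
import socket


def build_request(host: str, path: str, proxy_line: bytes = b"") -> bytes:
    """HTTP/1.1 GET, optionally preceded by a PROXY protocol v1 line."""
    http = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    return proxy_line + http


def fetch(addr, payload: bytes, timeout: float = 5.0) -> bytes:
    """Send raw bytes to (ip, port) and return everything the server replies."""
    with socket.create_connection(addr, timeout=timeout) as sock:
        sock.sendall(payload)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def self_check(addr, host: str, path: str, src_ip: str, src_port: int = 40000) -> bytes:
    """Try plain HTTP first; retry with a PROXY v1 line if nothing came back."""
    try:
        reply = fetch(addr, build_request(host, path))
    except OSError:
        reply = b""
    if reply:
        return reply
    proxy_line = (
        f"PROXY TCP4 {src_ip} {addr[0]} {src_port} {addr[1]}\r\n".encode("ascii")
    )
    return fetch(addr, build_request(host, path, proxy_line))


# Example call (placeholder addresses, not run here):
# self_check(("203.0.113.10", 80), "my.website.com",
#            "/.well-known/acme-challenge/<token>", src_ip="10.244.0.5")
```

The empty first reply is the same symptom as the "Empty reply from server" curl error shown earlier in this thread; the retry then sends the PROXY line first, much like curl's --haproxy-protocol option does.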

kinolaev · 2019-07-19T07:00:38Z

Hello @MichaelOrtho,
yes, my PR can help by allowing the self-check request to be sent to the acme-solver service, bypassing the ingress controller.
But in this case it looks like a temporary solution. The actual problem (as already mentioned) is discussed in kubernetes/kubernetes#77523: traffic that targets the balancer but has not yet passed through it needs to be sent to the balancer. The difficulty is telling traffic from the balancer apart from traffic from pods and nodes, because there are many different types of balancers: for example, simple BGP/ARP-based ones (like metallb) and feature-rich ones that can do proxy protocol and SSL termination.

ximon18 · 2020-05-14T12:33:50Z

Hmm, my apologies, I appear to have posted in the wrong repository. I see now that I am NOT in the official certbot repo, for some reason I thought I was.

bots-business · 2020-05-14T12:36:02Z

so need reopen this issue?

bukowa · 2020-07-29T13:47:38Z

looks like kubernetes/enhancements#1392 is merged for 1.19

compumike · 2020-10-23T21:03:52Z

Hi all, I ran into the same issue. I've just published hairpin-proxy which works around the issue, specifically for cert-manager self-checks. https://github.com/compumike/hairpin-proxy

It uses CoreDNS rewriting to intercept traffic that would be heading toward the external load balancer. It then adds a PROXY line to requests originating from within the cluster. This allows cert-manager's self-check to pass.

It's able to do this all through DNS rewriting and spinning up a tiny HAProxy, so there's no need to wait for either kubernetes or cert-manager to fix this issue in their packages.
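The DNS half of this approach can be pictured with a toy Python model. This is only a sketch and not hairpin-proxy's implementation: the function name and the target service name are hypothetical. Names served by the ingress are answered with the in-cluster proxy, which then adds the PROXY line before forwarding; all other names resolve as usual.

```python
# Toy model of the DNS rewrite step; all names are hypothetical.
PROXY_SERVICE = "proxy-adder.example.svc.cluster.local"


def resolve_in_cluster(name: str, ingress_hosts: set) -> str:
    """Return the name an in-cluster client is pointed at for a lookup."""
    if name in ingress_hosts:
        # Hosts behind the ingress are redirected to the in-cluster proxy.
        return PROXY_SERVICE
    # Everything else is left untouched.
    return name


hosts = {"therealhost.example.com"}
print(resolve_in_cluster("therealhost.example.com", hosts))
print(resolve_in_cluster("other.example.org", hosts))
```

Clients outside the cluster are unaffected, since they never use the cluster's DNS.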

xavier-rodet · 2020-11-23T14:48:44Z

Thank you very much @compumike! It works great :)

kskalski · 2020-11-27T15:34:46Z

In my setup nginx-controller is the public facing loadbalancer, there isn't any other (why would I need it) and when it is configured with "use-proxy-protocol: true", then acme self-checks in cert-manager fail. Having to disable use-proxy-protocol every time I expect cert-manager to create or renew my certs is quite inconvenient and defeats the purpose of the tool.

Looks like in this setup (I also don't use kube-proxy, why would I, if I can configure all proxying in nginx) there is no recommended solution that would make it work?

KeksBeskvitovich · 2021-03-22T20:07:11Z

Hi all. Same issue, in DigitalOcean k8s and proxy-protocol.
Resolved, when i set annotation service.beta.kubernetes.io/do-loadbalancer-hostname in ingress controller.
https://github.com/digitalocean/digitalocean-cloud-controller-manager/blob/master/docs/controllers/services/annotations.md#servicebetakubernetesiodo-loadbalancer-hostname
After this, certificates issuing work perfectly with HTTP01 challenge!
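A small Python sketch of how that annotation could be applied with a merge patch. The Service name, namespace and hostname below are placeholders, not values from this thread, and it assumes lb.example.com has a DNS record pointing at the load balancer.

```python
import json


def hostname_patch(hostname: str) -> str:
    """JSON merge patch that sets the DigitalOcean load balancer hostname annotation."""
    patch = {
        "metadata": {
            "annotations": {
                "service.beta.kubernetes.io/do-loadbalancer-hostname": hostname
            }
        }
    }
    return json.dumps(patch)


# Placeholder Service name, namespace and hostname; substitute your own.
print(
    "kubectl -n ingress-nginx patch service ingress-nginx-controller "
    f"--type merge -p '{hostname_patch('lb.example.com')}'"
)
```

The printed command can be reviewed and then run by hand; the same annotation can equally be added to the Service manifest directly.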

Stone624 · 2021-04-04T09:25:20Z

@compumike This was the final step on 72 hour debugging journey to get my docker-compose app https accessible from Kubernetes ! THANK YOU !!!

davidpestana · 2021-04-07T15:23:05Z

> Hi all, I ran into the same issue. I've just published hairpin-proxy which works around the issue, specifically for cert-manager self-checks. https://github.com/compumike/hairpin-proxy
>
> It uses CoreDNS rewriting to intercept traffic that would be heading toward the external load balancer. It then adds a PROXY line to requests originating from within the cluster. This allows cert-manager's self-check to pass.
>
> It's able to do this all through DNS rewriting and spinning up a tiny HAProxy, so there's no need to wait for either kubernetes or cert-manager to fix this issue in their packages.

Thank you very much... this is really the only thing that solved the problem after hours of searching for solutions.

Miniland1333 · 2021-04-30T22:44:20Z

Just wanted to let everyone know that this is still something that does pop up with DigitalOcean loadbalancers with the proxy protocol. It looks like there is some upstream activity to hopefully fix the root cause for this: kubernetes/kubernetes#66607

namelessvoid · 2021-05-11T09:51:13Z

Note: Questions were resolved, please see edits below, thank you!

@Jyrno42 Thanks for providing the patched cert-manager images! The fix for the proxy protocol works well with my ingress-nginx in proxy protocol mode 👍

However, I found an issue you (or someone else :D) maybe can help me with. The pod hosting the HTTP01 challenge does not start since the docker image tag is not properly populated:

$ kubectl describe pod cm-acme-http-solver-hls7k
Name:         cm-acme-http-solver-hls7k
Namespace:    staging
Priority:     0
Node:         xxxxx
Start Time:   Tue, 11 May 2021 11:20:53 +0200
Labels:       acme.cert-manager.io/http-domain=xxxx
              acme.cert-manager.io/http-token=xxxx
              acme.cert-manager.io/http01-solver=true
Annotations:  cni.projectcalico.org/podIP: xxxx
              cni.projectcalico.org/podIPs: xxxx
              sidecar.istio.io/inject: false
Status:       Pending
IP:           xxxx
IPs:
  IP:           xxxx
Controlled By:  Challenge/backend-dashboard-tls-dvkd4-3440324594-2368510026
Containers:
  acmesolver:
    Container ID:
    Image:         quay.io/jetstack/cert-manager-acmesolver:{STABLE_DOCKER_TAG}
    Image ID:
...

Manually editing the tag to v1.3.1 fixes the issue but is a very inconvenient resolution :)

Do you know how to fix this? Maybe the auto-build needs some updates?

I'm using v1.3.0 of your patched images (e.g. jyrno42/cert-manager-controller:v1.3.0). Another question: Is there also a way to provide a patched version of cert-manager 1.3.1?

Thank you so much for your help and the effort you already put into fixing the problem at hand! 🎉

EDIT: So the problem with the {STABLE_DOCKER_TAG} somehow resolved by re-deploying cert-manager 🤷
EDIT 2: Jyrno42 fixed the build, so 1.3.1 is now also available as patched version. Thanks again!

namelessvoid · 2021-05-17T12:41:24Z

Follow up to my post above:

Looks like my issue with the acme http challenge pod having an invalid docker tag of {STABLE_DOCKER_TAG} was not resolved. Maybe it is related to the way the patched cert-manager images are built by Jyrno42's pipeline.

However, there is an easy workaround by specifying the acme http challenge image via command line argument: --acme-http01-solver-image=jyrno42/cert-manager-acmesolver:v1.3.1 (note: I think the original jetstack acmesolver should work, too).

If you use helm, you can supply this via helm values like this:

image:
  repository: jyrno42/cert-manager-controller
  tag: v1.3.1

extraArgs:
  - "--acme-http01-solver-image=jyrno42/cert-manager-acmesolver:v1.3.1"

MFrancesco · 2021-07-23T13:17:30Z

> Hi all, I ran into the same issue. I've just published hairpin-proxy which works around the issue, specifically for cert-manager self-checks. https://github.com/compumike/hairpin-proxy
>
> It uses CoreDNS rewriting to intercept traffic that would be heading toward the external load balancer. It then adds a PROXY line to requests originating from within the cluster. This allows cert-manager's self-check to pass.
>
> It's able to do this all through DNS rewriting and spinning up a tiny HAProxy, so there's no need to wait for either kubernetes or cert-manager to fix this issue in their packages.

This works and fixes the issue on Scaleway too.

sureshkachwa · 2021-10-19T08:06:58Z

The HTTP01 challenge for a wildcard certificate works, but is it recommended to use the HTTP01 challenge for validation?
With the HTTP01 challenge, we don't even prove to the CA (Let's Encrypt) that we own xxxxx.com.

wadexu007 · 2022-01-05T06:06:59Z

@Jyrno42 Hello, your latest image v1.5.0-beta.0 has some problem when container start

standard_init_linux.go:211: exec user process caused "no such file or directory"

wadexu007 · 2022-01-06T01:52:09Z

> > Hi all, I ran into the same issue. I've just published hairpin-proxy which works around the issue, specifically for cert-manager self-checks. https://github.com/compumike/hairpin-proxy
> > It uses CoreDNS rewriting to intercept traffic that would be heading toward the external load balancer. It then adds a PROXY line to requests originating from within the cluster. This allows cert-manager's self-check to pass.
> > It's able to do this all through DNS rewriting and spinning up a tiny HAProxy, so there's no need to wait for either kubernetes or cert-manager to fix this issue in their packages.
>
> This works and fix the issue on scaleway too

it works for me also, thanks

kquinsland · 2022-04-27T17:51:46Z

I am yet another "me too" for this issue on Digital Ocean... even with the latest 1.22 release:

but maybe this will be solved in 1.25

❯ kubectl version
<...>
Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.8", GitCommit:"7061dbbf75f9f82e8ab21f9be7e8ffcaae8e0d44", GitTreeState:"clean", BuildDate:"2022-03-16T14:04:34Z", GoVersion:"go1.16.15", Compiler:"gc", Platform:"linux/amd64"}

My symptoms are the same as many of you:

I0427 15:56:48.000286 1 ingress.go:99] cert-manager/challenges/http01/selfCheck/http01/ensureIngress "msg"="found one existing HTTP01 solver ingress" "dnsName"="my-domain.tld" "related_resource_kind"="Ingress" "related_resource_name"="cm-acme-http-solver-sz4bv" "related_resource_namespace"="default" "related_resource_version"="v1" "resource_kind"="Challenge" "resource_name"="serviceNameHere-jxk97-648906699-2238484802" "resource_namespace"="default" "resource_version"="v1" "type"="HTTP-01"
E0427 15:56:48.011217 1 sync.go:186] cert-manager/challenges "msg"="propagation check failed" "error"="failed to perform self check GET request 'http://my-domain.tld/.well-known/acme-challenge/BxoPaSKOqSedr3HvG47jQGKNOxFJBTNDTUfoeNnzjEg': Get \"http://my-domain.tld/.well-known/acme-challenge/BxoPaSKOqSedr3HvG47jQGKNOxFJBTNDTUfoeNnzjEg\": EOF" "dnsName"="my-domain.tld" "resource_kind"="Challenge" "resource_name"="serviceNameHere-jxk97-648906699-2238484802" "resource_namespace"="default" "resource_version"="v1" "type"="HTTP-01"

But when fetching the URL externally, things work:

❯ curl -vvv http:///my-domain.tld/.well-known/acme-challenge/BxoPaSKOqSedr3HvG47jQGKNOxFJBTNDTUfoeNnzjEg
<...>
> GET /.well-known/acme-challenge/BxoPaSKOqSedr3HvG47jQGKNOxFJBTNDTUfoeNnzjEg HTTP/1.1
> Host: /my-domain.tld
> User-Agent: curl/7.82.0
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Date: Wed, 27 Apr 2022 15:45:27 GMT
< Content-Type: text/plain; charset=utf-8
< Content-Length: 87
< Connection: keep-alive
< Cache-Control: no-cache, no-store, must-revalidate
< 
* Connection #0 to host /my-domain.tld left intact
BxoPaSKOqSedr3HvG47jQGKNOxFJBTNDTUfoeNnzjEg.sw-Xjc-9d_IujG1ZEJT7Ol--ZOFyfKlHemo2LYZhDn0%

I was able to move past this by:

1. Adding the service.beta.kubernetes.io/do-loadbalancer-hostname annotation to the Service created by the nginx-ingress manifest
2. Re-applying the manifest
3. Deleting the cert: kubectl delete certificates ServiceNameHere-cert

A few seconds later, cmctl status certificate guestbook-cert showed the cert as issued.

c4tz · 2022-10-26T12:41:34Z

Just for people coming here from Google (like me):

@KeksBeskvitovich's solution also works for Hetzner Cloud by setting these annotations :

load-balancer.hetzner.cloud/uses-proxyprotocol: "true"
load-balancer.hetzner.cloud/hostname: lb-subdomain.example.com

and this value:

controller:
  config:
    use-proxy-protocol: true

for ingress-nginx. Afterwards, you will have both the source IP and a working cert-manager!

aemakeye · 2023-01-12T15:09:16Z

what if i have multiple domains pointing to the same LB address?

Stringls · 2023-08-30T13:40:22Z

Any updates on it?
I'm having the same problem

hegerdes · 2023-11-20T19:38:36Z

> Just for people coming here from Google (like me):
>
> @KeksBeskvitovich's solution also works for Hetzner Cloud by setting these annotations :
> load-balancer.hetzner.cloud/uses-proxyprotocol: "true"
> load-balancer.hetzner.cloud/hostname: lb-subdomain.example.com

Thanks, also facing this issue on Hetzner. The fix seems to work, but I have multiple domains pointing to that LB. It is also not a scalable solution, since the ingress needs to know the domain (all domains) before routes are applied, and it is reasonable to assume that these resources are not handled by the same person. Ingress routes may even be applied by CI.

Is there any other way to stop kube-proxy from altering the DNS resolution so the external IP is used?

abhijith-b · 2023-12-08T18:13:33Z

> > Just for people coming here from Google (like me):
> > @KeksBeskvitovich's solution also works for Hetzner Cloud by setting these annotations :
> > load-balancer.hetzner.cloud/uses-proxyprotocol: "true"
> > load-balancer.hetzner.cloud/hostname: lb-subdomain.example.com
>
> Thanks, also facing this issue on hetzner. Fix seems to work but I have multiple domains pointing to that LB. It is also not a scalabe solution since the ingress needs to know the domain (all domains) before routes are applied and it is reasanoble to assume that these resources are not handled by the same person. Ingress routes may even be applied by CI.
>
> Is there any other way to stop the kube-proxy to alter the DNS resolution so the external IP is used?

https://github.com/compumike/hairpin-proxy this worked for me.

frankforpresident · 2024-03-28T09:17:11Z

> Hi all, I ran into the same issue. I've just published hairpin-proxy which works around the issue, specifically for cert-manager self-checks. https://github.com/compumike/hairpin-proxy
>
> It uses CoreDNS rewriting to intercept traffic that would be heading toward the external load balancer. It then adds a PROXY line to requests originating from within the cluster. This allows cert-manager's self-check to pass.
>
> It's able to do this all through DNS rewriting and spinning up a tiny HAProxy, so there's no need to wait for either kubernetes or cert-manager to fix this issue in their packages.

This fixed it for me. I did need to change the target server to nginx-ingress-ingress-nginx-controller.default.svc.cluster.local, but it works like a charm.

@compumike
Thank you very much!

jetstack-bot added the kind/bug Categorizes issue or PR as related to a bug. label Apr 13, 2018

bbetter173 closed this as completed Apr 14, 2018

bbetter173 reopened this May 2, 2019

whereisaaron mentioned this issue May 7, 2019

Why kube-proxy add external-lb's address to node local iptables rule? kubernetes/kubernetes#66607

Closed

meyskens mentioned this issue Oct 8, 2020

Propagation check failed #3341

Closed

chrissound mentioned this issue Nov 26, 2020

Waiting for http-01 challenge propagation: failed to perform self check GET request #3238

Closed

zuzzas mentioned this issue Feb 10, 2022

cert-manager does not work with Ingress Nginx's proxy-protocol mode deckhouse/deckhouse#841

Closed

2 tasks

kquinsland mentioned this issue Apr 27, 2022

Failed to perform self check GET request #4941

Closed

mysticaltech mentioned this issue Oct 19, 2022

Issue with Certificates HTTP-01 LE challenge with ingress-nginx kube-hetzner/terraform-hcloud-kube-hetzner#354

Closed

splattner mentioned this issue Jan 27, 2023

Try to fix proxy-protocol and http01 challenge acend/infrastructure#38

Closed

megian mentioned this issue Aug 20, 2024

Exposed load balancer IP causes cluster internal traffic fail together with the proxy protocol cloudscale-ch/cloudscale-cloud-controller-manager#15

Closed
