Let's Encrypt via DNS-01 has a race condition with "Pending" responses #6185

Open
tgharold opened this issue Jan 4, 2025 · 2 comments

tgharold commented Jan 4, 2025

Tested with acme.sh VER=3.1.0 on a pfSense firewall.

Steps to reproduce

  • Use the DNS-01 validation method.
  • Have an API key with your DNS provider (e.g. DNSMadeEasy).
  • Verify that the API key works and that the TXT records are being created.
  • Do not specify a --dnssleep value (rely on the built-in DNS check loop instead).
  • Attempt to renew the cert during a busy period for the Let's Encrypt CA servers.

Observe the following in the logs:

[Fri Jan  3 13:13:39 EST 2025] Using CA: https://acme-v02.api.letsencrypt.org/directory
[Fri Jan  3 13:13:39 EST 2025] Registering account: https://acme-v02.api.letsencrypt.org/directory
[Fri Jan  3 13:13:40 EST 2025] Already registered
[Fri Jan  3 13:13:40 EST 2025] ACCOUNT_THUMBPRINT='OnC_mW9gqk6qbqBSoAJczXRx7HmBCsr2dA6wAxAhFUI'
[Fri Jan  3 13:13:40 EST 2025] Using pre-generated key: /tmp/acme/LetsEncrypt2023PfSense/pf.home.example.com/pf.home.example.com.key.next
[Fri Jan  3 13:13:40 EST 2025] Generating next pre-generate key.
[Fri Jan  3 13:13:41 EST 2025] Single domain='pf.home.example.com'
[Fri Jan  3 13:13:41 EST 2025] Getting webroot for domain='pf.home.example.com'
[Fri Jan  3 13:13:41 EST 2025] Adding TXT value: sCyKt4QV6_COlhP8AmVlpukGIGGcWU0Zz6n0Ym6w2p0 for domain: _acme-challenge.pf.home.example.com
[Fri Jan  3 13:13:42 EST 2025] Adding record
[Fri Jan  3 13:13:43 EST 2025] Added
[Fri Jan  3 13:13:43 EST 2025] The TXT record has been successfully added.
[Fri Jan  3 13:13:43 EST 2025] Let's check each DNS record now. Sleeping for 20 seconds first.
[Fri Jan  3 13:14:03 EST 2025] You can use '--dnssleep' to disable public dns checks.
[Fri Jan  3 13:14:03 EST 2025] See: https://github.com/acmesh-official/acme.sh/wiki/dnscheck
[Fri Jan  3 13:14:03 EST 2025] Checking pf.home.example.com for _acme-challenge.pf.home.example.com
[Fri Jan  3 13:14:04 EST 2025] Success for domain pf.home.example.com '_acme-challenge.pf.home.example.com'.
[Fri Jan  3 13:14:04 EST 2025] All checks succeeded
[Fri Jan  3 13:14:04 EST 2025] Verifying: pf.home.example.com
[Fri Jan  3 13:14:04 EST 2025] Pending. The CA is processing your order, please wait. (1/30)
[Fri Jan  3 13:14:06 EST 2025] Removing DNS records.
[Fri Jan  3 13:14:06 EST 2025] Removing txt: sCyKt4QV6_COlhP8AmVlpukGIGGcWU0Zz6n0Ym6w2p0 for domain: _acme-challenge.pf.home.example.com
[Fri Jan  3 13:14:08 EST 2025] Successfully removed
[Fri Jan  3 13:14:06 EST 2025] pf.home.example.com: Invalid status. Verification error details: Incorrect TXT record 
[Fri Jan  3 13:14:08 EST 2025] Please check log file for more details: /tmp/acme/LetsEncrypt2023PfSense/acme_issuecert.log

Notice that the TXT record is successfully created and deleted at the DNS provider. The root cause is that the script asks the CA to validate before the new DNS record has had time to propagate to the DNS responses seen by the Let's Encrypt servers. Unlike HTTP-based checks, DNS checks are "eventually consistent", so the script needs to retry a few times in case the change has not yet propagated.

The response that triggers the "Invalid status." error is an HTTP 400. Instead of waiting and retrying, the script is kicked out of the loop that should be re-checking.

Workaround

Set --dnssleep to a value large enough (60 to 180 seconds) that the newly created DNS record is visible to the Let's Encrypt servers before validation starts.
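
As a rough illustration of the workaround, the command line might look like the sketch below. The `dns_me` hook name is my assumption for the DNSMadeEasy provider, the 120-second value is simply a point inside the 60 to 180 range above, and the API credentials are assumed to be configured already. On pfSense the same setting would normally be made through the ACME package rather than by hand.

```shell
#!/bin/sh
# Workaround sketch: wait a fixed time after the TXT record is created
# instead of relying on the built-in public DNS check.
DOMAIN="pf.home.example.com"   # domain taken from the log above
DNS_HOOK="dns_me"              # assumption: hook name for DNSMadeEasy
DNSSLEEP=120                   # seconds; inside the 60 to 180 range

# Dry run: build and print the command for review.
# To use it, run the printed line in a shell where acme.sh is installed.
CMD="acme.sh --issue --dns $DNS_HOOK -d $DOMAIN --dnssleep $DNSSLEEP"
echo "$CMD"
```

A longer sleep makes each issuance slower but gives slow or cached resolvers more time to see the record.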

Proper Fix

In DNS-01 mode, the if block that kicks us out of the loop should only trigger once we have exceeded MAX_RETRY_TIMES. Until then, the script should sleep and try again, because the DNS record may not yet be visible to the Let's Encrypt servers.
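
The retry behavior proposed above could be sketched roughly as follows. This is not acme.sh's actual code: `check_status`, `verify_with_retry`, and `RETRY_SLEEP` are hypothetical names, and the status check is a stub that fails twice before succeeding. It also assumes the CA allows the same challenge to be re-checked after a failed response, which this thread does not establish.

```shell
#!/bin/sh
MAX_RETRY_TIMES=30   # name from the issue; 30 matches the "(1/30)" in the log
RETRY_SLEEP=0        # hypothetical; 0 so the sketch runs instantly, use a few seconds in practice

# Hypothetical stand-in for the CA status request:
# reports "invalid" on attempts 1 and 2, then "valid".
check_status() {
  if [ "$1" -lt 3 ]; then
    echo "invalid"
  else
    echo "valid"
  fi
}

verify_with_retry() {
  attempt=1
  while [ "$attempt" -le "$MAX_RETRY_TIMES" ]; do
    status=$(check_status "$attempt")
    if [ "$status" = "valid" ]; then
      echo "valid after $attempt attempts"
      return 0
    fi
    # DNS-01: treat a failed check as "not propagated yet" and retry
    # instead of leaving the loop on the first failure.
    sleep "$RETRY_SLEEP"
    attempt=$((attempt + 1))
  done
  echo "giving up after $MAX_RETRY_TIMES attempts"
  return 1
}

RESULT=$(verify_with_retry)
echo "$RESULT"
```

The point of the sketch is only the control flow: failure exits the loop when the retry budget is exhausted, not on the first bad response.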


github-actions bot commented Jan 4, 2025

Please upgrade to the latest code and try again first. Maybe it's already fixed: acme.sh --upgrade. If it's still not working, please provide the log with --debug 2; otherwise, nobody can help you.

Member

Neilpang commented Jan 4, 2025

Actually, using --dnssleep with a value long enough to cover the TTL of your DNS provider is the only reliable way for now.

It's intentionally not clear how the CA checks the DNS records, so the DNS server the CA queries may be serving a cached response.
