Consul service deregistration gets "ACL not found" w/ Workload Identities #23494
Labels
hcc/jira
stage/accepted
Confirmed, and intend to work on. No timeline commitment though.
stage/duplicate
theme/consul
theme/workload-identity
type/bug
Product Version
OS : CentOS Linux release 7.9.2009
Nomad Version : v1.7.3
Consul Version: v1.17.3
Issue Description
We are getting an "Unexpected response code: 403 (ACL not found)" error in the Nomad logs. As a result, services are not registered or deregistered in Consul until the Nomad service is restarted.
Our Nomad cluster runs more than 100 jobs, and the issue appears intermittently on one or two jobs at a time. A different job is affected each time.
We are using the JWT mechanism outlined in the "Consul ACL with Nomad Workload Identities" guide on HashiCorp Developer to authenticate Nomad workloads against Consul.
The "ACL not found" issue is observed only in our Test and UAT environments, where we have enabled Workload Identities (WI) to integrate with Consul.
We inspected the token used for the service check in /opt/consul/checks (under the Consul data dir) and found that the token stored with the check no longer exists in Consul.
The issue is resolved only by restarting the Nomad service.
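For anyone who wants to repeat the token check described above, here is a rough sketch of how it could be scripted. It rests on assumptions that are not confirmed in this issue: that each file under /opt/consul/checks is a JSON object with a top-level "Token" field, that the local Consul agent listens on http://127.0.0.1:8500, and that a 403 from `GET /v1/acl/token/self` means the token is no longer known to Consul. The function names are hypothetical helpers, not part of Nomad or Consul.

```python
import json
import urllib.error
import urllib.request
from pathlib import Path


def load_check_tokens(checks_dir):
    """Map each persisted check file name to the ACL token stored in it.

    Assumption: every file is a JSON object with a top-level "Token" field.
    Files that are unreadable, not JSON, or have no token are skipped.
    """
    tokens = {}
    for path in sorted(Path(checks_dir).iterdir()):
        if not path.is_file():
            continue
        try:
            data = json.loads(path.read_text())
        except (OSError, ValueError):
            continue  # unreadable or non-JSON file
        if not isinstance(data, dict):
            continue
        token = data.get("Token")
        if token:
            tokens[path.name] = token
    return tokens


def token_exists(consul_addr, token, timeout=5.0):
    """Return True if Consul resolves the token, False on a 403 response.

    Assumption: a 403 from /v1/acl/token/self means "ACL not found".
    Any other HTTP error is re-raised for the caller to inspect.
    """
    req = urllib.request.Request(
        f"{consul_addr}/v1/acl/token/self",
        headers={"X-Consul-Token": token},
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout):
            return True
    except urllib.error.HTTPError as err:
        if err.code == 403:
            return False
        raise


def find_stale_checks(checks_dir, consul_addr):
    """List check files whose stored token Consul no longer recognizes."""
    tokens = load_check_tokens(checks_dir)
    return [name for name, tok in tokens.items()
            if not token_exists(consul_addr, tok)]
```

On an affected client, `find_stale_checks("/opt/consul/checks", "http://127.0.0.1:8500")` would return the names of the check files that still carry a deleted token. Reading the data dir typically requires the same privileges as the Consul agent.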
Reproduction Steps
We tried to reproduce the issue in a local dev environment, but service registration and deregistration work as expected there.
Nomad server and client config snippet
Nomad client:
Nomad Server:
Nomad log snippet: