Adding support for server to authenticate agent #51
k8s-ci-robot merged 1 commit into kubernetes-sigs:master
Conversation
Force-pushed from 26cf0c1 to 5762da5
/test pull-apiserver-network-proxy-test
/assign @caesarxuchao @Jefftree |
cmd/agent/main.go
Outdated
```go
	return fmt.Errorf("proxy server port %d must be greater than 0", o.proxyServerPort)
}
if o.saToken != "" {
	if _, err := os.Stat(o.saToken); os.IsNotExist(err) {
```
We shouldn't ignore other types of errors.
We do exactly the same validation for all the other files (agentCert, agentKey, etc.). I would prefer to keep it as is; we can refactor the entire project's validation methods in a separate PR.
```go
	}
```

```go
if !r.Status.Authenticated {
	return fmt.Errorf("lookup failed: service account jwt not valid")
```
```sh
CLUSTER_KEY=/etc/srv/kubernetes/pki/apiserver.key
```

```sh
# Register SERVER_TOKEN in [static-token-file](https://kubernetes.io/docs/reference/access-authn-authz/authentication/#static-token-file)
```
Is "static-token" the standard way to authenticate a process running in the master node?
I think we can run the proxy server as a static pod, and then use a service account to authenticate it.
Yes, this pattern is used across all other static pods. Ex: https://github.com/kubernetes/kubernetes/blob/c14106ad1234742da80eb8f12ddcbf19dba61284/cluster/gce/gci/configure-helper.sh#L613-L615
caesarxuchao
left a comment
A few more nits.
@Jefftree @dberkov have you manually tested it in GCE/GKE?
@dberkov I understand that it's difficult to write a complete test because we don't have the test framework that runs a k8s cluster. Can you add an integration test to tests/ to verify that the proxy-server denies the Connect request if the agent doesn't send a bearer token at all? That doesn't require a k8s cluster running.
@dberkov @caesarxuchao: Since network proxy promises mTLS, can we enforce that either a cert or token must be sent by the proxy agent? If no token is sent, the konnectivity-server will reject the request and unregister the connection, but the konnectivity-agent pod will be in an infinite crash loop, repeatedly trying to connect to the server with no token supplied.
I have created pkg/agent/agentserver/server_test.go with full test coverage of the new server-side behavior.
/test pull-apiserver-network-proxy-test

2 similar comments
cmd/agent/main.go
Outdated
```go
}
if o.agentCert == "" && o.agentKey == "" && o.serviceAccountTokenPath == "" {
	return fmt.Errorf("agent must enable certificate based or token based authentication")
```
I believe this should be enforced from the server side and not the client side. There are legitimate non production use cases for turning off client authentication.
In addition, if we allow this through, I think we don't need the mock agent, as the real agent can do what is necessary.
We may want to add an enum setting to proxy/main.go at this point for requiredAgentAuth. That way we can detect whether the agent meets the minimum authentication requirements.
I brought this up because an incorrectly configured konnectivity-agent would send a large number of denied requests to the konnectivity-server (agent retries + multiple threads hitting the LB for regional clusters + CrashLoop retries).
Since there are use cases for turning off client authentication, removing this check is fine, but we should still think of ways to limit the outgoing requests of an incorrectly configured agent.
I removed this validation
/assign @mikedanese
@cheftako: GitHub didn't allow me to assign the following users: mikedanese. Note that only kubernetes-sigs members, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
caesarxuchao
left a comment
The gomock is cool. I wish I knew it earlier.
```go
}
if o.authenticationAudience == "" {
	return fmt.Errorf("authenticationAudience cannot be empty when agent authentication is enabled")
}
```
Also check if o.kubeconfigPath==nil?
It cannot be nil, since we have newProxyRunOptions().
```yaml
sources:
- serviceAccountToken:
    path: konnectivity-agent-token
    audience: system:konnectivity-server
```
I guess this "audience" value gets encoded into the token?
KubernetesClient.AuthenticationV1().TokenReviews() takes the audience as a parameter, so the k8s API validates that the token was issued with this audience.
But konnectivity-server calls TokenReviews, and this is the yaml for the konnectivity-agent.
My guess is the token data mounted by the agent will contain the "audience", and apiserver will be able to extract the "audience" out from the token.
Force-pushed from 7408e7e to c1c62a6
/lgtm
```go
serverCount uint
// Agent pod's namespace for token-based agent authentication
agentNamespace string
// Agent pod's service account for token-based agent authentication
```
Will we ever want different service accounts for different agents? Eg. agent service account per failure domain?
```go
// all 4 parameters must be empty or must have a value (except kubeconfigPath, which may be empty)
if o.agentNamespace != "" || o.agentServiceAccount != "" || o.authenticationAudience != "" || o.kubeconfigPath != "" {
	if o.agentNamespace == "" {
		return fmt.Errorf("agentNamespace cannot be empty when agent authentication is enabled")
```
For the future we should consider accumulating these errors. It's sort of annoying to be given an error that I need a service account on run 1 and then an error that I need an audience on run 2.
cheftako
left a comment
Please fix the GKE reference.
/lgtm

/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: cheftako, dberkov. The full list of commands accepted by this bot can be found here. The pull request process is described here.

Approvers can indicate their approval by writing
The PR allows the proxy-server to authenticate the proxy-agent.

Agent:
The agent sends, in gRPC metadata, the token associated with its pod by the Kubernetes system.

Server:
As part of the connection-opening step, the server reads the token sent by the agent, invokes the kubernetes.TokenReviews API, and checks that the token is valid and belongs to the agent's pod by validating the namespace + service account of the token's owner.

General:
All examples/kubernetes/* templates and the README.md procedure have been updated to support a fully working e2e test of this feature in kubernetes.