K8s server and agent nodeattestor plugin #557
Conversation
Force-pushed from 3c385ed to fabc04a
doc/plugin_agent_nodeattestor_k8s.md
Outdated
to access the Kubernetes API Server and submit a Certificate Signing Request (CSR).
The credentials consist of a private key and a certificate stored on disks.
It also needs a root certificate to validate the TLS certificate presented
ny the Kubernetes API Server.
by the Kubernetes...
Fixed
doc/plugin_agent_nodeattestor_k8s.md
Outdated
In order to retrieve the identity document, the plugin needs user credentials
to access the Kubernetes API Server and submit a Certificate Signing Request (CSR).
The credentials consist of a private key and a certificate stored on disks.
stored on disk.
Fixed
func (p *K8sPlugin) setConfig(c *K8sConfig) {
	p.m.Lock()
	defer p.m.Unlock()
	p.c = c
this should reset p.kubeClient so that it can be recreated with a potentially changed config
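A minimal sketch of that reset, reusing the field names from the snippet above (p.kubeClient assumed to be the cached client):

```
func (p *K8sPlugin) setConfig(c *K8sConfig) {
	p.m.Lock()
	defer p.m.Unlock()
	p.c = c
	// Drop the cached client so the next attestation rebuilds it
	// with the (potentially changed) configuration.
	p.kubeClient = nil
}
```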
done
return nil, errors.New("k8s: not configured")
}

if p.kubeClient == nil {
This is racy in the face of more than one attestation happening at a time. I would suggest moving the client initialization into the Configure call. That does two things: 1) it fails the Configure call if bad paths are provided, and 2) it lets you provide the client as part of the configuration, which is retrieved under a lock (i.e. getConfig).
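For illustration, a rough sketch of that shape, using a plain kubeconfig-based clientset; the types and helper wiring here are simplified stand-ins, not the plugin's actual interfaces:

```
import (
	"fmt"
	"sync"

	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

// Illustrative types only; the real plugin has its own config struct and plugin RPCs.
type k8sConfig struct {
	kubeClient kubernetes.Interface
}

type k8sPlugin struct {
	m sync.Mutex
	c *k8sConfig
}

// Building the client during Configure surfaces bad paths immediately and lets
// attestation retrieve a ready-made client under the lock via getConfig.
func (p *k8sPlugin) Configure(kubeconfigPath string) error {
	restConfig, err := clientcmd.BuildConfigFromFlags("", kubeconfigPath)
	if err != nil {
		return fmt.Errorf("k8s: unable to load kubeconfig: %v", err)
	}
	client, err := kubernetes.NewForConfig(restConfig)
	if err != nil {
		return fmt.Errorf("k8s: unable to create client: %v", err)
	}
	p.setConfig(&k8sConfig{kubeClient: client})
	return nil
}

func (p *k8sPlugin) setConfig(c *k8sConfig) {
	p.m.Lock()
	defer p.m.Unlock()
	p.c = c
}

func (p *k8sPlugin) getConfig() *k8sConfig {
	p.m.Lock()
	defer p.m.Unlock()
	return p.c
}
```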
done
Thanks for this work @enricoschiattarella!! Left a few comments, curious to hear your thoughts on them
doc/plugin_agent_nodeattestor_k8s.md
Outdated
and is in the form:

```
spiffe://<trust domain>/spire/agent/k8s/system:node:<host name>
```
The `:` character is defined as reserved in RFC 3986. SPIFFE IDs must conform with this RFC. It states that if reserved characters are used outside of their special meaning(s), then they must be percent-encoded.
I recommend simply avoiding the use of `:` altogether.
Right. I liked the idea of carrying-over opaquely from Kubernetes RBAC encoding but I 100% agree that this is looking for trouble. If it looks ok, I will change it to "/spire/agent/k8s/system/node/hostname".
The reason to maintain the "system/node" part is that if we ever use Kubernetes to attest anything else using the certs api, there will be no conflicts.
doc/plugin_agent_nodeattestor_k8s.md
Outdated
| `k8s_private_key_path` | The path to the private key on disk (PEM encoded PKCS1 or PKCS8) used to authenticate the agent to the Kubernetes API Server | |
| `k8s_certificate_path` | The path to the certificate bundle on disk. Used to authenticate the agent to the Kubernetes API Server | |
| `k8s_ca_certificate_path` | The root certificate used to validate the certificate presented by the Kubernetes API Server | |
| `kubeconfig_path` | Optional. The path to the kubeconfig file containing Kubernetes cluster access information | |
Why is this optional, and what do I get if I specify it?
There is a "standard" sequence to determine the location kubeconfig, if not provided explicitly.
First try env variable KUBECONFIG, then try home directory, then return error.
It is the sequence followed by kubectl.
I will add an explanation.
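For reference, the lookup order described above amounts to something like this (the helper name is illustrative):

```
import (
	"errors"
	"os"
	"path"
)

// resolveKubeconfigPath shows the kubectl-style lookup order:
// explicit config, then $KUBECONFIG, then $HOME/.kube/config.
func resolveKubeconfigPath(configuredPath string) (string, error) {
	if configuredPath != "" {
		return configuredPath, nil
	}
	if envPath := os.Getenv("KUBECONFIG"); envPath != "" {
		return envPath, nil
	}
	if home := os.Getenv("HOME"); home != "" {
		return path.Join(home, ".kube", "config"), nil
	}
	return "", errors.New("k8s: unable to locate kubeconfig")
}
```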
func getAgentName() string {
	name, err := os.Hostname()
	if err != nil {
		name = "unknown"
There is an assumption in SPIRE that all agents can be uniquely addressed... as such, I think it would be best to raise an error here instead of setting "unknown", since it seems that this name is ultimately used in the agent's SPIFFE ID? It also brings up questions about hosts that might have the same hostname. Curious to hear your thoughts on this in the context of a Kubernetes cluster.
More info here #558
In a Kubernetes environment we can rely on each node (and pod) having a unique IP address because essentially NAT is prohibited, so if the trust domain coincides with the Kubernetes cluster, we could use the IP address.
https://kubernetes.io/docs/concepts/cluster-administration/networking/
If the trust domain includes multiple Kubernetes clusters, then I think the best we can do is to include a fingerprint of the CA certificate associated to the kubernetes cluster.
Something like "spire/agent/k8s/cluster/XXXX/system:node:".
As a last resort, we can completely forget about the hostname and use a fingerprint of the cert, same as the x509pop attestation does.
I'm not sure what the current thinking is on the question you referred to.
Agent naming uniqueness is a nice property...
Human-readable SVIDs are nice too...
I will be happy to implement whatever you guys believe is the best overall solution.
We discussed this topic on the SIG-Kubernetes call today. We concluded that it is probably OK for the resulting ID to be less human friendly than it could be. It is likely that a SPIRE server will manage multiple kubernetes clusters, so including the thumbprint of the CA seems like a good idea. The proposed path including thumbprint and node name is 👍.
Server side node attestors can optionally kick back selectors which can be used to group agents together. In this case, it would be awesome if we provide a selector for CA thumbprint, e.g. `k8s:ca_thumbprint:XXXXX`. Doing so would allow users to create SPIFFE IDs which address entire clusters, allowing them to register workloads to run in one cluster or another.
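To make the thumbprint idea concrete, a sketch of how the agent ID and selector could be derived; the SHA-256 choice and the colon-free path layout (per the earlier RFC 3986 comment) are assumptions, not settled decisions:

```
import (
	"crypto/sha256"
	"crypto/x509"
	"fmt"
	"net/url"
)

// agentID scopes the node name to a cluster, identified by a fingerprint of
// the cluster CA certificate, so multiple clusters in one trust domain cannot collide.
func agentID(trustDomain, nodeName string, caCert *x509.Certificate) string {
	thumbprint := fmt.Sprintf("%x", sha256.Sum256(caCert.Raw))
	u := url.URL{
		Scheme: "spiffe",
		Host:   trustDomain,
		Path:   fmt.Sprintf("/spire/agent/k8s/cluster/%s/system/node/%s", thumbprint, nodeName),
	}
	return u.String()
}

// caThumbprintSelector is the cluster-wide selector mentioned above.
func caThumbprintSelector(caCert *x509.Certificate) string {
	return fmt.Sprintf("k8s:ca_thumbprint:%x", sha256.Sum256(caCert.Raw))
}
```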
// Still no luck, try default (home)
home := os.Getenv("HOME")
if home != "" {
	kubeConfigFilePath = path.Join(home, ".kube", "config")
It would be good if we can document this search behavior
Will do
return nil, fmt.Errorf("Error creating private key: %v", err)
}

certsIntf := kubeClient.CertificatesV1beta1().CertificateSigningRequests()
I noticed the `beta1` suffix here - how stable is this API? Do we expect it to be the same a year or more from now?
The feature that depends on this API is going GA in the next release, so in the long term the mechanics of node attestation should be stable.
The API itself will not graduate yet, though, so minor changes are possible.
This is the issue for tracking and discussion:
}

certsIntf := kubeClient.CertificatesV1beta1().CertificateSigningRequests()
cert, err := csr.RequestNodeCertificate(certsIntf, key, types.NodeName(getAgentName()))
Does the CSR signing process in kubernetes do anything to assert the correct value in NodeName? In other words, could I get kubernetes to sign a CSR with the name of a different node here?
Right now it does not. It just checks that CN conforms to the expected format and key usage is what is expected. There are plans to strengthen this and support proof-of-ownership, TPMs, etc.
Options for improving over the baseline are:
- Disable automatic CSR approval, rely on Kubernetes admin to verify the content of the CSR manually before approving it
- Create a custom CSR signer that will perform extra checks/challenges
- K8s attestor in SPIRE server checks the IP address of the agent connection and compares it against what is in the Kubernetes certificate
I think all options are doable but IMO, unless this is a showstopper, the best course of action is to live with the limitation for the time being and leverage the improvements of the Kubernetes node attestation mechanism as they become available.
We discussed this topic on the SIG-Kubernetes call today. The biggest drawback here is that we are essentially asserting the ability of the caller to get a CSR approved, and not asserting anything about the actual identity of the caller. In certain situations, this can result in a non-agent workload being able to obtain the identities of all other workloads in the cluster.
To mitigate this, we should try to assert that the caller is in fact an agent. We discussed using a service account token to do this, configuring the server node attestor with the namespace and service account that the agent runs as. This token can either 1) be used as credentials to submit the CSR, and verified by the server, or 2) passed directly to the server.
Taking this approach gives stronger assurance that the caller we are attesting is an agent, however it still allows for a dishonest agent (or anything that can run as the named service account) to claim that it is a different node/agent. To mitigate this, we could use the kubelet identity to submit the CSR rather than the service account. The server attestor can then verify that the kubelet identity which submitted the CSR matches the SPIFFE ID we are about to issue. IMO, this is the ideal approach, and provides the strongest level of attestation of all the options. @enricoschiattarella has mentioned that obtaining the kubelet identity may be non-trivial, but perhaps it warrants some further research.
To avoid the non-agents-could-obtain-arbitrary-identities problem, we should either add the service account token logic, or use the kubelet identity directly for CSR submission, with a preference for the latter. If we choose the service account path, it probably warrants a direct call out in the plugin documentation (perhaps in the form of a Security Considerations section?). SPIRE is known for its ability to strongly attest and issue identity, but some sacrifices are made in SA token approach (namely, that we are not actually asserting the identity of a node, rather we are asserting that the caller lives in a particular service account).
Some related notes:
- Using SA token for CSR submission instead of sending it directly to the server avoids the need to pass a SA token to a system "outside" of the kubernetes control plane
- @mattmoyer informed us that the identity used to submit the CSR is included in the signed certificate - this could be directly extracted by the server attestor for validation, as opposed to having to hit the api server to interrogate the identity of the submitter
- Node identity may be embedded in SA tokens in the future; it is unclear whether the node identity will 1) be included in resulting signed certificates, or 2) be discoverable through interrogation of the API server (looking at the CSR approval machinery). If neither 1 nor 2, we would end up having to pass the token directly to SPIRE anyway.
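For a sense of what the service-account-token route looks like on the agent side, a minimal sketch of reading the token; the default in-cluster mount path is standard Kubernetes, but the helper name and the configurable token path wiring are illustrative:

```
import (
	"fmt"
	"io/ioutil"
	"strings"
)

// defaultTokenPath is where Kubernetes mounts the service account token in a pod.
const defaultTokenPath = "/var/run/secrets/kubernetes.io/serviceaccount/token"

// loadServiceAccountToken reads the token the agent would hand to the server
// as attestation data.
func loadServiceAccountToken(tokenPath string) (string, error) {
	if tokenPath == "" {
		tokenPath = defaultTokenPath
	}
	data, err := ioutil.ReadFile(tokenPath)
	if err != nil {
		return "", fmt.Errorf("unable to read token from %v: %v", tokenPath, err)
	}
	return strings.TrimSpace(string(data)), nil
}
```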
func (p *K8sPlugin) loadConfigData() (*configData, error) {
	config := p.getConfig()
	if config == nil {
		return nil, errors.New("k8s: not configured")
nit: might be a good idea to be more specific with these errors, like `k8s node attestor: not configured` or something
will do
return newError("not configured")
}

if dataType := req.AttestationData.Type; dataType != pluginName {
I think we need to ensure that `req.AttestationData` is not nil?
will add the check
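For illustration, the guard could be a fragment along these lines, using the newError helper seen above:

```
if req.AttestationData == nil {
	// reject requests that omit attestation data entirely
	return newError("request missing attestation data")
}
```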
}

attestationData := new(x509pop.AttestationData)
if err := json.Unmarshal(req.AttestationData.Data, attestationData); err != nil {
Same comment here - what if `req.AttestationData` is nil?
will fix
}

// receive and validate the challenge response
responseReq, err := stream.Recv()
Hmmm... I am curious if this can be abused by a client that never sends the challenge response... or triggered by clients which crash before being able to do so? Perhaps this `Recv` should be guarded with a select block on context cancellation?
@azdagron does the streaming shim have a default timeout on the context here? Is this a problem elsewhere?
The `Recv()` call should return with an error in the case of context cancellation. You bring up a good point though. We should probably define timeouts for all plugin related calls. I don't think there is anything specific to do in the plugin code.
Force-pushed from ecc354e to 8b9b152
After discussion, we came to consensus that for now using the service account token was good enough and likely to get better if more identity components end up included in the service account token (like node name, for example). I've rebased this on top of current master and pushed a commit that transitions the code to use the service account token instead.
The `k8s_sat` plugin attests nodes running in inside of Kubernetes. The server
validates the signed service account token provided by the agent. It extracts
the service account name and namespace from the token claims. The server uses a
one-time UUID provided by the agent for generate a SPIFFE ID with the form:
"by the agent to generate"
👍
The SPIFFE ID with the form:

```
spiffe://<trust domain>/spire/agent/k8s_sat/<UUID>
```
should cluster name be included here?
Will do.
In addition, this plugin generates the following selectors of type `k8s_sat` for each node:

| Value | Example | Description |
The notion that selectors are typed is something that is not necessarily exposed to users. For instance, the registration CLI takes `-selector k8s_sat:cluster-name:MyCluster`. In the interest of simplicity, I'd lean towards s/Value/Selector/ and include the `k8s_sat` type prefix.
👍
token, err := loadTokenFromFile(config.tokenPath)
if err != nil {
	return satError.New("unable to get token value: %v", err)
nit: perhaps `"unable to read token from %v: %v", config.tokenPath, err` or something similar will provide more information?
👍
func verifyTokenSignature(clusters []*clusterConfig, token *jwt.JSONWebToken) (cluster *clusterConfig, claims *k8s.SATClaims, err error) {
	var lastErr error
	for _, cluster := range clusters {
if we make users configure the cluster name on the agent attestor, maybe we could avoid this bit
👍 good call
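A sketch of what that could look like if the attestation data carries the cluster name; the map keyed by cluster name and the serviceAccountKey field are assumptions about how the config might be reshaped:

```
// verifyTokenSignature looks up exactly one cluster config by name instead of
// trying each configured cluster in turn.
func verifyTokenSignature(clusters map[string]*clusterConfig, clusterName string, token *jwt.JSONWebToken) (*k8s.SATClaims, error) {
	cluster, ok := clusters[clusterName]
	if !ok {
		return nil, fmt.Errorf("k8s_sat: not configured for cluster %q", clusterName)
	}
	claims := new(k8s.SATClaims)
	if err := token.Claims(cluster.serviceAccountKey, claims); err != nil {
		return nil, fmt.Errorf("k8s_sat: unable to verify token: %v", err)
	}
	return claims, nil
}
```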
p.config = config
}

func verifyTokenSignature(clusters []*clusterConfig, token *jwt.JSONWebToken) (cluster *clusterConfig, claims *k8s.SATClaims, err error) {
Do we need to check that `alg` is correctly set somewhere in here?
verifiers in square's go-jose library make sure that the algorithm in the token lines up with the verification key that is passed. we're good here.
Selectors: []*common.Selector{
	makeSelector("cluster:name", cluster.name),
	makeSelector("service-account:namespace", claims.Namespace),
	makeSelector("service-account:name", claims.ServiceAccountName),
any chance we could simplify this by dropping the suffixes?
k8s_sat:cluster:foo
k8s_sat:service-account:bar
k8s_sat:namespace:baz
👍
BaseSPIFFEID: k8s.AgentID(config.trustDomain, attestationData.UUID),
Selectors: []*common.Selector{
	makeSelector("cluster:name", cluster.name),
	makeSelector("service-account:namespace", claims.Namespace),
the plugin name (and hence the beginning of all selectors) is `k8s_sat`, which makes me lean towards snake case for service account too: `k8s_sat:service_account`
👍
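Combining both naming suggestions, the emitted selectors would end up roughly like this (fragment, assuming makeSelector keeps its current signature and the k8s_sat type prefix is added elsewhere):

```
Selectors: []*common.Selector{
	makeSelector("cluster", cluster.name),
	makeSelector("namespace", claims.Namespace),
	makeSelector("service_account", claims.ServiceAccountName),
},
```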
Signed-off-by: Enrico Schiattarella <[email protected]>
Signed-off-by: Andrew Harding <[email protected]>
Signed-off-by: Andrew Harding <[email protected]>
Force-pushed from dcc81fd to f72dd3a
Signed-off-by: Andrew Harding <[email protected]>
🚴 🌾
Pull Request check list
Affected functionality
Description of change
Provides a new node attestation plugin for server and agent aimed at Kubernetes environments.
The agent plugin uses the Kubernetes Certificates API to submit a CSR and have it signed by the Kubernetes API Server, which signs it with its CA and returns the certificate to the agent.
The agent presents the certificate to the server as an identity document.
The server verifies that the agent actually possesses the corresponding private key and returns an SVID if validation succeeds.
NOTE: The change in glide.yaml (moving the package github.com/stretchr/testify from the testImport to the import section) was done to work around a glide issue. Without that change, glide would report an error: "Failed to generate lock file: Generating lock produced conflicting versions of github.com/stretchr/testify. import (~1.1.3), testImport (~1.1.0)"
There are a couple of open glide issues related to this:
Masterminds/glide#597
Masterminds/glide#564
Another way to work around it would be to not use glide for k8s.io/client-go dependency but do go get ... from Makefile/build script