Write audit policy file for GCE/GKE configuration #46897

timstclair · 2017-06-03T01:21:36Z

Setup the audit policy configuration for GCE & GKE. Here is the high level summary of the policy:

Default logging everything at Metadata
Known write APIs default to RequestResponse
Known read-only APIs default to Request
Except secrets & configmaps are logged at Metadata
Don't log events
Don't log /version, swagger or healthchecks

In addition to the above, I spent time analyzing the noisiest lines in the audit log from a cluster that soaked for 24 hours (and ran a batch of e2e tests). Of those top requests, those that were identified as low-risk (all read-only, except update kube-system endpoints by controllers) are dropped.

I suspect we'll want to tweak this a bit more once we've had a time to soak it on some real clusters.

For kubernetes/enhancements#22

/cc @sttts @ericchiang

ericchiang

Thanks for doing this. It'll be great for other deployment tools to reference when they go to turn this on.

ericchiang · 2017-06-05T14:04:31Z

cluster/gce/gci/configure-helper.sh

+      - group: "" # core
+        resources: ["endpoints", "services"]
+  - level: None
+    users: ["system:unsecured"]


system:unsecured is high-volume? Isn't that the insecure port?

Perhaps 'high-volume or low-risk'?

Added a comment and filed #46983. If you'd prefer I remove this until the issue is fixed I can do that.

BTW - This specific request is very high volume. I left an empty cluster sitting over the weekend, and this was the 5th highest request category:

$ sqlite3 -header ~/logs/audit-policy/super-audit-2.db "SELECT SUM(count) AS count, user, verb, apigroup, namespace, resource, name, subresource FROM audit WHERE response != '403' GROUP BY user, verb, apigroup, resource, name, subresource ORDER BY count DESC LIMIT 5;" | column -tn -s "|" count user verb apigroup namespace resource name subresource 89069 system:kube-controller-manager update core kube-system endpoints kube-controller-manager 89067 system:kube-controller-manager get core kube-system endpoints kube-controller-manager 89063 system:kube-scheduler update core kube-system endpoints kube-scheduler 89061 system:kube-scheduler get core kube-system endpoints kube-scheduler 71446 system:unsecured get core kube-system configmaps ingress-uid

(Using https://github.com/timstclair/kube-contrib/blob/devel/devel/scripts/auditdb.go for the database)

Audit logging already paying off!

Let's just wait for #46983

EDIT. I mean let's just merge as is and have the fix for #46983 change it.

ericchiang · 2017-06-05T14:05:39Z

cluster/gce/gci/configure-helper.sh

+      - group: "" # core
+        resources: ["configmaps"]
+  - level: None
+    users: ["kubelet"]


With TLS bootstrapping that will start giving kubelet credentials unique usernames, maybe target the kubelet group instead of the username?

Done. Left this for legacy upgrades though.

ericchiang · 2017-06-05T14:06:30Z

cluster/gce/gci/configure-helper.sh

+      - group: "" # core
+        resources: ["events"]
+
+  # Secrets & ConfigMaps can contain sensitive & binary data,


It would be nice to eventually log configmap data for auditing config pushes.

I think there are a couple problems with logging configmap data:

It could be arbitrarily large, which might not play nice with the auditing backend

It could be binary data. Not explicitly a problem, but makes the audit logs more difficult to process

Could contain arbitrary user data, such as PII

ericchiang · 2017-06-05T14:10:34Z

cluster/gce/gci/configure-helper.sh

+    resources: ${known_apis}
+  # Default level for known APIs
+  - level: RequestResponse
+    resources: ${known_apis}


Sorry if I have to be reminded but this works with the rule above it because events go through this list in order until they match something right?

ikehz · 2017-06-05T18:12:25Z

lgtm

timstclair

Thanks @ericchiang

timstclair · 2017-06-05T18:44:50Z

cluster/gce/gci/configure-helper.sh

+      - group: "" # core
+        resources: ["endpoints", "services"]
+  - level: None
+    users: ["system:unsecured"]


Added a comment and filed #46983. If you'd prefer I remove this until the issue is fixed I can do that.

timstclair · 2017-06-05T18:45:02Z

cluster/gce/gci/configure-helper.sh

+      - group: "" # core
+        resources: ["configmaps"]
+  - level: None
+    users: ["kubelet"]


Done. Left this for legacy upgrades though.

timstclair · 2017-06-05T18:45:22Z

cluster/gce/gci/configure-helper.sh

+    resources: ${known_apis}
+  # Default level for known APIs
+  - level: RequestResponse
+    resources: ${known_apis}


ericchiang · 2017-06-05T19:22:50Z

/lgtm

timstclair · 2017-06-05T19:25:44Z

FYI, here is some data on the dropped requests. This comes from an unloaded 3-node cluster running for 3 days, plus a single default-suite e2e run. The logs were analyzed using https://github.com/timstclair/kube-contrib/blob/devel/devel/scripts/auditdb.go (only looks at RequestComplete stage)

Total requests: 1,294,914
Requests caught by rules:

kube-proxy rule: 2,388
ingress (unsecured) rule: 74,430
kubelet rule: 71,409
controller-manager/scheduler/endpoint-controller: 412,174
apiserver rule: 24,169
Total dropped requests: 584,570 (45% of total)

And for fun:

sqlite> SELECT SUM(count) AS count FROM audit WHERE user = 'system:anonymous' AND response IN ('401', '403');
count
6612

I.e. 6612 probing requests from the anonymous internet

mikedanese · 2017-06-05T19:26:40Z

/approve

k8s-github-robot · 2017-06-05T19:27:09Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ericchiang, mikedanese, timstclair

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

~~cluster/gce/OWNERS~~ [mikedanese]

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

timstclair · 2017-06-05T19:29:02Z

Squashed. Reapplying LGTM.

nicksardo · 2017-06-05T19:29:15Z

cluster/gce/gci/configure-helper.sh

+    # TODO(#46983): Change this to the ingress controller service account.
+    users: ["system:unsecured"]
+    namespaces: ["kube-sytem"]
+    verbs: ["get"]


The ingress controller also creates a config map.

The create is low-volume (I only saw a single request) and also has higher security implications, so I'd rather not drop it here.

timstclair · 2017-06-05T23:01:47Z

@k8s-bot pull-kubernetes-e2e-kops-aws test this

k8s-github-robot · 2017-06-06T11:08:00Z

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

k8s-ci-robot · 2017-06-06T12:00:12Z

@timstclair: The following tests failed, say /retest to rerun them all:

Test name	Commit	Details	Rerun command
pull-kubernetes-unit	`00d52ed`	link	`@k8s-bot pull-kubernetes-unit test this`
pull-kubernetes-e2e-gce-etcd3	`00d52ed`	link	`@k8s-bot pull-kubernetes-e2e-gce-etcd3 test this`

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

k8s-github-robot · 2017-06-06T12:22:40Z

Automatic merge from submit-queue (batch tested with PRs 46897, 46899, 46864, 46854, 46875)

timstclair added area/platform/gce area/provider/gcp Issues or PRs related to gcp provider release-note-none Denotes a PR that doesn't merit a release note. labels Jun 3, 2017

timstclair added this to the v1.7 milestone Jun 3, 2017

timstclair assigned ikehz and mikedanese Jun 3, 2017

k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jun 3, 2017

k8s-ci-robot requested review from sttts and ericchiang June 3, 2017 01:21

k8s-github-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Jun 3, 2017

ericchiang reviewed Jun 5, 2017

View reviewed changes

timstclair commented Jun 5, 2017

View reviewed changes

k8s-ci-robot assigned ericchiang Jun 5, 2017

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 5, 2017

k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 5, 2017

Write audit policy file for GCE/GKE configuration

00d52ed

timstclair force-pushed the audit-policy branch from 0141fb0 to 00d52ed Compare June 5, 2017 19:28

k8s-github-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 5, 2017

timstclair added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 5, 2017

nicksardo reviewed Jun 5, 2017

View reviewed changes

k8s-github-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. and removed approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jun 5, 2017

k8s-github-robot merged commit ea4764b into kubernetes:master Jun 6, 2017

ericchiang mentioned this pull request Jun 23, 2017

add validate for advanced audit policy #47784

Merged

ericchiang mentioned this pull request Jul 26, 2017

Advanced Auditing Beta Changes #48561

Closed

7 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Write audit policy file for GCE/GKE configuration #46897

Write audit policy file for GCE/GKE configuration #46897

timstclair commented Jun 3, 2017 •

edited

Loading

ericchiang left a comment

ericchiang Jun 5, 2017

ikehz Jun 5, 2017

timstclair Jun 5, 2017

timstclair Jun 5, 2017

ericchiang Jun 5, 2017 •

edited

Loading

ericchiang Jun 5, 2017

timstclair Jun 5, 2017

ericchiang Jun 5, 2017

mikedanese Jun 5, 2017

timstclair Jun 5, 2017

ericchiang Jun 5, 2017

timstclair Jun 5, 2017

ikehz commented Jun 5, 2017

timstclair left a comment

timstclair Jun 5, 2017

timstclair Jun 5, 2017

timstclair Jun 5, 2017

ericchiang commented Jun 5, 2017

timstclair commented Jun 5, 2017 •

edited

Loading

mikedanese commented Jun 5, 2017

k8s-github-robot commented Jun 5, 2017

timstclair commented Jun 5, 2017

nicksardo Jun 5, 2017

timstclair Jun 5, 2017

timstclair commented Jun 5, 2017

k8s-github-robot commented Jun 6, 2017

k8s-ci-robot commented Jun 6, 2017

k8s-github-robot commented Jun 6, 2017

Write audit policy file for GCE/GKE configuration #46897

Write audit policy file for GCE/GKE configuration #46897

Conversation

timstclair commented Jun 3, 2017 • edited Loading

ericchiang left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ericchiang Jun 5, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ikehz commented Jun 5, 2017

timstclair left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ericchiang commented Jun 5, 2017

timstclair commented Jun 5, 2017 • edited Loading

mikedanese commented Jun 5, 2017

k8s-github-robot commented Jun 5, 2017

timstclair commented Jun 5, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

timstclair commented Jun 5, 2017

k8s-github-robot commented Jun 6, 2017

k8s-ci-robot commented Jun 6, 2017

k8s-github-robot commented Jun 6, 2017

timstclair commented Jun 3, 2017 •

edited

Loading

ericchiang Jun 5, 2017 •

edited

Loading

timstclair commented Jun 5, 2017 •

edited

Loading