Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

eks-prow-bulid-cluster: extract cluster admin IAM role to a module, fine-grained IAM permissions, docs #5113

Merged
merged 8 commits into from
Apr 24, 2023

Conversation

xmudrii
Copy link
Member

@xmudrii xmudrii commented Apr 10, 2023

This PR brings the following changes to eks-prow-build-cluster:

  • extract cluster admin IAM role to a submodule
    • This allows us to create IAM role and attach needed policies to it before proceeding with the cluster creation
    • Because of that, we can assume the created role and use it to create the cluster
      • Currently, we are creating the cluster using an IAM user, then assume the role only after the cluster is already created. This has some security implications, as the user that created the cluster is cluster administrator and owns all the resources (e.g. KMS keys)
  • replace AdministratorAccess with fine-grained IAM permissions
    • Using AdministratorAccess is risky and not recommended
    • @rothgar recommended using iamlive to record required permissions which did the trick 🎉
    • I used the CSM mode to record the permissions
    • We have two different policies: one for maintaining the cluster and one for destroying the cluster
      • The idea behind this is to increase security and give folks ability to maintain the cluster, but not to destroy it
  • README is updated with additional explanations and changes introduced in this PR

TODO:

  • Try to get even more fine-grained IAM permissions
    • I used the CSM mode, but the proxy mode might give better results, so I'll give it a try
    • This is for a follow up, let's get started with this and then we can iterate later

All changes are applied to the canary cluster. I'll apply changes to the production cluster once the PR is merged.

/assign @pkprzekwas @ameukam @dims @rothgar

Signed-off-by: Marko Mudrinić <[email protected]>
Signed-off-by: Marko Mudrinić <[email protected]>
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. area/infra Infrastructure management, infrastructure design, code in infra/ area/infra/aws Issues or PRs related to Kubernetes AWS infrastructure sig/k8s-infra Categorizes an issue or PR as relevant to SIG K8s Infra. labels Apr 10, 2023
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Apr 10, 2023
@xmudrii
Copy link
Member Author

xmudrii commented Apr 10, 2023

/hold
for review

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 10, 2023
@xmudrii
Copy link
Member Author

xmudrii commented Apr 10, 2023

/assign @sftim

@dims
Copy link
Member

dims commented Apr 11, 2023

looks good @xmudrii thanks for confirming that this is already in the canary cluster. please merge when you are ready.

/approve
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 11, 2023
@rothgar
Copy link
Member

rothgar commented Apr 11, 2023

I didn't run the make targets but the IAM looks like it does a good job separating the ability to create/manage a cluster vs delete 👍

Copy link
Contributor

@sftim sftim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some feedback; hope it helps!

Destroy:

```bash
make destroy
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this equivalent to TF_ARGS=-destroy make apply?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it should be. For the reference:

  -destroy               Destroy Terraform-managed infrastructure.
                         The command "terraform destroy" is a convenience alias
                         for this option.

Comment on lines 161 to 165
resource "aws_iam_policy" "prow_cluster_maintainer" {
name = "ProwClusterMaintainer"
path = "/"
policy = data.aws_iam_policy_document.prow_cluster_maintainer.json
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this policy allows escalation to AdministratorAccess. We could use that (AdministratorAccess) policy instead for now; it's simpler, and more obvious that the policy carries a risk.

Longer term, we might want to further limit the privileges that we provide here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you help me understand how the policy allows it to be escalated to administrator access?

Copy link
Member Author

@xmudrii xmudrii Apr 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed using permissions boundary at KubeCon. @pkprzekwas will follow up on and create tracking issues.

"iam:DeletePolicy",
"iam:DeleteRole",
"iam:DeleteRolePolicy",
"iam:DetachRolePolicy",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bear in mind that this can remove deny rules. A comment to that effect might be helpful.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment will be added.

Signed-off-by: Marko Mudrinić <[email protected]>
Signed-off-by: Marko Mudrinić <[email protected]>
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 24, 2023
Signed-off-by: Marko Mudrinić <[email protected]>
@pkprzekwas
Copy link
Contributor

/lgtm
I think it's fine to merge it now. IAM concerns will be addressed in a separate issue.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 24, 2023
@dims
Copy link
Member

dims commented Apr 24, 2023

thanks @pkprzekwas

/approve
/lgtm

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dims, xmudrii

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@xmudrii
Copy link
Member Author

xmudrii commented Apr 24, 2023

I'm applying this PR to the canary cluster, but suddenly Terraform wants to update this resource:

  # module.eks.aws_iam_openid_connect_provider.oidc_provider[0] will be updated in-place
  ~ resource "aws_iam_openid_connect_provider" "oidc_provider" {
        id              = "arn:aws:iam::468814281478:oidc-provider/oidc.eks.us-east-2.amazonaws.com/id/BE445C1D51F231A55B2A0E3D0140D458"
        tags            = {
            "Name" = "prow-build-canary-cluster-eks-irsa"
        }
      ~ thumbprint_list = [
            # (2 unchanged elements hidden)
            "414a2060b738c635cc7fc243e052615592830c53",
          - "53011a7515ca46ed6233168766a0f1729608be0e",
          + "50879ea7f7c29dd615269e559fb061b46bdd3dbe",
        ]
        # (4 unchanged attributes hidden)
    }

As per this Slack thread, we'll proceed with this update.

Signed-off-by: Marko Mudrinić <[email protected]>
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 24, 2023
@xmudrii
Copy link
Member Author

xmudrii commented Apr 24, 2023

I had to add some additional permissions in 20d1da9 to be able to reconcile #5113 (comment)

@pkprzekwas
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 24, 2023
@pkprzekwas
Copy link
Contributor

Further IAM improvements will be introduced with #5160

@xmudrii
Copy link
Member Author

xmudrii commented Apr 24, 2023

Applied to both prod and canary
/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 24, 2023
@k8s-ci-robot k8s-ci-robot merged commit ff0e43e into kubernetes:main Apr 24, 2023
@k8s-ci-robot k8s-ci-robot added this to the v1.28 milestone Apr 24, 2023
@xmudrii xmudrii deleted the eks-prow-cluster-iam branch April 24, 2023 17:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/infra/aws Issues or PRs related to Kubernetes AWS infrastructure area/infra Infrastructure management, infrastructure design, code in infra/ cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/k8s-infra Categorizes an issue or PR as relevant to SIG K8s Infra. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants