Documentation for creating a new cluster on a different AWS account #728

Merged
k8s-ci-robot merged 7 commits into kubernetes-sigs:master from harishspqr:master on Apr 25, 2019

Conversation

@harishspqr
Contributor

@harishspqr harishspqr commented Apr 19, 2019

What this PR does / why we need it:
Documentation for creating a new cluster in a different AWS account than the one where the CAPA controller is running, using cross-account role assumption with KIAM. This is added because it is a common use case for a system to create a new cluster on behalf of another person or customer.

Which issue(s) this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format):

NONE

Special notes for your reviewer:
NONE

Release note:

NONE

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Apr 19, 2019
@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Apr 19, 2019
@k8s-ci-robot
Contributor

Hi @harishspqr. Thanks for your PR.

I'm waiting for a kubernetes-sigs or kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Apr 19, 2019
Contributor

@chuckha chuckha left a comment

I didn't get all of them, but adding whitespace around code blocks is good for md parsers

Mostly formatting feedback, I can do a second pass later on

Thanks so much for this work!

Comment threads on docs/roleassumption.md: 10 (6 outdated)
@chuckha
Contributor

chuckha commented Apr 19, 2019

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Apr 19, 2019
Contributor

@detiber detiber left a comment

A few nits, otherwise looks good to me.

/assign @randomvariable
Assigning Naadir, since he has more experience with kiam to review the actual configs.

Comment threads on docs/roleassumption.md: 7 (7 outdated)

@AlainRoy AlainRoy left a comment

Nice work! Just a few picky nits from me.

Comment threads on docs/roleassumption.md: 11 (9 outdated)
@harishspqr
Contributor Author

Great comments from everyone. Thanks, guys!

Contributor

@chuckha chuckha left a comment

My only main concern is with the use of bootstrap vs management cluster

looking really great though, good updates

Comment threads on docs/roleassumption.md: 16 (15 outdated)
@ashish-amarnath
Contributor

/test pull-cluster-api-provider-aws-bazel-integration

@k8s-ci-robot
Contributor

@harishspqr: The following test failed, say /retest to rerun them all:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| pull-cluster-api-provider-aws-bazel-integration | 846cf29 | link | `/test pull-cluster-api-provider-aws-bazel-integration` |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@detiber
Contributor

detiber commented Apr 25, 2019

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 25, 2019
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: detiber, harishspqr

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 25, 2019
@k8s-ci-robot k8s-ci-robot merged commit cd4f48a into kubernetes-sigs:master Apr 25, 2019
detiber pushed a commit to detiber/cluster-api-provider-aws that referenced this pull request May 2, 2019
…ubernetes-sigs#728)

* Initial draft of documentation for Cluster creation using cross account role assumption

* Update roleassumption.md

Complete the document.

* cleanup the documentation for roleassumption

* Resolved the comments: role assumption documentation.

* Fix minor issues - roleassumption.md

* resolve more comments to roleassumption.md

* Resolve more comments - roleassumption.md
k8s-ci-robot pushed a commit that referenced this pull request May 2, 2019
* Update the releasing docs (#689)

* Add error reason to output if fail to checkout an account from boskos (#698)

* Temporary workaround a data issue in boskos service (#699)

* Update checkout_account.py to not reuse connections (#700)

* Fix checkout_account.py (#702)

* Make hack/checkin_account.py executable (#703)

* Fix: all traffic ingress rule triggers fatal nil dereference (#697)

* fix: respect all traffic security group rules (and others)

For anything besides tcp, udp, icmp, and icmpv6 there is no applicable
notion of "port range." AWS omits FromPort and ToPort in its responses,
causing a fatal nil dereference when attempting to read any security
groups with e.g. an "all traffic" rule.

* fix: omit description when empty string

* fix: handle more security groups without crashing

This commit cleans up and clarifies a few of the less obvious components
of the previous work.

* fix: handle more security groups without crashing

Address linter failures.

* fix: handle more security groups without crashing

Usage needs to match declaration. Computers are sticklers about that
sort of thing.

* fix: handle more security groups without crashing

Add clarifying comment to serializer function.
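
The nil dereference described in this fix can be illustrated with a small sketch. It is not the code from the repository: `ipPermission` is a local stand-in whose field names follow the AWS SDK's rule type, and `ingressRule` and `toIngressRule` are hypothetical names used only here.

```go
package main

import "fmt"

// ipPermission is a local stand-in for the SDK's rule type. The port
// fields are pointers because AWS may omit them in its responses.
type ipPermission struct {
	IpProtocol *string
	FromPort   *int64
	ToPort     *int64
}

// ingressRule is a hypothetical flattened form used for illustration.
type ingressRule struct {
	Protocol string
	FromPort int64
	ToPort   int64
}

// toIngressRule copies the ports only when both pointers are set, so a
// rule with no port range (for example "all traffic", protocol "-1") is
// read without dereferencing nil.
func toIngressRule(p ipPermission) ingressRule {
	r := ingressRule{}
	if p.IpProtocol != nil {
		r.Protocol = *p.IpProtocol
	}
	if p.FromPort != nil && p.ToPort != nil {
		r.FromPort = *p.FromPort
		r.ToPort = *p.ToPort
	}
	return r
}

func main() {
	all := "-1"
	// FromPort and ToPort are left nil, as in an "all traffic" rule.
	fmt.Printf("%+v\n", toIngressRule(ipPermission{IpProtocol: &all}))
}
```

Leaving the ports at their zero values is one possible convention; a real implementation might instead keep them as pointers or use a sentinel to distinguish "no port range" from port 0.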

* Fixes a bug and adds tests for kubeadm defaults (#707)

The pointers were not working as expected so the API is changing
to be more functional and leverage kubernetes' DeepCopy function.

* Update listed v1.14 AMIs to v1.14.1 (#708)

* Update listed v1.14 AMIs to v1.14.1

* Update README with list of published AMIs/Kubernetes versions

* GZIP user-data (#710)

Signed-off-by: Vince Prignano <vincepri@vmware.com>

* Make sure Calico can talk IP-in-IP (#701)

* Make sure Calico can talk IP-in-IP

* Add IP in IP protocol to the control plane security group

* Add IPv4 protocol definition and make sure it's handled properly.

* Make port ranges AWS compliant and security groups more restrictive.

* Fix security groups

* Adds tests to kubeadm defaults (#709)

Attempt at documenting the assumptions made in the kubeadm
defaults code.

Signed-off-by: Chuck Ha <chuckh@vmware.com>

* Logging (#713)

* Adds logr as dependency

Signed-off-by: Chuck Ha <chuckh@vmware.com>

* Use logr in the cluster actuator

This only creates the logger. Does not yet swap out actual klog calls.

Signed-off-by: Chuck Ha <chuckh@vmware.com>

* update bazel

Signed-off-by: Chuck Ha <chuckh@vmware.com>

* update

Signed-off-by: Chuck Ha <chuckh@vmware.com>

* Switch dep to use release-0.1 branch instead of version (#715)

* Adds logr as dependency (#714)

Adds context for logs and removes excessive logging

Signed-off-by: Chuck Ha <chuckh@vmware.com>

* Ensure `make manifests` generates machines file for HA control plane too. (#720)

* Add HA machines template

* Introduce HA machines file in `make manifests` target

* Add clusterawsadm as make dependency to manifests make target. (#721)

Ensures manifests are generated from the current state of the source.
Assuming $GOPATH/bin is in the $PATH

* Update to Go 1.12 (#719)

Signed-off-by: Vince Prignano <vincepri@vmware.com>

* Add ability to override Organization ID for image lookups (#723)

* Add ability to override Organization ID for image lookups

* Update pkg/cloud/aws/services/ec2/ami.go

Co-Authored-By: detiber <detiberusj@vmware.com>

* Add updated generated crd

* feat: support customizing root device size (#718)

* feat: support customizing root device size

* chore: re-generate CRDs

* fix: update formatting

* chore: add comment describing Service.sdkToInstance

* chore: make service.SDKToInstance public

* Rename BUILD -> BUILD.bazel for consistency (#724)

find . -type f -name BUILD -not -path "./vendor/*" | xargs -n1 -I{} -- git mv {} {}.bazel

Preferred build name changed in 3788fb1
Fixes #722

* Adds retry-on-conflict during updates (#725)

* Adds retry-on-conflict during updates

Signed-off-by: Chuck Ha <chuckh@vmware.com>

* adds note about status update caveat

Signed-off-by: Chuck Ha <chuckh@vmware.com>

* clarify errors/comments

Signed-off-by: Chuck Ha <chuckh@vmware.com>

* Add the HA machines configuration to bazel (#733)

Signed-off-by: Chuck Ha <chuckh@vmware.com>

* Ensure bazel is the correct version (#731)

Signed-off-by: Chuck Ha <chuckh@vmware.com>

* Update OWNERS_ALIASES and SECURITY_CONTACTS (#712)

* Fix the prow jobs (#735)

Signed-off-by: Chuck Ha <chuckh@vmware.com>

* Fix markdown formatting (#736)

* extract fmt from release tool (#738)

Signed-off-by: Chuck Ha <chuckh@vmware.com>

* Use DEFAULT_REGION as the default and REGION as the supplied (#739)

Signed-off-by: Chuck Ha <chuckh@vmware.com>

* e2e testing improvement (#743)

* Bump kind version
* Remove docker load in favor of kind load for e2e cluster

Signed-off-by: Chuck Ha <chuckh@vmware.com>

* fix: Don't try to update root size when it's unset (#726)

* fix: Don't try to update root size when it's unset

This commit looks for empty RootDeviceSize in the spec and ignores it.
Otherwise, none of our control plane machines were updating with this
error:

```
E0418 23:07:48.250925       1 controller.go:214] Error updating machine "ns/controlplane-2": found attempt to change immutable state for machine "controlplane-2": ["Root volume size cannot be mutated from 8 to 0"]
```

* fix: updates without specifying a root volume size

Add unit test.

* fix: updates without specifying a root volume size

Fix gofmt.
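
The behavior this fix describes, treating an unset root device size as "no change requested", can be sketched as follows. This is an illustration under assumptions, not the repository's implementation: `rootSizeChange` is a hypothetical helper, and a zero value is assumed to mean the field was left out of the spec.

```go
package main

import "fmt"

// rootSizeChange returns an error only when the spec explicitly requests
// a size that differs from the existing volume. A requested size of zero
// is treated as unset and ignored.
func rootSizeChange(existing, requested int64) error {
	if requested == 0 {
		return nil
	}
	if requested != existing {
		return fmt.Errorf("root volume size cannot be mutated from %d to %d", existing, requested)
	}
	return nil
}

func main() {
	// Unset size: no error, so the machine update can proceed.
	fmt.Println(rootSizeChange(8, 0))
	// Explicit change: still rejected as an immutable-state change.
	fmt.Println(rootSizeChange(8, 16))
}
```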

* Scope nodeRef to workload cluster (#744)

Signed-off-by: Vince Prignano <vincepri@vmware.com>

* Fix NPE on delete bastion host (#746)

Signed-off-by: Vince Prignano <vincepri@vmware.com>

* Documentation for creating a new cluster on a different AWS account  (#728)

* Initial draft of documentation for Cluster creation using cross account role assumption

* Update roleassumption.md

Complete the document.

* cleanup the documentation for roleassumption

* Resolved the comments: role assumption documentation.

* Fix minor issues - roleassumption.md

* resolve more comments to roleassumption.md

* Resolve more comments - roleassumption.md

* include machines-ha.yaml.template in release artifacts (#741)

* Update AWS sdk, improve log in machine actuator delete (#747)

Signed-off-by: Vince Prignano <vincepri@vmware.com>

* Fixes the infinite reconcile loop (#748)

* Uses patch for updating the cluster and machine specs
  - patch does not cause a re-reconcile in the capi controller
* Uses update for updating the cluster and machine status
  - update for status is ok since it does not update any of the metadata
    no re-reconcile is necessary for the capi controller

Signed-off-by: Chuck Ha <chuckh@vmware.com>

* Update Gopkg.lock and cleanup Makefile (#751)

* Update cluster-api release-0.1 vendor (#750)

Signed-off-by: Vince Prignano <vincepri@vmware.com>

* Reduce the number of re-reconciles (#752)

Signed-off-by: Chuck Ha <chuckh@vmware.com>
richardchen-db pushed a commit to databricks/cluster-api-provider-aws-1 that referenced this pull request Jan 14, 2023
Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

7 participants