Skip to content

Hive 2485/mce 2.4: Backport AssumeRole, credential_process, and kubeconfig exec fixes#2400

Merged
openshift-merge-bot[bot] merged 4 commits intoopenshift:mce-2.4from
2uasimojo:HIVE-2485/mce-2.4
Aug 27, 2024
Merged

Hive 2485/mce 2.4: Backport AssumeRole, credential_process, and kubeconfig exec fixes#2400
openshift-merge-bot[bot] merged 4 commits intoopenshift:mce-2.4from
2uasimojo:HIVE-2485/mce-2.4

Conversation

@2uasimojo
Copy link
Member

@2uasimojo 2uasimojo commented Aug 2, 2024

Almost perfectly clean backport of #2391. Just imports in pkg/install/generate.go needed cleanup.

HIVE-2485

Simplify the AssumeRole flow: Rather than doing it via
`credential_process` as a callback from within the creds file used by
the provision pod, flatten this out so the AssumeRole is done implicitly
by the AWS SDK.

This flow remains unchanged:

The clusterdeployment controller:
- Copies the service provider secret into the CD namespace
- Creates an AWS credentials secret
- Creates the provision pod

The provision pod:
- Loads the credentials secret
- Projects the AWS config therein onto the file system
- Invokes the installer

The installer:
- Creates an AWS client using that config file
- Proceeds with installation

Before this commit:
The AWS config contained a `credential_process` which invoked
`hiveutil install-manager aws-credentials` which...
- Loaded the service provider secret
- Created an AWS client
- Used the client to AssumeRole and generate credentials with a 15m
expiration
- Printed the credentials to stdout in the format expected by AWS.

Per AWS docs[1], the SDK will automatically rerun the
`credential_process` before the expiration time to refresh the creds.

With this commit:
The clusterdeployment controller loads the service provider secret and
folds it into the AWS config as a separate profile, referenced from the
default via `source_profile`:

```
[default]
source_profile = source
role_arn = arn:aws:iam::123456789012:role/assume-role-customer

[profile source]
aws_access_key_id: ABCDEFGHIJKLMNOPQRST
aws_secret_access_key: 1234567890abcdefghijklmnopqrstuvwxyz0123
role_arn = arn:aws:iam::210987654321:role/assume-role-provider
```

Per AWS docs[2], the SDK will use the source creds to AssumeRole to
generate temporary creds, which it will automatically refresh as they
expire -- i.e. natively performing the same function as `hiveutil
install-manager aws-credentials`.

[1] https://docs.aws.amazon.com/cli/v1/userguide/cli-configure-sourcing-external.html
[2] https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-role.html

HIVE-2485
HIVE-2529

(cherry picked from commit 8f11ce3)

Conflicts:
        pkg/install/generate.go (hiveutil binary path changed, no longer
        relevant.)
(cherry picked from commit a64f325)
As a security measure, check AWS config/credential files for
`credential_process`, and explode if found.

We used to use `credential_process` deliberately to AssumeRole for STS
clusters. A prior commit switched this over to use a different
mechanism, but existing clusters in the field may still be configured
with the old mechanism in the relevant Secrets. Convert such Secrets to
use the new mechanism.

HIVE-2485

(cherry picked from commit 13ea4f4)
(cherry picked from commit bc783e2)
A previous commit (openshift#2306 / 13ea4f4) put in checks to forbid the use of
`credential_process` in AWS config/credentials files. It turns out that
AWS accepts this key case-insensitively, so this commit updates our
checks accordingly.

HIVE-2485

(cherry picked from commit 229f705)
(cherry picked from commit 3fc318b)
Users with write access to the admin kubeconfig Secret for a given
ClusterDeployment should not be able to execute arbitrary code in the
privileged environment in which we run the controllers that use those
Secrets. Funnel all code paths that load such Secrets through a
validator to ensure that the AuthInfos[].Exec path is not used.

HIVE-2485

(cherry picked from commit df1ea18)
(cherry picked from commit b9d2ed9)
@openshift-ci openshift-ci bot requested review from jstuever and lleshchi August 2, 2024 21:28
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 2, 2024
@2uasimojo
Copy link
Member Author

/override ci/prow/security

Will be addressed by backport of #2387

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Aug 2, 2024

@2uasimojo: Overrode contexts on behalf of 2uasimojo: ci/prow/security

Details

In response to this:

/override ci/prow/security

Will be addressed by backport of #2387

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@codecov
Copy link

codecov bot commented Aug 3, 2024

Codecov Report

Attention: Patch coverage is 27.42857% with 127 lines in your changes missing coverage. Please review.

Project coverage is 57.58%. Comparing base (ca97367) to head (4d38f54).
Report is 7 commits behind head on mce-2.4.

Files Patch % Lines
pkg/install/generate.go 2.08% 47 Missing ⚠️
contrib/pkg/utils/generic.go 0.00% 18 Missing ⚠️
pkg/controller/utils/secrets.go 61.76% 8 Missing and 5 partials ⚠️
pkg/awsclient/client.go 0.00% 12 Missing ⚠️
pkg/installmanager/installmanager.go 0.00% 9 Missing ⚠️
contrib/pkg/utils/aws/aws.go 0.00% 6 Missing ⚠️
.../controller/clusterdeployment/clusterprovisions.go 33.33% 4 Missing ⚠️
...lusterdeprovision/clusterdeprovision_controller.go 20.00% 4 Missing ⚠️
contrib/pkg/utils/openstack/openstack.go 0.00% 2 Missing ⚠️
contrib/pkg/utils/ovirt/ovirt.go 0.00% 2 Missing ⚠️
... and 9 more
Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           mce-2.4    #2400      +/-   ##
===========================================
- Coverage    57.59%   57.58%   -0.02%     
===========================================
  Files          187      186       -1     
  Lines        25851    25885      +34     
===========================================
+ Hits         14889    14905      +16     
- Misses        9713     9734      +21     
+ Partials      1249     1246       -3     
Files Coverage Δ
...roller/argocdregister/argocdregister_controller.go 50.50% <100.00%> (-0.22%) ⬇️
...roller/awsprivatelink/awsprivatelink_controller.go 68.09% <100.00%> (+0.27%) ⬆️
...controller/clusterclaim/clusterclaim_controller.go 63.59% <100.00%> (ø)
.../clusterdeployment/clusterdeployment_controller.go 64.70% <100.00%> (+0.13%) ⬆️
...kg/controller/clusterdeployment/clusterinstalls.go 75.23% <100.00%> (ø)
pkg/remoteclient/remoteclient.go 75.37% <100.00%> (+1.24%) ⬆️
contrib/pkg/utils/azure/azure.go 0.00% <0.00%> (ø)
contrib/pkg/utils/gcp/gcp.go 0.00% <0.00%> (ø)
pkg/controller/awsprivatelink/cleanup.go 46.10% <50.00%> (ø)
pkg/controller/clusterdeprovision/awsactuator.go 34.48% <0.00%> (ø)
... and 15 more

... and 1 file with indirect coverage changes

@2uasimojo
Copy link
Member Author

/test e2e-azure

1 similar comment
@2uasimojo
Copy link
Member Author

/test e2e-azure

@2uasimojo
Copy link
Member Author

/test e2e-azure

Rate limiting

@2uasimojo
Copy link
Member Author

/test e2e-azure

time="2024-08-03T00:00:28.887Z" level=error msg="could not generateMachineSets" controller=machinepool error="could not generate machinesets: error retrieving VM capabilities: not found in region centralus" machinePool=cluster-test/hive-24c21d3a-01c5-4555-a6de-05c0d3caf006-worker reconcileID=q8mfbmvc

Doesn't seem like a thing?

@2uasimojo
Copy link
Member Author

/assign @jstuever

@jstuever
Copy link
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 27, 2024
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Aug 27, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: 2uasimojo, jstuever

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot
Copy link

/retest-required

Remaining retests: 0 against base HEAD 47bf8a8 and 2 for PR HEAD 4d38f54 in total

2 similar comments
@openshift-ci-robot
Copy link

/retest-required

Remaining retests: 0 against base HEAD 47bf8a8 and 2 for PR HEAD 4d38f54 in total

@openshift-ci-robot
Copy link

/retest-required

Remaining retests: 0 against base HEAD 47bf8a8 and 2 for PR HEAD 4d38f54 in total

@2uasimojo
Copy link
Member Author

/override ci/prow/security

Backport of #2387

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Aug 27, 2024

@2uasimojo: Overrode contexts on behalf of 2uasimojo: ci/prow/security

Details

In response to this:

/override ci/prow/security

Backport of #2387

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-merge-bot openshift-merge-bot bot merged commit 85b9f09 into openshift:mce-2.4 Aug 27, 2024
@2uasimojo 2uasimojo deleted the HIVE-2485/mce-2.4 branch August 27, 2024 21:21
@2uasimojo
Copy link
Member Author

/cherry-pick mce-2.3

@openshift-cherrypick-robot

@2uasimojo: #2400 failed to apply on top of branch "mce-2.3":

Applying: Replumb AssumeRole (AWS)
Using index info to reconstruct a base tree...
M	contrib/pkg/utils/aws/aws.go
M	pkg/controller/clusterdeployment/clusterprovisions.go
M	pkg/controller/clusterdeprovision/clusterdeprovision_controller.go
M	pkg/install/generate.go
M	pkg/installmanager/aws_credentials_test.go
M	pkg/installmanager/installmanager.go
Falling back to patching base and 3-way merge...
Auto-merging pkg/installmanager/installmanager.go
CONFLICT (modify/delete): pkg/installmanager/aws_credentials_test.go deleted in Replumb AssumeRole (AWS) and modified in HEAD. Version HEAD of pkg/installmanager/aws_credentials_test.go left in tree.
Removing pkg/installmanager/aws_credentials.go
Auto-merging pkg/install/generate.go
Auto-merging pkg/controller/clusterdeprovision/clusterdeprovision_controller.go
Auto-merging pkg/controller/clusterdeployment/clusterprovisions.go
Auto-merging contrib/pkg/utils/aws/aws.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 Replumb AssumeRole (AWS)
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

Details

In response to this:

/cherry-pick mce-2.3

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants