This repository was archived by the owner on Apr 13, 2020. It is now read-only.

Commit ca6f7e9

Bedrock Manage Identity Environment and Testing docs (#481)
* manage identity test docs * fixing tf module * Update managed-identity.md * sarath changes * erik changes Co-authored-by: Bhargav Nookala <[email protected]>
1 parent b46cb5c commit ca6f7e9

2 files changed

+259
-0
lines changed

@@ -0,0 +1,259 @@
1+
# MSI Support Testing for Bedrock AKS-gitops
2+
3+
| Revision | Date         | Author         | Remarks       |
| -------: | ------------ | -------------- | ------------- |
|      0.1 | Mar-30, 2020 | Nathaniel Rose | Initial Draft |
6+
7+
## 1. Overview
8+
9+
Managed Identities for Azure resources provides Azure services with an
10+
automatically managed identity in Azure AD. You can use the identity to
11+
authenticate to any service that supports Azure AD authentication, including Key
12+
Vault, without any credentials in your code. Terraform can be configured to use
13+
managed identity for authentication in one of two ways: using environment
14+
variables, or by defining the fields within the provider block.
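
As a minimal sketch of the environment-variable route, the Go helper below builds the variables that the azurerm provider reads for managed identity authentication (`ARM_USE_MSI`, `ARM_SUBSCRIPTION_ID`, `ARM_TENANT_ID`). The function name `msiEnvVars` and the placeholder IDs are assumptions for illustration and are not part of Bedrock. A terratest-based test could pass such a map through the `EnvVars` field of `terraform.Options`.

```go
package main

import (
	"fmt"
	"sort"
)

// msiEnvVars (hypothetical helper) returns the environment variables that tell
// the azurerm Terraform provider to authenticate with a managed identity.
func msiEnvVars(subscriptionID, tenantID string) map[string]string {
	return map[string]string{
		"ARM_USE_MSI":         "true",
		"ARM_SUBSCRIPTION_ID": subscriptionID,
		"ARM_TENANT_ID":       tenantID,
	}
}

func main() {
	// Placeholder IDs; real values come from the target subscription.
	env := msiEnvVars(
		"00000000-0000-0000-0000-000000000000",
		"11111111-1111-1111-1111-111111111111",
	)

	// Print in a stable order so the output is easy to compare.
	keys := make([]string, 0, len(env))
	for k := range env {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s=%s\n", k, env[k])
	}
}
```

The provider-block route sets the equivalent fields directly in Terraform instead of exporting variables.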
15+
16+
AKS creates two managed identities:
17+
18+
- System-assigned managed identity: The identity that the Kubernetes cloud
19+
provider uses to create Azure resources on behalf of the user.
20+
21+
- User-assigned managed identity: The identity that's used for authorization in
22+
the cluster.
23+
24+
This document outlines a test suite for managed identity support in AKS. It
proposes a new Bedrock environment that uses a modified Project Cobalt test
harness to test pod identity within an AKS cluster through CI/CD and automated
test validation.
28+
29+
### Scenarios Addressed:
30+
31+
1. [As an SRE, I want Enable MSI Support for aks-gitops module](https://github.com/microsoft/bedrock/issues/994)
32+
2. [As an Operator, I want automated testing validation for MSI verified within Bedrock](https://github.com/microsoft/bedrock/issues/1197)
33+
3. [As an operator, I want integration Tests tracking with junit logs from terratest](https://github.com/microsoft/bedrock/issues/867)
34+
4. [As an operator, I want to implement a managed service identity (via AAD Pod Identity) based secret handling strategy](https://github.com/microsoft/bedrock/issues/482)
35+
36+
## 2. Out of Scope
37+
38+
An existing Bedrock pull request,
[#995](https://github.com/microsoft/bedrock/pull/995), already enables MSI
support for the aks-gitops module. This design document covers only a Terraform
template and its complementary tests.
42+
43+
The following are not included in this proposal:
44+
45+
- Mocking for Terraform Unit Tests
46+
- Feature revert and Rollback from failed merges
47+
- Adjusting Cobalt Test Fixture support for the current file organization of
  Bedrock, i.e. test files living in the respective folders of the template
  environments.
49+
50+
## 3. Design Details
51+
52+
This design introduces modular testing for Terraform, known as `Test Fixtures`,
based on best practices initially introduced by
[Project Cobalt](https://github.com/microsoft/cobalt). Test fixtures decouple
Terraform commands into their respective pipeline templates, which are called
and dynamically populated by a targeted template test.
57+
58+
### 3.1 Embed new Infrastructure DevOps Model Flow - Continuous Integration
59+
60+
Bedrock infrastructure integration tests have gaps: they do not account for
Terraform unit testing, state validation against live environments, or staged
release management for Bedrock versioning. The Bedrock test harness has no
module-targeted, fail-fast validation of resource definitions outside of an
environment-level `terraform plan`. In addition, integration tests are
validated through new deployments that take extensive time to provision.
Furthermore, feature releases have no issue-reporting benchmark, automated
deployment validation, or guidance process for merging into master. This design
provides a single template leveraging MSI that exercises a new Infrastructure
Testing Workflow that improves on the current Bedrock test harness.
71+
72+
This design is intended to address expected core testing functionality
73+
including:
74+
75+
- Support deployment of application-hosting infrastructure that will eventually
  house the actual application service components
- Capture basic metrics and telemetry from the deployment process for
  monitoring of ongoing pipeline performance and diagnosis of any deployment
  failures
79+
- Support deployment into multiple staging environments
80+
- Execute automated unit-level and integration-level tests against the
81+
resources, prior to deployment into any long-living environments
82+
- Provide a manual approval process to gate deployment into long-living
83+
environments
84+
- Provide detection, abort, and reporting of deployment status when a failure
85+
occurs.
86+
87+
![](infratestflow.png)
88+
89+
The proposed Infrastructure DevOps Flow for Terraform Testing is separated into
four key steps:
91+
92+
1. Test Suite Initialization - Provisioning global artifacts, secrets and
93+
dependencies needed for targeted whitelisted test matrix.
94+
2. Static Validation - Environment initialization, code validation, inspection,
95+
terraform security compliance, and terraform module unit tests.
96+
3. Dynamic Validation - Targeted environment interoperability, integration
97+
tests, cloud deployment, de-provisioning of resources, error reporting.
98+
4. QA - Peer approval, release management, feature staging, acceptance test
99+
within live cluster.
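
The four steps above, together with the "detection, abort, and reporting" requirement, can be sketched as an ordered list of stages that stops at the first failure. This is an illustrative sketch only: the `stage` type, `runStages`, and `failWith` are hypothetical names, and a real pipeline would express the stages in AzDO YAML templates instead of Go.

```go
package main

import (
	"errors"
	"fmt"
)

// stage is a hypothetical unit of pipeline work with a display name.
type stage struct {
	name string
	run  func() error
}

// failWith returns a stage body that always fails with msg (demo helper).
func failWith(msg string) func() error {
	return func() error { return errors.New(msg) }
}

// runStages executes stages in order and aborts at the first failure.
// It returns the names of the stages that completed and the wrapped error.
func runStages(stages []stage) ([]string, error) {
	var done []string
	for _, s := range stages {
		if err := s.run(); err != nil {
			return done, fmt.Errorf("stage %q failed: %w", s.name, err)
		}
		done = append(done, s.name)
	}
	return done, nil
}

func main() {
	ok := func() error { return nil }
	stages := []stage{
		{"test suite initialization", ok},
		{"static validation", ok},
		{"dynamic validation", failWith("terraform apply failed")},
		{"qa", ok}, // never reached because the previous stage fails
	}

	done, err := runStages(stages)
	fmt.Println("completed:", done)
	fmt.Println("error:", err)
}
```

Returning the completed stage names alongside the error gives the reporting step enough context to say where the run stopped.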
100+
101+
> The diagram above contains green check marks that indicate preexisting Bedrock
102+
> testing components that are already implemented through the current test
103+
> harness.
104+
105+
### 3.2 Creation of Managed Identity enabled AKS Gitops Environments
106+
107+
A new AKS Bedrock template with Managed Identity enabled (`azure-MI`) will be
added to the collection of environment templates. This template will be an
upgraded derivative of the `azure-simple` template with a new dependency on
`azure-common-infra`, and will contain the following:
111+
112+
- Managed Identity System Level for AKS
113+
- Pod Identity Security Policy
114+
- Backend State
115+
116+
**Sample `main.tf`**
117+
118+
```
resource "azurerm_resource_group" "aks_rg" {
  name     = local.aks_rg_name
  location = local.region
}

# module.vnet and module.app_management_service_principal are referenced below
# and are assumed to be declared elsewhere in the template.
module "aks-gitops" {
  # In a module source, the "//<subdirectory>" part precedes "?ref=<branch>".
  source = "github.com/microsoft/bedrock//cluster/azure/aks-gitops?ref=aks_msi_integration"

  acr_enabled              = true
  agent_vm_count           = var.aks_agent_vm_count
  agent_vm_size            = var.aks_agent_vm_size
  cluster_name             = local.aks_cluster_name
  dns_prefix               = local.aks_dns_prefix
  flux_recreate            = var.flux_recreate
  gc_enabled               = true
  msi_enabled              = true
  gitops_ssh_url           = var.gitops_ssh_url
  gitops_ssh_key           = var.gitops_ssh_key_file
  gitops_path              = var.gitops_path
  gitops_poll_interval     = var.gitops_poll_interval
  gitops_label             = var.gitops_label
  gitops_url_branch        = var.gitops_url_branch
  kubernetes_version       = var.kubernetes_version
  resource_group_name      = azurerm_resource_group.aks_rg.name
  service_principal_id     = module.app_management_service_principal.service_principal_application_id
  service_principal_secret = module.app_management_service_principal.service_principal_password
  ssh_public_key           = file(var.ssh_public_key_file)
  vnet_subnet_id           = module.vnet.vnet_subnet_ids[0]
  network_plugin           = var.network_plugin
  network_policy           = var.network_policy
  oms_agent_enabled        = var.oms_agent_enabled
}
```
152+
153+
Questions & Limitations:
154+
155+
- If the `azure-common-infra` template is deployed for Key Vault, will it also
  need to be modified for Managed Identity so that AKS is whitelisted to access
  Key Vault?
158+
159+
### 3.3 Testing for Managed Identity enabled AKS Gitops Environments
160+
161+
The testing for the Managed Identity enabled AKS gitops environment will
162+
incorporate the aforementioned new Infrastructure DevOps Model Flow for
163+
Terraform to assess pod identity access for a Voting App service deployed using
164+
terraform and a flux manifest repository.
165+
166+
#### Unit Tests
167+
168+
Cobalt Test Fixtures include a library that simplifies writing Terraform unit
tests against templates. It abstracts the common steps of this process and
statically validates each module against a sample JSON output. We require unit
tests for the following modules:
172+
173+
- AKS
174+
- Key Vault
175+
- VNet
176+
- Subnet
177+
- Gitops
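
To illustrate the idea of static validation against a sample JSON output, the sketch below compares expected attribute values with the `planned_values` section of a plan rendered by `terraform show -json`. It is not the Cobalt test fixture library. The `checkPlan` function, the helper types, and the sample plan are assumptions made for this example, and only the `address` and `values` fields of each resource are read.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// plan mirrors the subset of `terraform show -json` output used here.
type plan struct {
	PlannedValues struct {
		RootModule module `json:"root_module"`
	} `json:"planned_values"`
}

type module struct {
	Resources    []resource `json:"resources"`
	ChildModules []module   `json:"child_modules"`
}

type resource struct {
	Address string                 `json:"address"`
	Values  map[string]interface{} `json:"values"`
}

// collect flattens the resources of m and its child modules into byAddr.
func collect(m module, byAddr map[string]resource) {
	for _, r := range m.Resources {
		byAddr[r.Address] = r
	}
	for _, c := range m.ChildModules {
		collect(c, byAddr)
	}
}

// checkPlan returns one sorted message per expected attribute that is missing
// from the plan or has a different value. Values are compared by their
// printed form, so the JSON number 3 matches the Go integer 3.
func checkPlan(planJSON []byte, expected map[string]map[string]interface{}) ([]string, error) {
	var p plan
	if err := json.Unmarshal(planJSON, &p); err != nil {
		return nil, err
	}
	byAddr := map[string]resource{}
	collect(p.PlannedValues.RootModule, byAddr)

	var problems []string
	for addr, attrs := range expected {
		r, ok := byAddr[addr]
		if !ok {
			problems = append(problems, addr+": resource not in plan")
			continue
		}
		for key, want := range attrs {
			if got := r.Values[key]; fmt.Sprint(got) != fmt.Sprint(want) {
				problems = append(problems,
					fmt.Sprintf("%s.%s: got %v, want %v", addr, key, got, want))
			}
		}
	}
	sort.Strings(problems)
	return problems, nil
}

func main() {
	// Hand-written sample; a real test would read the rendered plan file.
	sample := []byte(`{"planned_values":{"root_module":{"resources":[
		{"address":"azurerm_resource_group.aks_rg","values":{"location":"westus2"}}]}}}`)

	problems, err := checkPlan(sample, map[string]map[string]interface{}{
		"azurerm_resource_group.aks_rg": {"location": "westus2"},
	})
	fmt.Println(problems, err)
}
```

Because the check only reads a rendered plan, it can run in the static validation step before any resources are provisioned.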
178+
179+
#### Integration Tests
180+
181+
Integration tests will validate resource interoperability upon deployment.
After a successful `terraform apply`, this design will use a Go script and the
terratest Go library to run an integration test for the respective environment
template that verifies:
185+
186+
- Access to cluster through MI
187+
- Flux namespace
188+
- Access to voting app using Pod Identity
189+
- Access to key using flex-volume
190+
([Unable to use Env Vars](https://github.com/Azure/kubernetes-keyvault-flexvol/issues/28))
191+
- 200 response on Voting App
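
The last check in the list, a 200 response from the Voting App, usually needs retries because the service may not be reachable immediately after `terraform apply`. The sketch below shows one way to poll an endpoint using only the Go standard library. `httpProbe` and `waitForStatus` are hypothetical helper names, and the demo in `main` runs against a local `httptest` server, not a real cluster. terratest also ships an `http-helper` module that could fill this role.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
	"time"
)

// httpProbe returns a probe that issues a GET and reports the status code.
func httpProbe(url string) func() (int, error) {
	client := &http.Client{Timeout: 5 * time.Second}
	return func() (int, error) {
		resp, err := client.Get(url)
		if err != nil {
			return 0, err
		}
		defer resp.Body.Close()
		return resp.StatusCode, nil
	}
}

// waitForStatus calls probe up to attempts times, sleeping delay between
// tries, and returns nil as soon as probe reports the wanted status code.
func waitForStatus(probe func() (int, error), want, attempts int, delay time.Duration) error {
	if attempts < 1 {
		return errors.New("attempts must be at least 1")
	}
	var lastErr error
	for i := 1; i <= attempts; i++ {
		code, err := probe()
		if err == nil && code == want {
			return nil
		}
		if err != nil {
			lastErr = err
		} else {
			lastErr = fmt.Errorf("got status %d, want %d", code, want)
		}
		if i < attempts {
			time.Sleep(delay)
		}
	}
	return fmt.Errorf("gave up after %d attempts: %w", attempts, lastErr)
}

func main() {
	// Stand-in for the Voting App: unavailable twice, then healthy.
	var hits int32
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if atomic.AddInt32(&hits, 1) < 3 {
			w.WriteHeader(http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	err := waitForStatus(httpProbe(srv.URL), http.StatusOK, 5, 10*time.Millisecond)
	fmt.Println("voting app reachable:", err == nil)
}
```

Separating the probe from the retry loop keeps the loop testable without a network and lets the same loop wrap other checks, such as the Flux namespace lookup.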
192+
193+
#### Acceptance Test
194+
195+
Acceptance tests are defined in this design as a system affirmation that the
incoming PR builds successfully in a live staging environment once applied.
This requires maintaining a live QA environment whose state file receives the
successful builds from incoming PRs.
199+
200+
Questions & Limitations:
201+
202+
- When an incoming change touches an Azure provider module, how will it be
  applied to an existing Terraform deployment? If the apply fails, should we
  redeploy a new `azure-MI` environment for QA?
205+
206+
#### Reporting
207+
208+
Output a test failure report using the out-of-the-box terratest JUnit log
parser to capture errors thrown during the build.
210+
211+
The whitelisted integration test for `azure-MI` will include:
212+
213+
> `go test -v -run TestIT_Bedrock_AzureMI_Test -timeout 99999s | tee TestIT_Bedrock_AzureMI_Test.log`

> `terratest_log_parser -testlog TestIT_Bedrock_AzureMI_Test.log -outputdir single_test_output`
216+
217+
The pipeline will publish the XML report to AzDO as a uniquely named artifact.
219+
220+
```
# Always publish the parsed logs, even when the tests fail.
- task: PublishPipelineArtifact@1
  inputs:
    path: $(modulePath)/test/single_test_output
    artifact: simple_test_logs
  condition: always()
# Publish the JUnit report for the azure-MI build job.
- task: PublishTestResults@2
  inputs:
    testResultsFormat: 'JUnit'
    testResultsFiles: '**/report.xml'
    searchFolder: $(modulePath)/test
  condition: and(eq(variables['Agent.JobStatus'], 'Succeeded'), endsWith(variables['Agent.JobName'], 'Bedrock_Build_Azure_MI'))
```
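
A later pipeline step, or a local run, may want to summarize the published report. The following sketch lists failed test cases from a JUnit XML document using Go's `encoding/xml`. It assumes the common `<testsuites>/<testsuite>/<testcase>` layout with a `<failure>` child on failing cases. The exact layout of the `report.xml` produced by `terratest_log_parser` should be confirmed before relying on it, and `failedCases` is a hypothetical helper.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Minimal JUnit model: only the fields needed to list failures.
type testSuites struct {
	Suites []testSuite `xml:"testsuite"`
}

type testSuite struct {
	Name  string     `xml:"name,attr"`
	Cases []testCase `xml:"testcase"`
}

type testCase struct {
	Name    string   `xml:"name,attr"`
	Failure *failure `xml:"failure"` // nil when the case passed
}

type failure struct {
	Message string `xml:"message,attr"`
}

// failedCases returns "suite/case" for every test case with a <failure> child.
func failedCases(report []byte) ([]string, error) {
	var suites testSuites
	if err := xml.Unmarshal(report, &suites); err != nil {
		return nil, err
	}
	var failed []string
	for _, s := range suites.Suites {
		for _, c := range s.Cases {
			if c.Failure != nil {
				failed = append(failed, s.Name+"/"+c.Name)
			}
		}
	}
	return failed, nil
}

func main() {
	// Hand-written sample report; suite and case names are illustrative.
	report := []byte(`<testsuites>
  <testsuite name="azure-mi">
    <testcase name="FluxNamespace"></testcase>
    <testcase name="VotingApp200">
      <failure message="got status 503">trace</failure>
    </testcase>
  </testsuite>
</testsuites>`)

	failed, err := failedCases(report)
	fmt.Println(failed, err)
}
```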
233+
234+
## 4. Dependencies
235+
236+
This design for a Managed Identity AKS Testing Harness will leverage the
237+
following:
238+
239+
- [Bedrock Pre-Reqs: az cli | terraform | golang | fabrikate ](https://github.com/microsoft/bedrock/tree/master/tools/prereqs)
240+
- [Terratest](https://github.com/gruntwork-io/terratest)
241+
- [Terraform Compliance](https://github.com/eerkunt/terraform-compliance)
242+
- [Cobalt Terraform Test Fixtures](https://github.com/microsoft/cobalt/tree/master/test-harness)
243+
244+
## 5. Risks & Mitigations
245+
246+
Risks & Limitations:
247+
248+
- If the `azure-common-infra` template is deployed for Key Vault, will it also
  need to be modified for Managed Identity so that AKS is whitelisted to access
  Key Vault?
- When an incoming change touches an Azure provider module, how will it be
  applied to an existing Terraform deployment? If the apply fails, should we
  redeploy a new `azure-MI` environment for QA?
254+
- How long does it take to deploy MI and Keyvault in a pipeline?
255+
256+
## 6. Documentation
257+
258+
Documentation will need to be added for the new Terraform environment and to
the Bedrock testing guidance.
