Reset, don't remove sub-resource properties to avoid wrong diffs#3054
Conversation
Does the PR have any schema changes?Looking good! No breaking changes found. |
6c20810 to
ec2b31e
Compare
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## master #3054 +/- ##
==========================================
- Coverage 60.70% 60.68% -0.02%
==========================================
Files 66 66
Lines 11028 11007 -21
==========================================
- Hits 6694 6680 -14
+ Misses 3787 3782 -5
+ Partials 547 545 -2 ☔ View full report in Codecov by Sentry. |
|
My PR here has the tests that caught the original bug that I filed: #3056 |
danielrbradley
left a comment
There was a problem hiding this comment.
I think this is probably good, though it's still tricky to reason about how wide the impact will be.
I feel like we need a succinct description of this feature written down somewhere - like a short spec which describes the behaviour. Perhaps this could go into the contributing guide where we discuss things like the versioning mechanisms or perhaps a block alongside the tests could be easier to find?
My gut feel is that the tests are still a little patchy and are missing a few edge cases - though I haven't gathered proper evidence for this.
| oldInputs := resource.PropertyMap{} | ||
| sdkResponse := map[string]any{ | ||
| "properties": map[string]any{ | ||
| "accessPolicies": []any{ | ||
| "a policy", | ||
| }, | ||
| }, | ||
| } | ||
| actual := provider.resetUnsetSubResourceProperties(ctx, "urn", sdkResponse, oldInputs, res) | ||
| expected := map[string]any{ | ||
| "properties": map[string]any{ | ||
| "accessPolicies": []any{}, | ||
| }, | ||
| } |
There was a problem hiding this comment.
Looks like this is the actual core of the change right? We're now expecting the access policies to still be there but be set to an empty value?
There was a problem hiding this comment.
Correct. Because on Create, we set such properties to their default value, which is [], so we want it here, too, to avoid diffs.
ec2b31e to
d39092b
Compare
## Overview This PR fixes a regression causing the provider to inadvertently drop the sub-resources following a `pulumi refresh`. This leads to flapping attributes as described in #4145, upon a subsequent `pulumi up` or simply another `pulumi refresh`. The problem manifests only for externally-managed sub-resources (as is generally recommended). For example, a `NetworkSecurityGroup` resource with separate (not inline) `SecurityRule` resources. In more detail, `refresh` causes two erroneous changes to the checkpointed state of a given sub-resource property (e.g. `privateEndpointConnections` or `securityRules`): 1. sets the _output property_ to `[]`, even if there's a meaningful value (the discovered sub-resources). 2. sets the _input property_ to `[]`, even though the program has no inline resources. From this point, a subsequent `pulumi up` or `pulumi refresh` will show a spurious change. Closes #4145 ## Background This PR essentially reverts some undesirable changes in 642a38b. In #3054, the Read method was altered to remove sub-resources from the checkpointed state ([here](https://github.com/pulumi/pulumi-azure-native/pull/3054/files#diff-418e6e7d93aeacebe7c22d5fb674b48bfa15c6a514eb3ae4744b86d447595d8eR1143)). The property map named `outputsWithoutIgnores` seems to have been mistaken to contain outputs, but is best understood as an intermediate form of the inferred inputs, which is why the sub-resources are stripped.  PR #3054 also switches from _removing_ to _resetting_ the properties, with the unexpected effect of adding the property into the _inputs_. This causes flapping in a subsequent `refresh` because it confuses `findUnsetPropertiesToMaintain` to think that the sub-resources are inline.  I believe that #3054 was effective at solving #3049 because it incorporated Mikhail's insight about the SDK shape, [here](https://github.com/pulumi/pulumi-azure-native/pull/3054/files#diff-418e6e7d93aeacebe7c22d5fb674b48bfa15c6a514eb3ae4744b86d447595d8eR1507). The changes that we now seek to revert were not germane to #3049. ## Example 1 Deploy the following program twice, setting the `revision` tag to `2` after the first deployment. Twice is needed because this program creates an security rule within the NSG, but the NSG's state will not reflect the security rule until after an update to the NSG. Refresh would not be effective because that's the very problem this PR seeks to fix. ```csharp using Pulumi; using Pulumi.AzureNative.Resources; using Pulumi.AzureNative.Network; // using Pulumi.AzureNative.Network.Inputs; using System.Collections.Generic; return await Pulumi.Deployment.RunAsync(() => { // Create an Azure Resource Group var resourceGroup = new ResourceGroup("resourceGroup"); var _name = "eron"; var nsg = new NetworkSecurityGroup($"data-center-{_name}-nsg", new NetworkSecurityGroupArgs { ResourceGroupName = resourceGroup.Name, Tags = { { "revision", "1" }, }, }); var rule = new SecurityRule($"NSG-AllowVirtualNetworkRdpInbound-{_name}", new SecurityRuleArgs { SecurityRuleName = "AllowVirtualNetworkRdpInbound", Priority = 120, Direction = SecurityRuleDirection.Inbound, Access = SecurityRuleAccess.Allow, Protocol = SecurityRuleProtocol.Tcp, SourcePortRange = "*", DestinationPortRange = "3389", SourceAddressPrefix = "VirtualNetwork", DestinationAddressPrefix = "VirtualNetwork", ResourceGroupName = resourceGroup.Name, NetworkSecurityGroupName = nsg.Name, Description = "Allows internal RDP Traffic" }); // Export the primary key of the Storage Account return new Dictionary<string, object?> { // ["nsg"] = nsg.ID() }; }); ``` Export the state, run `pulumi refresh`, and export again. You should observe that the NSG's inputs and outputs have been destabilized. Try refreshes and ups to see flapping. See attachments, for a series of state exports: - [outputs-repro.zip](https://github.com/user-attachments/files/20600583/outputs-repro.zip) - [outputs-fix.zip](https://github.com/user-attachments/files/20600584/outputs-fix.zip) ## Example 2 This example is closer to what's described in #4145. The program creates a private endpoint, so that the EventHub namespace has a private endpoint connection that we may observe to be 'flappy'. ```ts import * as pulumi from "@pulumi/pulumi"; import * as resources from "@pulumi/azure-native/resources"; import * as eventhub from "@pulumi/azure-native/eventhub"; import * as network from "@pulumi/azure-native/network"; // Configuration for the subnet ID const config = new pulumi.Config(); const subnetId = config.require("subnetId"); // Create an Azure Resource Group const resourceGroup = new resources.ResourceGroup("eventhub-rg"); const namespace = new eventhub.v20230101preview.Namespace("namespace", { isAutoInflateEnabled: true, maximumThroughputUnits: 40, minimumTlsVersion: "1.2", publicNetworkAccess: "Enabled", resourceGroupName: resourceGroup.name, sku: { capacity: 1, name: "Standard", tier: "Standard" }, tags: { "Name": "mynamespace", "revision": "3" } }); const privateEndpoint = new network.PrivateEndpoint("privateEndpoint", { resourceGroupName: resourceGroup.name, location: resourceGroup.location, privateEndpointName: "myPrivateEndpoint", subnet: { id: subnetId, }, privateLinkServiceConnections: [{ name: "myPrivateLinkServiceConnection", privateLinkServiceId: namespace.id, groupIds: ["namespace"], }], }); const privateEndpointConnections = namespace.privateEndpointConnections; ``` [repro.zip](https://github.com/user-attachments/files/20619741/repro.zip)
This PR fixes a defect introduced by #2950. When handling sub-resource properties that may be maintained as stand-alone resources, we need to overwrite them in several code paths. Before, we set them to their default value
[]on Create, but removed them entirely on Read. This causes unnecessary diffs on Refresh. This PR standardizes the handling on resetting to the default value[].Resolves #3049