Fix IsNotFound to validate Azure error codes before removing resources #4416
The IsNotFound() function was incorrectly treating any HTTP 404 response as a legitimate resource deletion, even when the 404 came from proxies, WAFs, or network intermediaries rather than from Azure itself. This caused resources to be unexpectedly removed from Pulumi state.

Changes:
- Updated IsNotFound() to validate error codes before confirming deletion
- Valid Azure "not found" codes: NotFound, ResourceNotFound, ResourceGroupNotFound
- Added warning logs when 404 lacks valid Azure error codes
- Returns false (preserves resource) when error code validation fails
- Applied to all error types: azure.RequestError, azcore.ResponseError, PulumiAzcoreResponseError

Testing:
- Added comprehensive test coverage for all scenarios
- Tests verify 404 with valid codes returns true
- Tests verify 404 without codes returns false
- All azure package tests passing

Fixes #4415

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Does the PR have any schema changes?
Looking good! No breaking changes found.
Codecov Report
❌ Patch coverage is
Additional details and impacted files
Coverage Diff (master vs. #4416):

| Metric | master | #4416 | +/- |
|---|---|---|---|
| Coverage | 59.37% | 59.43% | +0.05% |
| Files | 91 | 91 | |
| Lines | 11450 | 11478 | +28 |
| Hits | 6799 | 6822 | +23 |
| Misses | 4015 | 4020 | +5 |
| Partials | 636 | 636 | |
Manual Test Results ✅
I've successfully tested the

Test Setup
Built and installed local provider with the fix:

    make provider
    cp bin/pulumi-resource-azure-native ~/go/bin/

Created test stack with:

Test 1: Individual Resource Deletion (ResourceNotFound)
Actions:
Result: ✅ PASSED
The provider correctly detected the deletion with a valid Azure error code and removed the resource from state:
Key observations:

Test 2: Resource Group Deletion (ResourceGroupNotFound)
Actions:
Result: ✅ PASSED
The provider correctly detected the resource group deletion with the
Key observations:

Verification
Final state confirmed both resources removed:

    $ pulumi stack --show-urns
    Current stack resources (2):
    TYPE NAME
    pulumi:pulumi:Stack isnotfound-test-dev
    └─ pulumi:providers:azure-native default_3_10_1

Conclusion
The fix successfully:
This manual test confirms the unit test results and validates that the fix will prevent the issue described in #4415 where resources were incorrectly removed from state due to proxy/WAF 404 responses.
This PR has been shipped in release v3.11.0.
Fixes #4482

In #4416 we implemented the functionality to correctly identify "not found" errors from Azure. We do this by recognizing the error code of the Azure response, which can differ between resources. In the case of `ManagementLockByScope` resources, the error code is `LockNotFound`. This PR adds it to the list of recognized error codes.

It is worth noting that `Read()` is doing the right thing here, returning an empty ID to signal to the engine that a resource is deleted.

```go
if err != nil {
	if azure.IsNotFound(err) {
		// 404 means that the resource was deleted.
		return &rpc.ReadResponse{Id: ""}, nil
	}
	return nil, err
}
```

The `return nil, err` branch is the code path the user is hitting, because `IsNotFound` did not yet recognize `LockNotFound`.
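To make the effect of the missing code concrete, here is a minimal sketch of the `Read()` branch quoted above. It is not the provider's code: `azureError`, `isNotFound`, and `read` are hypothetical stand-ins, and the set of recognized codes is passed in as a plain map so the behaviour before and after adding `LockNotFound` can be compared side by side.

```go
package main

import (
	"errors"
	"fmt"
)

// azureError is a hypothetical stand-in for the provider's Azure error types.
type azureError struct {
	StatusCode int
	Code       string
}

func (e *azureError) Error() string { return fmt.Sprintf("%d %s", e.StatusCode, e.Code) }

// isNotFound mimics a code-validating check against a given set of codes.
func isNotFound(err error, known map[string]bool) bool {
	var azErr *azureError
	return errors.As(err, &azErr) && azErr.StatusCode == 404 && known[azErr.Code]
}

// read mirrors the Read() branch quoted above: an empty ID with a nil error
// means "deleted"; any other error is returned to the caller as a failure.
func read(id string, fetchErr error, known map[string]bool) (string, error) {
	if fetchErr != nil {
		if isNotFound(fetchErr, known) {
			return "", nil
		}
		return "", fetchErr
	}
	return id, nil
}

func main() {
	lockGone := &azureError{StatusCode: 404, Code: "LockNotFound"}

	before := map[string]bool{"NotFound": true, "ResourceNotFound": true, "ResourceGroupNotFound": true}
	after := map[string]bool{"NotFound": true, "ResourceNotFound": true, "ResourceGroupNotFound": true, "LockNotFound": true}

	_, err := read("lock-id", lockGone, before)
	fmt.Println(err != nil) // true: the error is surfaced to the user

	id, err := read("lock-id", lockGone, after)
	fmt.Println(id == "" && err == nil) // true: the lock is reported as deleted
}
```

With the extended set, the same 404 takes the "deleted" branch instead of being returned as an error.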
Summary
Fixes #4415 based on the technical analysis from https://github.com/pulumi/customer-support/issues/2476#issuecomment-3544242719
The `IsNotFound()` function was incorrectly treating any HTTP 404 response as a legitimate Azure resource deletion, even when the 404 came from proxies, WAFs, or network intermediaries rather than from Azure itself. This caused resources to be unexpectedly removed from Pulumi state.
Problem
Resources were being incorrectly removed from state upon a non-authoritative 404 response, such as from a proxy or serving layer.
The previous implementation only checked the HTTP status code without validating that the response contained authentic Azure error codes.
Solution
Updated `IsNotFound()` to validate error codes before confirming deletion:
- Error Code Validation: Now checks for valid Azure "not found" error codes:
  - `NotFound`
  - `ResourceNotFound`
  - `ResourceGroupNotFound`
- Logging: Added warning logs (level 3) when 404 responses lack proper Azure error codes to help diagnose proxy/WAF issues
- Safe Default: Returns `false` (preserves resource in state) when error code validation fails
- Comprehensive Coverage: Applied to all three error types:
  - `azure.RequestError` - checks `ServiceError.Code`
  - `azcore.ResponseError` - checks `ErrorCode` field
  - `PulumiAzcoreResponseError` - checks `ErrorCode` field
Testing
- `true`
- `false`
- `false`
Impact
Resources will now only be removed from state when Azure itself confirms the resource doesn't exist with a proper error code. This prevents false positives from network intermediaries.
🤖 Generated with Claude Code