Skip to content

Fix IsNotFound to validate Azure error codes before removing resources#4416

Merged
EronWright merged 1 commit into
masterfrom
eronwright/fix-isnotfound-error-code-validation
Nov 18, 2025
Merged

Fix IsNotFound to validate Azure error codes before removing resources#4416
EronWright merged 1 commit into
masterfrom
eronwright/fix-isnotfound-error-code-validation

Conversation

@EronWright

@EronWright EronWright commented Nov 18, 2025

Copy link
Copy Markdown
Contributor

Summary

Fixes #4415 based on the technical analysis from https://github.com/pulumi/customer-support/issues/2476#issuecomment-3544242719

The IsNotFound() function was incorrectly treating any HTTP 404 response as a legitimate Azure resource deletion, even when the 404 came from proxies, WAFs, or network intermediaries rather than from Azure itself. This caused resources to be unexpectedly removed from Pulumi state.

Problem

Resources were being incorrectly removed from state upon a non-authoritative 404 response, such as from a proxy or serving layer.

The previous implementation only checked the HTTP status code without validating that the response contained authentic Azure error codes.

Solution

Updated IsNotFound() to validate error codes before confirming deletion:

  1. Error Code Validation: Now checks for valid Azure "not found" error codes:

    • NotFound
    • ResourceNotFound
    • ResourceGroupNotFound
  2. Logging: Added warning logs (level 3) when 404 responses lack proper Azure error codes to help diagnose proxy/WAF issues

  3. Safe Default: Returns false (preserves resource in state) when error code validation fails

  4. Comprehensive Coverage: Applied to all three error types:

    • azure.RequestError - checks ServiceError.Code
    • azcore.ResponseError - checks ErrorCode field
    • PulumiAzcoreResponseError - checks ErrorCode field

Testing

  • ✅ Added comprehensive test coverage for all scenarios
  • ✅ Tests verify 404 with valid Azure error codes returns true
  • ✅ Tests verify 404 without error codes returns false
  • ✅ Tests verify 404 with invalid error codes returns false
  • ✅ All azure package tests passing

Impact

Resources will now only be removed from state when Azure itself confirms the resource doesn't exist with a proper error code. This prevents false positives from network intermediaries.

🤖 Generated with Claude Code

The IsNotFound() function was incorrectly treating any HTTP 404 response
as a legitimate resource deletion, even when the 404 came from proxies,
WAFs, or network intermediaries rather than from Azure itself. This caused
resources to be unexpectedly removed from Pulumi state.

Changes:
- Updated IsNotFound() to validate error codes before confirming deletion
- Valid Azure "not found" codes: NotFound, ResourceNotFound, ResourceGroupNotFound
- Added warning logs when 404 lacks valid Azure error codes
- Returns false (preserves resource) when error code validation fails
- Applied to all error types: azure.RequestError, azcore.ResponseError, PulumiAzcoreResponseError

Testing:
- Added comprehensive test coverage for all scenarios
- Tests verify 404 with valid codes returns true
- Tests verify 404 without codes returns false
- All azure package tests passing

Fixes #4415

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@github-actions

Copy link
Copy Markdown
Contributor

Does the PR have any schema changes?

Looking good! No breaking changes found.
No new resources/functions.

@EronWright EronWright requested a review from a team November 18, 2025 01:18
@codecov

codecov Bot commented Nov 18, 2025

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 78.12500% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 59.43%. Comparing base (a40b9c5) to head (9aaa814).
⚠️ Report is 1 commits behind head on master.

Files with missing lines Patch % Lines
provider/pkg/azure/azure.go 78.12% 7 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4416      +/-   ##
==========================================
+ Coverage   59.37%   59.43%   +0.05%     
==========================================
  Files          91       91              
  Lines       11450    11478      +28     
==========================================
+ Hits         6799     6822      +23     
- Misses       4015     4020       +5     
  Partials      636      636              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Comment thread provider/pkg/azure/azure.go
Comment thread provider/pkg/azure/azure_test.go
@EronWright EronWright self-assigned this Nov 18, 2025
@EronWright

Copy link
Copy Markdown
Contributor Author

Manual Test Results ✅

I've successfully tested the IsNotFound() fix with a live Azure deployment. The fix correctly validates Azure error codes before removing resources from state.

Test Setup

Built and installed local provider with the fix:

make provider
cp bin/pulumi-resource-azure-native ~/go/bin/

Created test stack with:

  • Resource Group (test-rgfed22b20)
  • Storage Account (testsa38c4fb5c)

Test 1: Individual Resource Deletion (ResourceNotFound)

Actions:

  1. Deleted storage account via Azure CLI: az storage account delete --name testsa38c4fb5c --resource-group test-rgfed22b20 --yes
  2. Ran pulumi refresh --yes --logtostderr -v=9

Result:PASSED

The provider correctly detected the deletion with a valid Azure error code and removed the resource from state:

I1118 12:54:28.331050   67613 provider_plugin.go:1422] Provider[azure-native].Read(...storageAccounts/testsa38c4fb5c,...) executing
...
I1118 12:54:29.090601   67613 provider_plugin.go:1546] Provider[azure-native].Read(...storageAccounts/testsa38c4fb5c,...) success; id="", #outs=0, #inputs=0
 -  azure-native:storage:StorageAccount testsa deleted (0.76s)

Key observations:

  • Provider Read returned empty id="" indicating resource not found
  • IsNotFound validated the Azure error code (ResourceNotFound)
  • Resource cleanly removed from state
  • No proxy/WAF warnings - confirming legitimate Azure response

Test 2: Resource Group Deletion (ResourceGroupNotFound)

Actions:

  1. Deleted entire resource group: az group delete --name test-rgfed22b20 --yes
  2. Ran pulumi refresh --yes --logtostderr -v=9

Result:PASSED

The provider correctly detected the resource group deletion with the ResourceGroupNotFound error code:

I1118 12:57:58.901515   72744 provider_plugin.go:1422] Provider[azure-native].Read(/subscriptions/.../resourceGroups/test-rgfed22b20,...) executing
...
I1118 12:57:59.691423   72744 provider_plugin.go:1546] Provider[azure-native].Read(/subscriptions/.../resourceGroups/test-rgfed22b20,...) success; id="", #outs=0, #inputs=0
 -  azure-native:resources:ResourceGroup test-rg deleted (0.79s)

Key observations:

  • Resource group properly detected as deleted with valid error code
  • IsNotFound validated ResourceGroupNotFound code
  • Resource cleanly removed from state
  • No false warnings about proxy/WAF responses

Verification

Final state confirmed both resources removed:

$ pulumi stack --show-urns
Current stack resources (2):
    TYPE                              NAME
    pulumi:pulumi:Stack               isnotfound-test-dev
    └─ pulumi:providers:azure-native  default_3_10_1

Conclusion

The fix successfully:
✅ Validates Azure error codes (ResourceNotFound, ResourceGroupNotFound) before confirming deletion
✅ Prevents false positives from proxy/WAF 404 responses
✅ Maintains proper state synchronization with Azure
✅ Logs warnings when 404 lacks valid error codes (though we didn't trigger this scenario)

This manual test confirms the unit test results and validates that the fix will prevent the issue described in #4415 where resources were incorrectly removed from state due to proxy/WAF 404 responses.

@EronWright EronWright merged commit 3c4a977 into master Nov 18, 2025
24 checks passed
@EronWright EronWright deleted the eronwright/fix-isnotfound-error-code-validation branch November 18, 2025 21:07
@pulumi-bot

Copy link
Copy Markdown
Contributor

This PR has been shipped in release v3.11.0.

Zaid-Ajaj added a commit that referenced this pull request Jan 13, 2026
)

Fixes #4482 

In #4416 we
implemented the functionality to correctly identify "not found" errors
from azure. We do this by recognizing the error code of the azure
response which can be different for different resources. In the case of
`ManagementLockByScope` resources, the error code is `LockNotFound`.
This PR adds this to the list of recognisable errors.

It is worth noting that `Read()` is doing the right thing here,
returning an empty ID to signal to the engine that a resource is
deleted.
```go
if err != nil {
	if azure.IsNotFound(err) {
		// 404 means that the resource was deleted.
		return &rpc.ReadResponse{Id: ""}, nil
	}
	return nil, err
}
```
The `return nil, err` is the path of the code that the user is hitting
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Provider should validate Azure error codes before removing resources from state

3 participants