OCPBUGS-74573: Add validation for AzureManaged boot diagnostics on Azure Stack Hub#183
Conversation
|
@gpei: This pull request references Jira Issue OCPBUGS-74573, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/jira refresh |
|
@gpei: This pull request references Jira Issue OCPBUGS-74573, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
Requesting review from QA contact: DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
ca79440 to
a48a34e
Compare
a48a34e to
194549c
Compare
WalkthroughThe Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@gpei: This pull request references Jira Issue OCPBUGS-74573, which is valid. 3 validation(s) were run on this bug
Requesting review from QA contact: DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
pkg/cloud/azure/actuators/machine/reconciler_test.go (1)
604-622: Consider explicitly setting CloudEnv for public Azure test case.The test case name implies it's testing "public Azure" behavior, but the scope doesn't explicitly set
CloudEnv. While this works (emptycloudEnvis not Stack Hub), consider usingNewFakeMachineScopewithCloudEnv: string(configv1.AzurePublicCloud)for consistency with the new Stack Hub test cases and to make the test's intent clearer.🔧 Suggested improvement for explicitness
{ name: "with an Azure managed storage account on public Azure", - scope: &actuators.MachineScope{ - Machine: &machinev1.Machine{}, - }, + scope: actuators.NewFakeMachineScope(actuators.FakeMachineScopeParams{ + Machine: &machinev1.Machine{}, + CloudEnv: string(configv1.AzurePublicCloud), + Context: context.Background(), + }), config: &machinev1.AzureMachineProviderSpec{🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/cloud/azure/actuators/machine/reconciler_test.go` around lines 604 - 622, The test case for "with an Azure managed storage account on public Azure" should explicitly set the CloudEnv to public Azure for clarity; update the test to construct the scope via NewFakeMachineScope (or set scope.CloudEnv) and assign CloudEnv: string(configv1.AzurePublicCloud) on the MachineScope used in the test so the intent matches the other Stack Hub cases and the name accurately reflects the environment being exercised (refer to MachineScope, NewFakeMachineScope, and configv1.AzurePublicCloud).pkg/cloud/azure/actuators/machine/reconciler.go (1)
912-950: Consider adding webhook validation for earlier user feedback.Per the previous review discussion, a validating webhook could catch this invalid configuration earlier and provide immediate feedback to users, avoiding error loops in the controller. If the codebase has similar webhook validations for other Azure Stack Hub constraints, this would be a good candidate to add there as well.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/cloud/azure/actuators/machine/reconciler.go` around lines 912 - 950, Add proactive validation via the AzureMachine validating webhook to reject configurations that will fail at runtime: detect when spec.Diagnostics.Boot.StorageAccountType == machinev1.AzureManagedAzureDiagnosticsStorage and the cluster is an Azure Stack Hub (same logic used in createDiagnosticsConfig and scope.IsStackHub()) and return a clear admission error; also validate customer-managed configs (require CustomerManaged != nil and StorageAccountURI non-empty) mirroring the checks in createDiagnosticsConfig so the webhook rejects missing/invalid StorageAccountURI up-front. Ensure the webhook handler uses the AzureMachineProviderSpec type and error messages consistent with the controller (same wording as in createDiagnosticsConfig) so users receive immediate feedback rather than controller error loops.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@pkg/cloud/azure/actuators/machine/reconciler_test.go`:
- Around line 604-622: The test case for "with an Azure managed storage account
on public Azure" should explicitly set the CloudEnv to public Azure for clarity;
update the test to construct the scope via NewFakeMachineScope (or set
scope.CloudEnv) and assign CloudEnv: string(configv1.AzurePublicCloud) on the
MachineScope used in the test so the intent matches the other Stack Hub cases
and the name accurately reflects the environment being exercised (refer to
MachineScope, NewFakeMachineScope, and configv1.AzurePublicCloud).
In `@pkg/cloud/azure/actuators/machine/reconciler.go`:
- Around line 912-950: Add proactive validation via the AzureMachine validating
webhook to reject configurations that will fail at runtime: detect when
spec.Diagnostics.Boot.StorageAccountType ==
machinev1.AzureManagedAzureDiagnosticsStorage and the cluster is an Azure Stack
Hub (same logic used in createDiagnosticsConfig and scope.IsStackHub()) and
return a clear admission error; also validate customer-managed configs (require
CustomerManaged != nil and StorageAccountURI non-empty) mirroring the checks in
createDiagnosticsConfig so the webhook rejects missing/invalid StorageAccountURI
up-front. Ensure the webhook handler uses the AzureMachineProviderSpec type and
error messages consistent with the controller (same wording as in
createDiagnosticsConfig) so users receive immediate feedback rather than
controller error loops.
ℹ️ Review info
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (2)
pkg/cloud/azure/actuators/machine/reconciler.gopkg/cloud/azure/actuators/machine/reconciler_test.go
… Azure Stack Hub Azure Stack Hub does not support Azure-managed boot diagnostics storage and requires customer-managed storage accounts. Without validation, users experience cryptic 400 errors from the Azure API when attempting to use AzureManaged boot diagnostics on Stack Hub. This change adds validation in the createDiagnosticsConfig function to detect when AzureManaged boot diagnostics is configured on Azure Stack Hub and returns a clear InvalidMachineConfiguration error with an actionable message directing users to use CustomerManaged boot diagnostics instead. Fixes: OCPBUGS-74573 Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
🧹 Nitpick comments (1)
pkg/cloud/azure/actuators/machine/reconciler.go (1)
912-925: Consider guardingscopebefore callingscope.IsStackHub().This helper now dereferences
scope; adding a defensive check avoids a panic if a future caller passesnil.Suggested hardening diff
func createDiagnosticsConfig(scope *actuators.MachineScope, config *machinev1.AzureMachineProviderSpec) (*armcompute.DiagnosticsProfile, error) { boot := config.Diagnostics.Boot if boot == nil { return nil, nil } switch boot.StorageAccountType { case machinev1.AzureManagedAzureDiagnosticsStorage: + if scope == nil { + return nil, fmt.Errorf("cannot validate AzureManaged boot diagnostics: machine scope is nil") + } // Validate that AzureManaged boot diagnostics is not used on Azure Stack Hub if scope.IsStackHub() { return nil, machinecontroller.InvalidMachineConfiguration( "AzureManaged boot diagnostics is not supported on Azure Stack Hub", ) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/cloud/azure/actuators/machine/reconciler.go` around lines 912 - 925, The helper createDiagnosticsConfig dereferences scope when calling scope.IsStackHub(), so add a defensive nil check for scope at the start of that function (createDiagnosticsConfig) and handle a nil scope gracefully (return a clear error via machinecontroller.InvalidMachineConfiguration or nil as appropriate) before any calls to scope.IsStackHub(); ensure all early returns use the same error-handling pattern used elsewhere in the function so callers get a consistent error when scope is nil.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@pkg/cloud/azure/actuators/machine/reconciler.go`:
- Around line 912-925: The helper createDiagnosticsConfig dereferences scope
when calling scope.IsStackHub(), so add a defensive nil check for scope at the
start of that function (createDiagnosticsConfig) and handle a nil scope
gracefully (return a clear error via
machinecontroller.InvalidMachineConfiguration or nil as appropriate) before any
calls to scope.IsStackHub(); ensure all early returns use the same
error-handling pattern used elsewhere in the function so callers get a
consistent error when scope is nil.
ℹ️ Review info
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (2)
pkg/cloud/azure/actuators/machine/reconciler.gopkg/cloud/azure/actuators/machine/reconciler_test.go
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: damdo The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
@sunzhaohua2 @huali9 could you please coordinate with @gpei for verification? Thanks! |
|
/verified by @sunzhaohua2 After the fix |
|
@sunzhaohua2: This PR has been marked as verified by DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@gpei: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
@gpei: Jira Issue Verification Checks: Jira Issue OCPBUGS-74573 Jira Issue OCPBUGS-74573 has been moved to the MODIFIED state and will move to the VERIFIED state when the change is available in an accepted nightly payload. 🕓 DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Fix included in accepted release 4.22.0-0.nightly-2026-03-10-100251 |
Azure Stack Hub does not support
AzureManagedboot diagnostics storage and requiresUserManagedstorage accounts. Without validation, users experience cryptic 400 errors from the Azure API when attempting to useAzureManagedboot diagnostics on Stack Hub.This change adds validation in the
createDiagnosticsConfigfunction to detect whenAzureManagedboot diagnostics is configured on Azure Stack Hub and returns a clear InvalidMachineConfiguration error.Summary by CodeRabbit