add build expiry#239
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds optional per-build TTL and expiry tracking: API types gain Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant APIServer
participant K8sAPI
participant OperatorConfig
participant Controller
Client->>APIServer: POST /build (body includes ttl)
APIServer->>APIServer: resolveAndClampTTL(req.TTL)
APIServer->>OperatorConfig: loadOperatorConfig()
OperatorConfig-->>APIServer: max/default TTL
APIServer->>APIServer: parse & clamp TTL
APIServer->>K8sAPI: Create ImageBuild (spec.ttl)
K8sAPI-->>APIServer: Created
APIServer-->>Client: BuildResponse (expiresAt)
K8sAPI->>Controller: Notify ImageBuild creation/update
Controller->>Controller: Reconcile()
Controller->>Controller: checkExpiry() -> resolveEffectiveTTL()
alt Not expired
Controller->>K8sAPI: Update status.expiresAt
K8sAPI-->>Controller: Status updated
Controller->>Controller: Requeue until expiresAt
else Expired
Controller->>K8sAPI: Emit BuildExpired event
Controller->>K8sAPI: Delete ImageBuild
K8sAPI-->>Controller: Deleted
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/controller/imagebuild/controller.go (1)
142-172:⚠️ Potential issue | 🔴 CriticalCritical:
checkExpiry'sRequeueAfteris dropped for non-expired builds — expiry will not fire on time.When the build is not yet expired,
checkExpiryreturnsctrl.Result{RequeueAfter: remaining}, false, nil. The conditionexpired || err != nilis false, so the returnedresultis discarded and reconciliation falls through to the phase switch. Terminal phases (phaseCompletedreturnsctrl.Result{}, nil;phaseFailedthe same when no cleanup error) return without any requeue, so the controller will not wake up atexpiresAtand theImageBuildwill only be deleted on the next controller-runtime resync (up to ~10h later, assuming defaults), or whenever something else triggers reconciliation. The feature therefore does not delete builds reliably at their TTL.Additionally, requeue signals from
checkExpiryduring non-terminal phases are similarly discarded.🐛 Proposed fix: propagate the requeue and take the shorter of the two
- if result, expired, err := r.checkExpiry(ctx, imageBuild); expired || err != nil { - return result, err - } - - switch imageBuild.Status.Phase { + expiryResult, expired, err := r.checkExpiry(ctx, imageBuild) + if expired || err != nil { + return expiryResult, err + } + + var phaseResult ctrl.Result + var phaseErr error + switch imageBuild.Status.Phase { case "": - return r.handleInitialState(ctx, imageBuild) + phaseResult, phaseErr = r.handleInitialState(ctx, imageBuild) case "Uploading": - return r.handleUploadingState(ctx, imageBuild) + phaseResult, phaseErr = r.handleUploadingState(ctx, imageBuild) case phaseBuilding: - return r.handleBuildingState(ctx, imageBuild) + phaseResult, phaseErr = r.handleBuildingState(ctx, imageBuild) case "Pushing": - return r.handlePushingState(ctx, imageBuild) + phaseResult, phaseErr = r.handlePushingState(ctx, imageBuild) case "Flashing": - return r.handleFlashingState(ctx, imageBuild) + phaseResult, phaseErr = r.handleFlashingState(ctx, imageBuild) case phaseCompleted: - return r.handleCompletedState(ctx, imageBuild) + phaseResult, phaseErr = r.handleCompletedState(ctx, imageBuild) case phaseFailed: if err := r.cleanupTransientSecrets(ctx, imageBuild, r.Log); err != nil { - return ctrl.Result{RequeueAfter: secretCleanupRequeue}, nil + phaseResult = ctrl.Result{RequeueAfter: secretCleanupRequeue} } - return ctrl.Result{}, nil default: log.Info("Unknown phase", "phase", imageBuild.Status.Phase) - return ctrl.Result{}, nil } + if phaseErr != nil { + return phaseResult, phaseErr + } + // Ensure we still wake up at TTL even if the phase result has no sooner requeue. + if expiryResult.RequeueAfter > 0 && + (phaseResult.RequeueAfter == 0 || expiryResult.RequeueAfter < phaseResult.RequeueAfter) { + phaseResult.RequeueAfter = expiryResult.RequeueAfter + } + return phaseResult, nilAlternatively, keep the early-return flow but also return when
expiryResult.RequeueAfter > 0 && imageBuild.Status.Phase is terminal, and watchStatus.ExpiresAtchanges to re-trigger.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/imagebuild/controller.go` around lines 142 - 172, checkExpiry's ctrl.Result (RequeueAfter) is being dropped when expired==false, so expiry requeues never fire; capture the result returned by checkExpiry (e.g. expiryResult, expired, err := r.checkExpiry(...)) and propagate its RequeueAfter when appropriate instead of discarding it: call checkExpiry first, return immediately if expired or err!=nil as before, but for non-expired cases keep expiryResult and when returning from the phase switch (including terminal phases handled by handleCompletedState, phaseFailed/cleanupTransientSecrets, and other handlers like handleInitialState/handleUploadingState/handleBuildingState/handlePushingState/handleFlashingState) ensure you return the shorter/min RequeueAfter between the handler's result and expiryResult (or expiryResult if the handler returns no requeue) so the controller wakes at ExpiresAt; alternatively, if keeping early returns, ensure terminal-phase returns honor expiryResult.RequeueAfter > 0 and requeue accordingly.
🧹 Nitpick comments (2)
internal/controller/imagebuild/controller.go (2)
52-73: Nit: colocateeventReasonBuildExpiredwith the other event-reason constants.
eventReasonBuildExpiredwas added to themaxK8sNameLengthblock at line 54, but all othereventReason*constants live in the separate block at lines 60-73. Moving it there improves discoverability and keeps the grouping intentional.♻️ Proposed diff
maxK8sNameLength = 63 - - eventReasonBuildExpired = "BuildExpired" ) @@ const ( eventReasonPhaseChanged = "PhaseChanged" @@ eventReasonDiskBuildDone = "DiskBuildCompleted" + eventReasonBuildExpired = "BuildExpired" )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/imagebuild/controller.go` around lines 52 - 73, Move the eventReasonBuildExpired constant out of the block with maxK8sNameLength and colocate it with the other eventReason* constants; specifically, remove eventReasonBuildExpired from the block containing maxK8sNameLength and add it into the const block that contains eventReasonPhaseChanged, eventReasonPipelineRunReady, eventReasonUploadPodReady, etc., so all eventReason* constants (including eventReasonBuildExpired) are grouped together for discoverability.
456-480: Default TTL fallback short-circuits the operator-config getter.When
r.Get(...)succeeds butoperatorConfig.Spec.OSBuilds == nil, theelsebranch setsttlStr = automotivev1alpha1.DefaultBuildTTL. This is equivalent to whatGetDefaultBuildTTL()would return on a nil receiver, so the logic is correct, but the asymmetry (sometimes consulting the getter, sometimes the constant directly) is a bit awkward. More importantly: if the API fetch fails for any non-NotFoundreason (e.g., transient API-server hiccup), the error is swallowed and we silently fall back to the hardcoded default, which can quietly override a cluster-admin-configured TTL. Consider surfacing non-NotFounderrors so the caller can requeue instead of applying the wrong TTL.♻️ Proposed diff
if ttlStr == "" { operatorConfig := &automotivev1alpha1.OperatorConfig{} - if err := r.Get(ctx, types.NamespacedName{ - Name: "config", Namespace: OperatorNamespace, - }, operatorConfig); err == nil && operatorConfig.Spec.OSBuilds != nil { - ttlStr = operatorConfig.Spec.OSBuilds.GetDefaultBuildTTL() - } else { - ttlStr = automotivev1alpha1.DefaultBuildTTL - } + err := r.Get(ctx, types.NamespacedName{ + Name: "config", Namespace: OperatorNamespace, + }, operatorConfig) + switch { + case err == nil: + ttlStr = operatorConfig.Spec.OSBuilds.GetDefaultBuildTTL() + case errors.IsNotFound(err): + ttlStr = automotivev1alpha1.DefaultBuildTTL + default: + return 0, fmt.Errorf("failed to read OperatorConfig for TTL: %w", err) + } }(
OSBuildsConfig.GetDefaultBuildTTL()already nil-checks the receiver, so this is safe.)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/imagebuild/controller.go` around lines 456 - 480, resolveEffectiveTTL currently swallows any r.Get(...) error and falls back to the hardcoded DefaultBuildTTL; change the error handling so non-NotFound API errors are returned to the caller (so the reconcile can requeue) while only treating NotFound as "use default". Concretely: after calling r.Get(ctx, ..., operatorConfig) if err != nil then if !apierrors.IsNotFound(err) return 0, err else set ttlStr = automotivev1alpha1.DefaultBuildTTL; if err == nil set ttlStr = operatorConfig.Spec.OSBuilds.GetDefaultBuildTTL() (GetDefaultBuildTTL already nil-checks the receiver). Import k8s.io/apimachinery/pkg/api/errors as apierrors and update resolveEffectiveTTL accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@api/v1alpha1/imagebuild_types.go`:
- Around line 80-85: You added a new field Spec.TTL and Status.ExpiresAt but did
not regenerate code/CRDs; run the generator to update deepcopy and CRD
manifests: execute "make generate manifests" (or the project equivalent) to
regenerate zz_generated.deepcopy.go and the CRD YAMLs so
ImageBuildStatus.ExpiresAt (pointer) is deep-copied and the CRD
validation/printer columns include Spec.TTL, then commit the generated files.
- Around line 80-85: The TTL behavior is inconsistent with its documentation:
either make expiry only apply to terminal builds or document that TTL can delete
in-flight builds; to fix, decide which behavior you want and implement one of
two changes — (A) restrict the expiry check in checkExpiry
(internal/controller/imagebuild/controller.go) to only consider builds with
CompletionTime set or Phase == "Completed"/"Failed" (so in-progress phases are
excluded), or (B) if mid-flight expiry is intended, update the TTL field comment
in ImageBuild (api/v1alpha1/imagebuild_types.go) and the related OperatorConfig
comment (api/v1alpha1/operatorconfig_types.go) to state TTL applies to any phase
and warn about short user-set TTLs interacting with long uploads; ensure the
updated docs mention the "0" disable sentinel and the potential for in-progress
deletion.
In `@api/v1alpha1/operatorconfig_types.go`:
- Around line 560-564: The field comment for MaxBuildTTL is inaccurate:
resolveAndClampTTL in internal/buildapi/server.go currently returns an error
when a requested TTL exceeds MaxBuildTTL rather than silently clamping it.
Update the MaxBuildTTL docstring to state that requests with ttl > MaxBuildTTL
are rejected (include that it uses Go duration format and that "0" means no
maximum) so the comment matches the behavior in resolveAndClampTTL, or
alternatively modify resolveAndClampTTL to actually clamp values to MaxBuildTTL
if you prefer the original wording — but ensure the MaxBuildTTL comment and the
resolveAndClampTTL behavior are consistent.
In `@internal/buildapi/server.go`:
- Around line 1520-1527: The code currently ignores the error from
loadOperatorConfigFn; change the logic in the block that calls
loadOperatorConfigFn so that if loadOperatorConfigFn returns a non-nil err you
only treat it as benign when apierrors.IsNotFound(err) is true—otherwise
return/surface that error (fail closed) instead of proceeding; likewise, when
reading operatorCfg.Spec.OSBuilds.GetMaxBuildTTL(), treat a time.ParseDuration
parseErr as an actual error to return (not silently ignore) so malformed TTL
policies also fail closed; reference loadOperatorConfigFn, operatorCfg,
Spec.OSBuilds.GetMaxBuildTTL, time.ParseDuration and ensure the function returns
an error when these conditions occur instead of proceeding as if no policy
exists.
In `@internal/buildapi/types.go`:
- Around line 147-148: getBuildTemplate in internal/buildapi/server.go is not
preserving BuildRequest.TTL from the template, so recreated builds drop the
original TTL; update getBuildTemplate to copy the TTL from the stored template
into the returned BuildRequest (e.g., assign template.TTL into the
BuildRequest/ctx.Request that getBuildTemplate returns) so the TTL value is
preserved when templates are used to recreate builds.
- Around line 147-148: Update the OpenAPI schemas used for API docs/clients so
BuildRequest includes a "ttl" property and BuildResponse includes an "expiresAt"
property: add "ttl" (type: string, JSON name "ttl", include the existing TTL
semantics in the description) to the BuildRequest schema that maps to the
BuildRequest TTL field, and add "expiresAt" (type: string, format: date-time,
JSON name "expiresAt") to the BuildResponse schema to match the BuildResponse
ExpiresAt field; ensure both the internal buildapi OpenAPI schema and the public
docs OpenAPI schema are updated so generated clients and docs reflect
BuildRequest.TTL and BuildResponse.ExpiresAt.
In `@internal/controller/imagebuild/controller.go`:
- Around line 391-454: The checkExpiry function should use the reconciler's
nil-safe emitter and surface TTL parse failures: replace the direct
r.Recorder.Eventf call in checkExpiry with r.emitEventf(...) (same message and
args) so it won't NPE when Recorder is nil, and when resolveEffectiveTTL returns
an error emit a Warning event via r.emitEventf(...) that includes the parse
error (and optionally set a status condition) instead of silently skipping
expiry; refer to the checkExpiry, resolveEffectiveTTL, and emitEventf symbols to
locate the changes.
---
Outside diff comments:
In `@internal/controller/imagebuild/controller.go`:
- Around line 142-172: checkExpiry's ctrl.Result (RequeueAfter) is being dropped
when expired==false, so expiry requeues never fire; capture the result returned
by checkExpiry (e.g. expiryResult, expired, err := r.checkExpiry(...)) and
propagate its RequeueAfter when appropriate instead of discarding it: call
checkExpiry first, return immediately if expired or err!=nil as before, but for
non-expired cases keep expiryResult and when returning from the phase switch
(including terminal phases handled by handleCompletedState,
phaseFailed/cleanupTransientSecrets, and other handlers like
handleInitialState/handleUploadingState/handleBuildingState/handlePushingState/handleFlashingState)
ensure you return the shorter/min RequeueAfter between the handler's result and
expiryResult (or expiryResult if the handler returns no requeue) so the
controller wakes at ExpiresAt; alternatively, if keeping early returns, ensure
terminal-phase returns honor expiryResult.RequeueAfter > 0 and requeue
accordingly.
---
Nitpick comments:
In `@internal/controller/imagebuild/controller.go`:
- Around line 52-73: Move the eventReasonBuildExpired constant out of the block
with maxK8sNameLength and colocate it with the other eventReason* constants;
specifically, remove eventReasonBuildExpired from the block containing
maxK8sNameLength and add it into the const block that contains
eventReasonPhaseChanged, eventReasonPipelineRunReady, eventReasonUploadPodReady,
etc., so all eventReason* constants (including eventReasonBuildExpired) are
grouped together for discoverability.
- Around line 456-480: resolveEffectiveTTL currently swallows any r.Get(...)
error and falls back to the hardcoded DefaultBuildTTL; change the error handling
so non-NotFound API errors are returned to the caller (so the reconcile can
requeue) while only treating NotFound as "use default". Concretely: after
calling r.Get(ctx, ..., operatorConfig) if err != nil then if
!apierrors.IsNotFound(err) return 0, err else set ttlStr =
automotivev1alpha1.DefaultBuildTTL; if err == nil set ttlStr =
operatorConfig.Spec.OSBuilds.GetDefaultBuildTTL() (GetDefaultBuildTTL already
nil-checks the receiver). Import k8s.io/apimachinery/pkg/api/errors as apierrors
and update resolveEffectiveTTL accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: e0860e76-8b8b-4abb-a5f6-f9dd74f1010c
⛔ Files ignored due to path filters (2)
config/crd/bases/automotive.sdv.cloud.redhat.com_imagebuilds.yamlis excluded by!config/crd/bases/**config/crd/bases/automotive.sdv.cloud.redhat.com_operatorconfigs.yamlis excluded by!config/crd/bases/**
📒 Files selected for processing (11)
api/v1alpha1/imagebuild_types.goapi/v1alpha1/operatorconfig_types.goapi/v1alpha1/zz_generated.deepcopy.gocmd/caib/buildcmd/build.gocmd/caib/buildcmd/build_disk_test.gocmd/caib/image/image.gocmd/caib/main.gocmd/caib/runtime_wiring.gointernal/buildapi/server.gointernal/buildapi/types.gointernal/controller/imagebuild/controller.go
| // TTL is the time-to-live for this build. The build is automatically deleted | ||
| // after this duration past its completion (or creation if still in progress). | ||
| // Uses Go duration format (e.g. "24h", "72h", "168h"). | ||
| // Empty uses the OperatorConfig default. Set to "0" to disable expiry. | ||
| // +optional | ||
| TTL string `json:"ttl,omitempty"` |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Reminder: regenerate CRDs/DeepCopy after type changes.
The PR checklist shows make manifests generate is unchecked. Adding Spec.TTL and Status.ExpiresAt requires regenerating both the CRD manifests (for validation, additionalPrinterColumns, etc.) and zz_generated.deepcopy.go so ImageBuildStatus.ExpiresAt (pointer type) is correctly deep-copied. Please ensure make generate manifests is run and committed.
As per coding guidelines: "After modifying types in api/v1alpha1/, run make generate manifests to regenerate DeepCopy methods and CRDs".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@api/v1alpha1/imagebuild_types.go` around lines 80 - 85, You added a new field
Spec.TTL and Status.ExpiresAt but did not regenerate code/CRDs; run the
generator to update deepcopy and CRD manifests: execute "make generate
manifests" (or the project equivalent) to regenerate zz_generated.deepcopy.go
and the CRD YAMLs so ImageBuildStatus.ExpiresAt (pointer) is deep-copied and the
CRD validation/printer columns include Spec.TTL, then commit the generated
files.
🧩 Analysis chain
🏁 Script executed:
# Check the TTL-related code in the controller
fd 'controller.go' api/v1alpha1 internal/controllerRepository: centos-automotive-suite/automotive-dev-operator
Length of output: 423
🏁 Script executed:
# Search for DefaultBuildTTL and DefaultBuildTimeoutMinutes definitions
rg 'DefaultBuildTTL|DefaultBuildTimeoutMinutes' --type goRepository: centos-automotive-suite/automotive-dev-operator
Length of output: 1471
🏁 Script executed:
# Look at the controller.go file around lines 425-430
rg -n 'TTL' internal/controller --type go -B 3 -A 3Repository: centos-automotive-suite/automotive-dev-operator
Length of output: 3777
🏁 Script executed:
# Check the imagebuild_types.go file for the TTL field
fd 'imagebuild_types.go' -type fRepository: centos-automotive-suite/automotive-dev-operator
Length of output: 261
🏁 Script executed:
# Read the controller.go file to see the complete checkExpiry implementation
cat -n internal/controller/imagebuild/controller.go | sed -n '391,454p'Repository: centos-automotive-suite/automotive-dev-operator
Length of output: 2551
🏁 Script executed:
# Also check around the lines mentioned (425-430)
cat -n internal/controller/imagebuild/controller.go | sed -n '415,440p'Repository: centos-automotive-suite/automotive-dev-operator
Length of output: 957
🏁 Script executed:
# Look at ImageBuild status phases
rg -A 20 'type ImageBuildStatus' api/v1alpha1/imagebuild_types.goRepository: centos-automotive-suite/automotive-dev-operator
Length of output: 910
🏁 Script executed:
# Search for test files related to TTL logic
find . -name '*test*.go' -type f | xargs grep -l 'TTL\|checkExpiry' 2>/dev/nullRepository: centos-automotive-suite/automotive-dev-operator
Length of output: 129
🏁 Script executed:
# Look for any phase-related logic in the reconciler
rg 'Phase|Status' internal/controller/imagebuild/controller.go | head -20Repository: centos-automotive-suite/automotive-dev-operator
Length of output: 1223
🏁 Script executed:
# Check the current doc comment in imagebuild_types.go more carefully
cat -n api/v1alpha1/imagebuild_types.go | sed -n '75,90p'Repository: centos-automotive-suite/automotive-dev-operator
Length of output: 961
🏁 Script executed:
# Look for any NoExpireAnnotation usage or documentation
rg 'NoExpireAnnotation|no.*expire|disable.*expiry' --type go -B 2 -A 2Repository: centos-automotive-suite/automotive-dev-operator
Length of output: 8935
🏁 Script executed:
# Check if there are any integration tests or examples for TTL behavior
find . -name '*.yaml' -o -name '*.yml' | xargs grep -l 'ttl' 2>/dev/null | head -5Repository: centos-automotive-suite/automotive-dev-operator
Length of output: 220
🏁 Script executed:
# Check the DefaultBuildTTL comment in operatorconfig_types.go to see if intent is documented
cat -n api/v1alpha1/operatorconfig_types.go | grep -A 5 'DefaultBuildTTL ='Repository: centos-automotive-suite/automotive-dev-operator
Length of output: 396
🏁 Script executed:
# Check if there are any tests that verify TTL behavior with different phases
rg -B 5 -A 10 'checkExpiry|TTL.*in.*progress|in.*progress.*TTL' internal/controller/imagebuild --type goRepository: centos-automotive-suite/automotive-dev-operator
Length of output: 2729
🏁 Script executed:
# Look at the reconcile loop to see when checkExpiry is called (in what phase)
rg -B 10 'checkExpiry' internal/controller/imagebuild/controller.go | head -40Repository: centos-automotive-suite/automotive-dev-operator
Length of output: 1013
🏁 Script executed:
# Check the full DefaultBuildTTL comment for context
cat -n api/v1alpha1/operatorconfig_types.go | sed -n '60,75p'Repository: centos-automotive-suite/automotive-dev-operator
Length of output: 909
TTL can delete builds in any phase, including in-progress ones — clarify intent and update documentation.
The code allows deletion of builds in Pending/Uploading/Building/Pushing/Flashing phases if their CreationTimestamp + TTL elapses (since CompletionTime is nil). The constant comment in api/v1alpha1/operatorconfig_types.go (line 63) states "for completed/failed builds," but the actual checkExpiry logic (lines 425-430 of internal/controller/imagebuild/controller.go) applies TTL deletion to any phase without filtering. This creates a documentation-code mismatch.
Either confirm this behavior is intentional and update the doc comments accordingly, or restrict expiry to terminal phases (Completed/Failed) only. If intentional, also note that user-set short TTLs combined with long artifact uploads could trigger mid-flight deletion despite the 24h default.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@api/v1alpha1/imagebuild_types.go` around lines 80 - 85, The TTL behavior is
inconsistent with its documentation: either make expiry only apply to terminal
builds or document that TTL can delete in-flight builds; to fix, decide which
behavior you want and implement one of two changes — (A) restrict the expiry
check in checkExpiry (internal/controller/imagebuild/controller.go) to only
consider builds with CompletionTime set or Phase == "Completed"/"Failed" (so
in-progress phases are excluded), or (B) if mid-flight expiry is intended,
update the TTL field comment in ImageBuild (api/v1alpha1/imagebuild_types.go)
and the related OperatorConfig comment (api/v1alpha1/operatorconfig_types.go) to
state TTL applies to any phase and warn about short user-set TTLs interacting
with long uploads; ensure the updated docs mention the "0" disable sentinel and
the potential for in-progress deletion.
d3b6481 to
19d95b5
Compare
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
internal/buildapi/server.go (1)
1524-1528:⚠️ Potential issue | 🟠 MajorFail closed on invalid
maxBuildTTLvalues.
parseErris ignored, and parsed non-positive values like-1h/0hcan poison the effective TTL policy. Reject malformed or non-positive max values, keeping only exact"0"as the “no maximum” sentinel.Proposed fix
if operatorCfg != nil && operatorCfg.Spec.OSBuilds != nil { if maxStr := operatorCfg.Spec.OSBuilds.GetMaxBuildTTL(); maxStr != "" && maxStr != "0" { - if maxDur, parseErr := time.ParseDuration(maxStr); parseErr == nil && dur > maxDur { + maxDur, parseErr := time.ParseDuration(maxStr) + if parseErr != nil { + return "", fmt.Errorf("invalid maxBuildTTL %q: %w", maxStr, parseErr) + } + if maxDur <= 0 { + return "", fmt.Errorf("maxBuildTTL must be positive or \"0\"") + } + if dur > maxDur { return maxStr, nil } } }After the fix, add coverage for malformed, negative, and zero-duration max policies:
#!/bin/bash # Expect tests covering invalid/negative/zero-form MaxBuildTTL policy values. rg -n -C2 'MaxBuildTTL|resolveAndClampTTL|invalid maxBuildTTL|negative|max.*0h' internal/buildapi --type go🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/buildapi/server.go` around lines 1524 - 1528, The current code ignores parseErr and treats parsed non-positive durations as valid; change the logic in the TTL resolution path (where operatorCfg.Spec.OSBuilds.GetMaxBuildTTL() is read and time.ParseDuration is called) so that any parse error or a parsed duration <= 0 (except the exact string "0" which remains the “no maximum” sentinel) is rejected and does not override the effective TTL; specifically, if GetMaxBuildTTL() != "" then first check if the string == "0" (allow as sentinel), otherwise call time.ParseDuration and if parseErr != nil or parsedDur <= 0 return an error (or skip using the max) instead of accepting it; update the code paths that return maxStr (the branch that currently returns when maxDur < dur) to only do so for valid positive parsed durations, and add unit tests covering malformed values, negative durations (e.g. "-1h"), and zero-duration policies.
🧹 Nitpick comments (2)
internal/controller/imagebuild/expiry_test.go (1)
88-107: Test gap:NoExpireAnnotationbranch doesn't cover clearing an existingstatus.ExpiresAt.
checkExpiry(controller.go Lines 406-413) has a branch that callsclearExpiresAtwhen the no-expire annotation is present andstatus.ExpiresAt != nil. This test seeds an ImageBuild without a pre-existingExpiresAt, so theclearExpiresAtpath is never exercised. Consider adding a variant that pre-setsstatus.ExpiresAtand asserts it is reset tonilaftercheckExpiry. The same gap applies toTestCheckExpiry_TTLZeroDisablesExpiry(Lines 109-123) for thettl == 0clear branch.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/imagebuild/expiry_test.go` around lines 88 - 107, Add test cases that exercise the branch where an existing status.ExpiresAt is cleared: modify TestCheckExpiry_NoExpireAnnotation to seed the ImageBuild's status.ExpiresAt with a non-nil timestamp before calling expiryReconciler.checkExpiry (so the controller path that calls clearExpiresAt is exercised), then after calling r.checkExpiry assert that the returned object in the API has status.ExpiresAt == nil; repeat the same pattern for TestCheckExpiry_TTLZeroDisablesExpiry to cover the ttl==0 branch that should also call clearExpiresAt. Ensure you reference the ImageBuild under test (e.g., the "pinned-build" instance) and use the existing r.checkExpiry and r.Get assertions to validate the ExpiresAt field is reset.internal/controller/imagebuild/controller.go (1)
513-525:clearExpiresAtissues an unconditionalStatus().Updateeven whenExpiresAtis alreadynil.Both callers in
checkExpiryalready gate onimageBuild.Status.ExpiresAt != nilbefore invokingclearExpiresAt(Lines 407, 421), so this is currently fine — but the helper itself doesn't re-check the fresh object. If a future caller skips the guard, this writes on every reconcile. Consider defensively no-op'ing whenfresh.Status.ExpiresAt == nilafter the GET:♻️ Suggested tightening
if err := r.Get(ctx, types.NamespacedName{ Name: imageBuild.Name, Namespace: imageBuild.Namespace, }, fresh); err != nil { return err } + if fresh.Status.ExpiresAt == nil { + return nil + } fresh.Status.ExpiresAt = nil return r.Status().Update(ctx, fresh)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/imagebuild/controller.go` around lines 513 - 525, The clearExpiresAt helper currently always calls r.Status().Update even if the fresh ImageBuild already has Status.ExpiresAt == nil; after fetching fresh in clearExpiresAt (function ImageBuildReconciler.clearExpiresAt) add a defensive no-op: if fresh.Status.ExpiresAt == nil return nil to avoid an unnecessary status update (this preserves existing callers in checkExpiry but prevents future callers from causing a redundant write).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/controller/imagebuild/controller.go`:
- Around line 492-510: The ensureExpiresAt function is comparing a
high-precision expiresAt against the stored metav1.Time which is
second-precision, causing unnecessary Status().Update calls; fix it by
truncating expiresAt to API storage precision using metav1.NewTime(expiresAt)
and compare/set against that truncated time (use t := metav1.NewTime(expiresAt)
and compare imageBuild.Status.ExpiresAt.Time.Equal(t.Time) or set
fresh.Status.ExpiresAt = &t before Update), and consider switching to
client.MergeFrom patch semantics around r.Get / r.Status().Update to avoid
conflicts with concurrent status writers in
handleBuildingState/checkBuildProgress.
---
Duplicate comments:
In `@internal/buildapi/server.go`:
- Around line 1524-1528: The current code ignores parseErr and treats parsed
non-positive durations as valid; change the logic in the TTL resolution path
(where operatorCfg.Spec.OSBuilds.GetMaxBuildTTL() is read and time.ParseDuration
is called) so that any parse error or a parsed duration <= 0 (except the exact
string "0" which remains the “no maximum” sentinel) is rejected and does not
override the effective TTL; specifically, if GetMaxBuildTTL() != "" then first
check if the string == "0" (allow as sentinel), otherwise call
time.ParseDuration and if parseErr != nil or parsedDur <= 0 return an error (or
skip using the max) instead of accepting it; update the code paths that return
maxStr (the branch that currently returns when maxDur < dur) to only do so for
valid positive parsed durations, and add unit tests covering malformed values,
negative durations (e.g. "-1h"), and zero-duration policies.
---
Nitpick comments:
In `@internal/controller/imagebuild/controller.go`:
- Around line 513-525: The clearExpiresAt helper currently always calls
r.Status().Update even if the fresh ImageBuild already has Status.ExpiresAt ==
nil; after fetching fresh in clearExpiresAt (function
ImageBuildReconciler.clearExpiresAt) add a defensive no-op: if
fresh.Status.ExpiresAt == nil return nil to avoid an unnecessary status update
(this preserves existing callers in checkExpiry but prevents future callers from
causing a redundant write).
In `@internal/controller/imagebuild/expiry_test.go`:
- Around line 88-107: Add test cases that exercise the branch where an existing
status.ExpiresAt is cleared: modify TestCheckExpiry_NoExpireAnnotation to seed
the ImageBuild's status.ExpiresAt with a non-nil timestamp before calling
expiryReconciler.checkExpiry (so the controller path that calls clearExpiresAt
is exercised), then after calling r.checkExpiry assert that the returned object
in the API has status.ExpiresAt == nil; repeat the same pattern for
TestCheckExpiry_TTLZeroDisablesExpiry to cover the ttl==0 branch that should
also call clearExpiresAt. Ensure you reference the ImageBuild under test (e.g.,
the "pinned-build" instance) and use the existing r.checkExpiry and r.Get
assertions to validate the ExpiresAt field is reset.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 9871d6c9-b2b7-43f0-b8ed-0bffac21f2ec
⛔ Files ignored due to path filters (2)
config/crd/bases/automotive.sdv.cloud.redhat.com_imagebuilds.yamlis excluded by!config/crd/bases/**config/crd/bases/automotive.sdv.cloud.redhat.com_operatorconfigs.yamlis excluded by!config/crd/bases/**
📒 Files selected for processing (15)
api/v1alpha1/imagebuild_types.goapi/v1alpha1/operatorconfig_types.goapi/v1alpha1/zz_generated.deepcopy.gocmd/caib/buildcmd/build.gocmd/caib/buildcmd/build_disk_test.gocmd/caib/image/image.gocmd/caib/main.gocmd/caib/runtime_wiring.godocs/openapi.yamlinternal/buildapi/openapi.yamlinternal/buildapi/server.gointernal/buildapi/ttl_test.gointernal/buildapi/types.gointernal/controller/imagebuild/controller.gointernal/controller/imagebuild/expiry_test.go
✅ Files skipped from review due to trivial changes (2)
- cmd/caib/main.go
- internal/buildapi/types.go
🚧 Files skipped from review as they are similar to previous changes (3)
- api/v1alpha1/zz_generated.deepcopy.go
- cmd/caib/buildcmd/build_disk_test.go
- cmd/caib/buildcmd/build.go
4906efe to
09a75a9
Compare
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (1)
internal/buildapi/server.go (1)
1598-1617:⚠️ Potential issue | 🟠 MajorEnforce
maxBuildTTLfor no-expire requests and malformed policy.Returning
"0"before loadingOperatorConfiglets users disable expiry even whenmaxBuildTTLis configured. Also, a malformedmaxBuildTTLcurrently disables clamping silently becauseparseErris ignored.Proposed fix
- if requestedTTL == "" || requestedTTL == "0" { + if requestedTTL == "" { return requestedTTL, nil } - dur, err := time.ParseDuration(requestedTTL) - if err != nil { - return "", fmt.Errorf("invalid TTL %q: %w", requestedTTL, err) - } - if dur < 0 { - return "", fmt.Errorf("TTL must not be negative") + requestedNoExpire := requestedTTL == "0" + var dur time.Duration + if !requestedNoExpire { + var err error + dur, err = time.ParseDuration(requestedTTL) + if err != nil { + return "", fmt.Errorf("invalid TTL %q: %w", requestedTTL, err) + } + if dur < 0 { + return "", fmt.Errorf("TTL must not be negative") + } } operatorCfg, cfgErr := loadOperatorConfigFn(ctx, k8sClient, namespace) if cfgErr != nil && !k8serrors.IsNotFound(cfgErr) { return "", fmt.Errorf("failed to load OperatorConfig: %w", cfgErr) } if operatorCfg != nil && operatorCfg.Spec.OSBuilds != nil { if maxStr := operatorCfg.Spec.OSBuilds.GetMaxBuildTTL(); maxStr != "" && maxStr != "0" { - if maxDur, parseErr := time.ParseDuration(maxStr); parseErr == nil && dur > maxDur { + maxDur, parseErr := time.ParseDuration(maxStr) + if parseErr != nil { + return "", fmt.Errorf("invalid maxBuildTTL %q: %w", maxStr, parseErr) + } + if requestedNoExpire || dur > maxDur { return maxStr, nil } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/buildapi/server.go` around lines 1598 - 1617, The current logic returns early for requestedTTL == "" or "0" and ignores parse errors for operatorCfg.Spec.OSBuilds.GetMaxBuildTTL(), which allows bypassing maxBuildTTL and silently disables clamping; change the flow so you always load operator config (call loadOperatorConfigFn with ctx, k8sClient, namespace) before deciding to return for a no-expire "0" or blank TTL, and enforce the operator max when present: if operatorCfg.Spec.OSBuilds.GetMaxBuildTTL() yields a non-empty/non-"0" maxStr, parse it (using time.ParseDuration) and if parsing fails return an error (do not ignore parseErr), and if requestedTTL == "0" or parsed dur > maxDur return maxStr; otherwise proceed with the parsed requestedTTL logic and keep the existing negative-duration check—use the symbols requestedTTL, loadOperatorConfigFn, operatorCfg.Spec.OSBuilds.GetMaxBuildTTL(), time.ParseDuration to locate the code to change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/buildapi/server.go`:
- Around line 930-937: When cancelling a build you only flip the CR to
phaseCancelled (build.Status.Phase = phaseCancelled) but do not tear down upload
resources for builds that were in the Uploading state; add logic around the
cancellation block to detect if the build was in "Uploading" (check the previous
phase or build.Status/Spec fields that record the upload) and explicitly delete
the upload pod/related secrets before (or immediately after) updating status —
e.g., find the upload pod by the build's upload-related identifier (pod
name/labels stored on the build, or build.Spec.Upload fields) and call the k8s
client Delete (k8sClient.Delete(ctx, pod) or clientset equivalent);
alternatively, invoke the controller's cancelled-upload handler if one is
available instead of manual deletion. Ensure this runs in the same cancellation
path where you set phaseCancelled and call k8sClient.Status().Update so uploads
are torn down immediately.
- Around line 896-904: The cancellation check in the handler treats an empty
initial phase as non-cancellable, causing newly created ImageBuilds (created by
createBuild with no status) to return 409 until the controller reconciles;
update the switch on build.Status.Phase (currently comparing against
phasePending, phaseUploading, phaseBuilding) to also allow the zero/empty phase
("" or whatever the controller treats as initial) as cancellable so new builds
can be cancelled before the first reconcile; reference build.Status.Phase,
phasePending, phaseUploading, phaseBuilding and createBuild when locating the
logic to change.
In `@internal/controller/imagebuild/controller.go`:
- Around line 481-485: The TTL parsing currently returns
time.ParseDuration(ttlStr) and treats "0" specially but doesn't reject negative
durations; update the TTL parsing function that reads ttlStr to parse the
duration (using time.ParseDuration), then if the parsed duration is negative
return an error (same error semantics used by the API) so negative TTLs from
ImageBuild/OperatorConfig are rejected before being applied (i.e., check d < 0
and return a descriptive error instead of returning the negative duration).
---
Duplicate comments:
In `@internal/buildapi/server.go`:
- Around line 1598-1617: The current logic returns early for requestedTTL == ""
or "0" and ignores parse errors for operatorCfg.Spec.OSBuilds.GetMaxBuildTTL(),
which allows bypassing maxBuildTTL and silently disables clamping; change the
flow so you always load operator config (call loadOperatorConfigFn with ctx,
k8sClient, namespace) before deciding to return for a no-expire "0" or blank
TTL, and enforce the operator max when present: if
operatorCfg.Spec.OSBuilds.GetMaxBuildTTL() yields a non-empty/non-"0" maxStr,
parse it (using time.ParseDuration) and if parsing fails return an error (do not
ignore parseErr), and if requestedTTL == "0" or parsed dur > maxDur return
maxStr; otherwise proceed with the parsed requestedTTL logic and keep the
existing negative-duration check—use the symbols requestedTTL,
loadOperatorConfigFn, operatorCfg.Spec.OSBuilds.GetMaxBuildTTL(),
time.ParseDuration to locate the code to change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: b80bdf1f-679a-4a4a-8a5d-2e86b60e2d2b
⛔ Files ignored due to path filters (2)
config/crd/bases/automotive.sdv.cloud.redhat.com_imagebuilds.yamlis excluded by!config/crd/bases/**config/crd/bases/automotive.sdv.cloud.redhat.com_operatorconfigs.yamlis excluded by!config/crd/bases/**
📒 Files selected for processing (19)
api/v1alpha1/imagebuild_types.goapi/v1alpha1/operatorconfig_types.goapi/v1alpha1/zz_generated.deepcopy.gocmd/caib/buildcmd/build.gocmd/caib/buildcmd/build_disk_test.gocmd/caib/buildcmd/logs.gocmd/caib/image/image.gocmd/caib/main.gocmd/caib/runtime_wiring.godocs/openapi.yamlinternal/buildapi/client/client.gointernal/buildapi/container_builds.gointernal/buildapi/openapi.yamlinternal/buildapi/server.gointernal/buildapi/server_test.gointernal/buildapi/ttl_test.gointernal/buildapi/types.gointernal/controller/imagebuild/controller.gointernal/controller/imagebuild/expiry_test.go
✅ Files skipped from review due to trivial changes (2)
- cmd/caib/main.go
- internal/buildapi/openapi.yaml
🚧 Files skipped from review as they are similar to previous changes (5)
- cmd/caib/buildcmd/build_disk_test.go
- internal/buildapi/types.go
- api/v1alpha1/zz_generated.deepcopy.go
- internal/buildapi/ttl_test.go
- api/v1alpha1/operatorconfig_types.go
09a75a9 to
7b8d7b8
Compare
There was a problem hiding this comment.
♻️ Duplicate comments (1)
internal/buildapi/server.go (1)
1614-1617:⚠️ Potential issue | 🟠 MajorFail closed on malformed
maxBuildTTL.Line 1615 silently ignores
time.ParseDuration(maxStr)errors, so a bad OperatorConfig value disables the max-TTL policy for user requests. Return the parse error instead, and reject non-positive max values unless the literal"0"no-limit sentinel was used.Proposed fix
if operatorCfg != nil && operatorCfg.Spec.OSBuilds != nil { if maxStr := operatorCfg.Spec.OSBuilds.GetMaxBuildTTL(); maxStr != "" && maxStr != "0" { - if maxDur, parseErr := time.ParseDuration(maxStr); parseErr == nil && dur > maxDur { + maxDur, parseErr := time.ParseDuration(maxStr) + if parseErr != nil { + return "", fmt.Errorf("invalid maxBuildTTL %q: %w", maxStr, parseErr) + } + if maxDur <= 0 { + return "", fmt.Errorf("maxBuildTTL must be positive or \"0\"") + } + if dur > maxDur { return maxStr, nil } } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/buildapi/server.go` around lines 1614 - 1617, The code currently swallows errors from time.ParseDuration when reading operatorCfg.Spec.OSBuilds.GetMaxBuildTTL(); update the logic in the block that checks GetMaxBuildTTL() so that parseErr is returned instead of ignored, and validate the parsed duration (maxDur) is positive — treat non-positive durations as an error unless the literal "0" sentinel was used to indicate no limit; specifically modify the branch that calls time.ParseDuration(maxStr) (the GetMaxBuildTTL check) to return parseErr on parse failure and to reject maxDur <= 0 with an appropriate error.
🧹 Nitpick comments (4)
internal/controller/imagebuild/expiry_test.go (4)
53-70: Consider asserting on the returnedctrl.Resultfor expired paths.
TestCheckExpiry_ExpiredBuildandTestCheckExpiry_FailedBuildExpiresdiscard theresultvalue. PercheckExpiry(controller.go:401-460), the expired path returns an emptyctrl.Result{}— worth assertingresult.RequeueAfter == 0so a future regression that accidentally requeues a deleted build is caught.Also applies to: 139-150
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/imagebuild/expiry_test.go` around lines 53 - 70, Update the tests TestCheckExpiry_ExpiredBuild and TestCheckExpiry_FailedBuildExpires to also assert the returned ctrl.Result from checkExpiry: capture the first return value (result) and assert result.RequeueAfter == 0 (and/or result == ctrl.Result{}), referencing the checkExpiry call in the tests so they explicitly verify the expired path returns an empty/no-requeue result; keep the existing expired boolean and deletion assertions intact.
125-137: Also assert the in-progress build is actually deleted.This test verifies the
expiredflag but not thatDeleteactually removed the object, unlikeTestCheckExpiry_ExpiredBuild. Adding a follow-upr.Get(...)and expecting NotFound would close the loop and protect against a regression where the in-progress branch resolves expiry but skips deletion.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/imagebuild/expiry_test.go` around lines 125 - 137, After asserting expired in TestCheckExpiry_InProgressUsesCreationTimestamp (which calls r.checkExpiry), add a follow-up retrieval using the reconciler's client (r.Get) for the same ImageBuild object and assert that the call returns a NotFound error (use apierrors.IsNotFound) to verify the object was actually deleted; this mirrors TestCheckExpiry_ExpiredBuild and prevents regression where expiry is reported but Delete is skipped.
31-51: Confusing parameter semantics innewTestImageBuild.
completedAgois used to compute bothCreationTimestamp(now - completedAgo - 1h) andCompletionTime(now - completedAgo), which couples two unrelated anchors and makes the test fixtures hard to reason about. It also makes thephaseBuildingcase inTestCheckExpiry_InProgressUsesCreationTimestamp(Line 127) immediately overwriteCreationTimestamp, indicating the helper doesn't fit that case well.Consider splitting into two helpers (e.g.
newCompletedBuild(name, ttl, completedAgo)andnewInProgressBuild(name, ttl, createdAgo)) or accepting both anchors explicitly.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/imagebuild/expiry_test.go` around lines 31 - 51, The helper newTestImageBuild conflates creation and completion anchors by using completedAgo for both CreationTimestamp and CompletionTime, making fixtures confusing and unsuitable for in-progress cases; split it into two focused helpers—e.g. newCompletedBuild(name, ttl, completedAgo) that sets CreationTimestamp relative to completedAgo minus a small delta and sets CompletionTime to now - completedAgo, and newInProgressBuild(name, ttl, createdAgo) that sets CreationTimestamp to now - createdAgo and leaves CompletionTime nil (or accept both createdAgo and completedAgo explicitly if needed); update tests such as TestCheckExpiry_InProgressUsesCreationTimestamp to use the new helpers (replace usages of newTestImageBuild) so creation and completion times are set independently.
178-295: Optional: deduplicate the fourresolveEffectiveTTLtests.The four
TestResolveEffectiveTTL_*cases share nearly identical scaffolding (scheme + reconciler + optionalOperatorConfig); a small table-driven test would cut ~80 lines and make the precedence rules (spec → OperatorConfig → hardcoded → "0" disables) easier to read at a glance.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/imagebuild/expiry_test.go` around lines 178 - 295, The tests TestResolveEffectiveTTL_SpecOverride, TestResolveEffectiveTTL_OperatorConfigDefault, TestResolveEffectiveTTL_HardcodedFallback, and TestResolveEffectiveTTL_ZeroDisables/OperatorConfigZeroDisables duplicate scaffolding; refactor them into a single table-driven test that iterates cases describing input IB TTL, optional OperatorConfig.DefaultBuildTTL, and expected time.Duration, calling r.resolveEffectiveTTL for each. Reuse newTestSchemeWithTekton, newTestImageBuild and ImageBuildReconciler setup in a loop (create the fake client with or without the OperatorConfig per case) and assert ttl/err per table row to capture precedence (spec → OperatorConfig → hardcoded → "0" disables) while removing the four near-identical test functions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@internal/buildapi/server.go`:
- Around line 1614-1617: The code currently swallows errors from
time.ParseDuration when reading operatorCfg.Spec.OSBuilds.GetMaxBuildTTL();
update the logic in the block that checks GetMaxBuildTTL() so that parseErr is
returned instead of ignored, and validate the parsed duration (maxDur) is
positive — treat non-positive durations as an error unless the literal "0"
sentinel was used to indicate no limit; specifically modify the branch that
calls time.ParseDuration(maxStr) (the GetMaxBuildTTL check) to return parseErr
on parse failure and to reject maxDur <= 0 with an appropriate error.
---
Nitpick comments:
In `@internal/controller/imagebuild/expiry_test.go`:
- Around line 53-70: Update the tests TestCheckExpiry_ExpiredBuild and
TestCheckExpiry_FailedBuildExpires to also assert the returned ctrl.Result from
checkExpiry: capture the first return value (result) and assert
result.RequeueAfter == 0 (and/or result == ctrl.Result{}), referencing the
checkExpiry call in the tests so they explicitly verify the expired path returns
an empty/no-requeue result; keep the existing expired boolean and deletion
assertions intact.
- Around line 125-137: After asserting expired in
TestCheckExpiry_InProgressUsesCreationTimestamp (which calls r.checkExpiry), add
a follow-up retrieval using the reconciler's client (r.Get) for the same
ImageBuild object and assert that the call returns a NotFound error (use
apierrors.IsNotFound) to verify the object was actually deleted; this mirrors
TestCheckExpiry_ExpiredBuild and prevents regression where expiry is reported
but Delete is skipped.
- Around line 31-51: The helper newTestImageBuild conflates creation and
completion anchors by using completedAgo for both CreationTimestamp and
CompletionTime, making fixtures confusing and unsuitable for in-progress cases;
split it into two focused helpers—e.g. newCompletedBuild(name, ttl,
completedAgo) that sets CreationTimestamp relative to completedAgo minus a small
delta and sets CompletionTime to now - completedAgo, and
newInProgressBuild(name, ttl, createdAgo) that sets CreationTimestamp to now -
createdAgo and leaves CompletionTime nil (or accept both createdAgo and
completedAgo explicitly if needed); update tests such as
TestCheckExpiry_InProgressUsesCreationTimestamp to use the new helpers (replace
usages of newTestImageBuild) so creation and completion times are set
independently.
- Around line 178-295: The tests TestResolveEffectiveTTL_SpecOverride,
TestResolveEffectiveTTL_OperatorConfigDefault,
TestResolveEffectiveTTL_HardcodedFallback, and
TestResolveEffectiveTTL_ZeroDisables/OperatorConfigZeroDisables duplicate
scaffolding; refactor them into a single table-driven test that iterates cases
describing input IB TTL, optional OperatorConfig.DefaultBuildTTL, and expected
time.Duration, calling r.resolveEffectiveTTL for each. Reuse
newTestSchemeWithTekton, newTestImageBuild and ImageBuildReconciler setup in a
loop (create the fake client with or without the OperatorConfig per case) and
assert ttl/err per table row to capture precedence (spec → OperatorConfig →
hardcoded → "0" disables) while removing the four near-identical test functions.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 41cb26f6-68f6-42a8-b076-1937e74aba07
⛔ Files ignored due to path filters (2)
config/crd/bases/automotive.sdv.cloud.redhat.com_imagebuilds.yamlis excluded by!config/crd/bases/**config/crd/bases/automotive.sdv.cloud.redhat.com_operatorconfigs.yamlis excluded by!config/crd/bases/**
📒 Files selected for processing (15)
api/v1alpha1/imagebuild_types.goapi/v1alpha1/operatorconfig_types.goapi/v1alpha1/zz_generated.deepcopy.gocmd/caib/buildcmd/build.gocmd/caib/buildcmd/build_disk_test.gocmd/caib/image/image.gocmd/caib/main.gocmd/caib/runtime_wiring.godocs/openapi.yamlinternal/buildapi/openapi.yamlinternal/buildapi/server.gointernal/buildapi/ttl_test.gointernal/buildapi/types.gointernal/controller/imagebuild/controller.gointernal/controller/imagebuild/expiry_test.go
✅ Files skipped from review due to trivial changes (2)
- cmd/caib/main.go
- internal/buildapi/ttl_test.go
🚧 Files skipped from review as they are similar to previous changes (6)
- docs/openapi.yaml
- cmd/caib/buildcmd/build_disk_test.go
- internal/buildapi/openapi.yaml
- cmd/caib/buildcmd/build.go
- api/v1alpha1/operatorconfig_types.go
- internal/controller/imagebuild/controller.go
7b8d7b8 to
4ddd68b
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/buildapi/server.go (1)
2101-2136:⚠️ Potential issue | 🟠 MajorValidate TTL before starting extra-repo side effects.
resolveExtraReposcan start HTTP servers in workspace pods before lines 2132-2136 reject an invalid or over-limit TTL. Move TTL validation ahead ofresolveExtraReposso bad requests fail without mutating cluster/workspace state.Proposed refactor
if err := applyBuildDefaults(&req); err != nil { c.JSON(http.StatusBadRequest, gin.H{"error": err.Error()}) return } + k8sClient, err := getClientFromRequest(c) + if err != nil { + c.JSON(http.StatusInternalServerError, gin.H{"error": fmt.Sprintf("k8s client error: %v", err)}) + return + } + + ctx := c.Request.Context() + namespace := resolveNamespace() + requestedBy := a.resolveRequester(c) + + effectiveTTL, ttlErr := resolveAndClampTTL(ctx, k8sClient, namespace, req.TTL) + if ttlErr != nil { + c.JSON(http.StatusBadRequest, gin.H{"error": ttlErr.Error()}) + return + } + // Resolve --extra-repo workspace:path pairs into extra_repos custom defines if len(req.ExtraRepos) > 0 { - k8sClientForRepos, err := getClientFromRequest(c) - if err != nil { - c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to create kubernetes client"}) - return - } restCfgForRepos, err := getRESTConfigFromRequest(c) if err != nil { c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to get kubernetes config"}) return } - if err := a.resolveExtraRepos(c.Request.Context(), k8sClientForRepos, restCfgForRepos, &req); err != nil { + if err := a.resolveExtraRepos(ctx, k8sClient, restCfgForRepos, &req); err != nil { c.JSON(http.StatusBadRequest, gin.H{"error": err.Error()}) return } } // Append a short random suffix to ensure unique names for parallel builds req.Name = fmt.Sprintf("%s-%s", req.Name, uuid.New().String()[:5]) - - k8sClient, err := getClientFromRequest(c) - if err != nil { - c.JSON(http.StatusInternalServerError, gin.H{"error": fmt.Sprintf("k8s client error: %v", err)}) - return - } - - ctx := c.Request.Context() - namespace := resolveNamespace() - requestedBy := a.resolveRequester(c) - - effectiveTTL, ttlErr := resolveAndClampTTL(ctx, k8sClient, namespace, req.TTL) - if ttlErr != nil { - c.JSON(http.StatusBadRequest, gin.H{"error": ttlErr.Error()}) - return - }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/buildapi/server.go` around lines 2101 - 2136, The TTL is validated too late — move the resolveAndClampTTL call (and its error handling) to occur before calling a.resolveExtraRepos so invalid/over-limit TTLs fail without mutating cluster state; obtain the k8s client/REST config once (reuse getClientFromRequest/getRESTConfigFromRequest results) or get a lightweight client for TTL validation if possible, then run effectiveTTL, ttlErr := resolveAndClampTTL(...) and return on ttlErr before invoking a.resolveExtraRepos(c.Request.Context(), k8sClientForRepos, restCfgForRepos, &req); keep the UUID suffix and downstream logic unchanged.
🧹 Nitpick comments (1)
api/v1alpha1/zz_generated.deepcopy.go (1)
676-679: Keep generated deepcopy artifacts out of the PR if this repo policy still applies.The
ExpiresAtpointer is deep-copied correctly, but this generated file should be omitted if the source type changes are regenerated during deploy/build as documented for this repo. Based on learnings, Generated files that are not committed should be ignored in PRs. Specifically, do not include api/v1alpha1/zz_generated.deepcopy.go.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/v1alpha1/zz_generated.deepcopy.go` around lines 676 - 679, This autogenerated deepcopy file (api/v1alpha1/zz_generated.deepcopy.go) containing DeepCopy logic for types like the ExpiresAt pointer was committed but must be omitted per repo policy; remove zz_generated.deepcopy.go from the PR (revert the commit that added/modified it), add the generated path (api/v1alpha1/zz_generated.deepcopy.go) to the repo ignore rules (e.g., .gitignore) or the project’s generated-files exclusion list, and ensure the canonical codegen step (the DeepCopy generator used by your build or make/generate script) is documented/run during CI/build so DeepCopy methods are regenerated rather than checked in.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/buildapi/server.go`:
- Around line 1597-1628: resolveAndClampTTL currently returns early when
requestedTTL == "0", allowing no-expiry even if
OperatorConfig.spec.osBuilds.maxBuildTTL is set; change the logic in
resolveAndClampTTL to defer handling of the "0" (no-expiry) case until after
loading OperatorConfig via loadOperatorConfigFn, and if
operatorCfg.Spec.OSBuilds.GetMaxBuildTTL() yields a non-empty, non-"0" value
then treat requestedTTL "0" as disallowed (return an error stating no-expiry
exceeds the configured max); keep the existing duration parsing/validation for
non-zero values and retain the existing error messages for invalid or negative
durations.
In `@internal/controller/imagebuild/controller.go`:
- Around line 467-496: The current logic trusts imageBuild.Spec.GetTTL() without
applying the operator's MaxBuildTTL; load the OperatorConfig (same way as the
existing branch that handles empty ttlStr) even when ttlStr != "" to obtain
operatorConfig.Spec.OSBuilds.GetMaxBuildTTL()/MaxBuildTTL and enforce it: parse
ttlStr into ttl, parse operator max (if set) into maxTTL, then if ttl > maxTTL
set ttl = maxTTL (or return an error if your project's policy prefers); preserve
existing handling for "0" (return 0) and negative values (error), and reuse the
same variables/operatorConfig lookup used for DefaultBuildTTL to avoid
duplication.
---
Outside diff comments:
In `@internal/buildapi/server.go`:
- Around line 2101-2136: The TTL is validated too late — move the
resolveAndClampTTL call (and its error handling) to occur before calling
a.resolveExtraRepos so invalid/over-limit TTLs fail without mutating cluster
state; obtain the k8s client/REST config once (reuse
getClientFromRequest/getRESTConfigFromRequest results) or get a lightweight
client for TTL validation if possible, then run effectiveTTL, ttlErr :=
resolveAndClampTTL(...) and return on ttlErr before invoking
a.resolveExtraRepos(c.Request.Context(), k8sClientForRepos, restCfgForRepos,
&req); keep the UUID suffix and downstream logic unchanged.
---
Nitpick comments:
In `@api/v1alpha1/zz_generated.deepcopy.go`:
- Around line 676-679: This autogenerated deepcopy file
(api/v1alpha1/zz_generated.deepcopy.go) containing DeepCopy logic for types like
the ExpiresAt pointer was committed but must be omitted per repo policy; remove
zz_generated.deepcopy.go from the PR (revert the commit that added/modified it),
add the generated path (api/v1alpha1/zz_generated.deepcopy.go) to the repo
ignore rules (e.g., .gitignore) or the project’s generated-files exclusion list,
and ensure the canonical codegen step (the DeepCopy generator used by your build
or make/generate script) is documented/run during CI/build so DeepCopy methods
are regenerated rather than checked in.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: dc090379-5008-4b18-9ddb-38ca24786a66
⛔ Files ignored due to path filters (2)
config/crd/bases/automotive.sdv.cloud.redhat.com_imagebuilds.yamlis excluded by!config/crd/bases/**config/crd/bases/automotive.sdv.cloud.redhat.com_operatorconfigs.yamlis excluded by!config/crd/bases/**
📒 Files selected for processing (15)
api/v1alpha1/imagebuild_types.goapi/v1alpha1/operatorconfig_types.goapi/v1alpha1/zz_generated.deepcopy.gocmd/caib/buildcmd/build.gocmd/caib/buildcmd/build_disk_test.gocmd/caib/image/image.gocmd/caib/main.gocmd/caib/runtime_wiring.godocs/openapi.yamlinternal/buildapi/openapi.yamlinternal/buildapi/server.gointernal/buildapi/ttl_test.gointernal/buildapi/types.gointernal/controller/imagebuild/controller.gointernal/controller/imagebuild/expiry_test.go
✅ Files skipped from review due to trivial changes (6)
- cmd/caib/buildcmd/build_disk_test.go
- cmd/caib/main.go
- internal/buildapi/openapi.yaml
- docs/openapi.yaml
- api/v1alpha1/imagebuild_types.go
- internal/controller/imagebuild/expiry_test.go
1ca7117 to
1a75a8b
Compare
Builds will be expired by default, removing their ImageBuild and clearing up resources Signed-off-by: Benny Zlotnik <bzlotnik@redhat.com> Assisted-by: claude-opus-4.6
1a75a8b to
13902aa
Compare
Builds will be expired by default, removing their ImageBuild and clearing up resources
Summary
Related Issues
Type of Change
Testing
make test)make lint)make manifests generate)Summary by CodeRabbit
New Features
Configuration
Tests