Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
20 changes: 20 additions & 0 deletions api/v1alpha1/operatorconfig_types.go
Original file line number Diff line number Diff line change
Expand Up @@ -56,6 +56,9 @@ const (

// DefaultWorkspacePVCSize is the default PVC size for workspace storage
DefaultWorkspacePVCSize = "10Gi"

// DefaultAutoPauseTimeoutMinutes is the default idle timeout in minutes before a workspace is auto-paused
DefaultAutoPauseTimeoutMinutes int32 = 30
)

// ImagesConfig defines container image references used by the operator
Expand Down Expand Up @@ -351,6 +354,14 @@ type WorkspacesConfig struct {
// BuildCacheSize is the size of the PVC created for build cache persistence (default: "20Gi")
// +optional
BuildCacheSize string `json:"buildCacheSize,omitempty"`

// AutoPauseTimeoutMinutes is the cluster-wide default idle timeout in minutes before
// a workspace is automatically paused. Overridden per-workspace via spec.autoPauseTimeoutMinutes.
// Must be > 0. To disable auto-pause for a specific workspace, set its
// spec.autoPauseTimeoutMinutes to 0.
// Default: 30
// +optional
AutoPauseTimeoutMinutes int32 `json:"autoPauseTimeoutMinutes,omitempty"`
}

// GetToolchainImage returns the toolchain image, falling back to the default
Expand Down Expand Up @@ -406,6 +417,15 @@ func (c *WorkspacesConfig) GetTolerations() []corev1.Toleration {
return nil
}

// GetAutoPauseTimeoutMinutes returns the global auto-pause timeout in minutes.
// Returns DefaultAutoPauseTimeoutMinutes (30) when not configured.
func (c *WorkspacesConfig) GetAutoPauseTimeoutMinutes() int32 {
if c != nil && c.AutoPauseTimeoutMinutes > 0 {
return c.AutoPauseTimeoutMinutes
}
return DefaultAutoPauseTimeoutMinutes
}
Comment on lines +420 to +427

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Potential semantic inconsistency: 0 should disable auto-pause, not fall back to default.

The workspace spec documents that 0 = disable auto-pause, but this accessor returns DefaultAutoPauseTimeoutMinutes when the value is <= 0. This means setting WorkspacesConfig.AutoPauseTimeoutMinutes = 0 at the cluster level won't actually disable auto-pause globally.

If 0 should disable auto-pause cluster-wide (consistent with per-workspace behavior), consider:

Suggested fix
 func (c *WorkspacesConfig) GetAutoPauseTimeoutMinutes() int32 {
-	if c != nil && c.AutoPauseTimeoutMinutes > 0 {
+	if c != nil && c.AutoPauseTimeoutMinutes != 0 {
 		return c.AutoPauseTimeoutMinutes
 	}
 	return DefaultAutoPauseTimeoutMinutes
 }

Alternatively, if 0 at the cluster level should not disable but only per-workspace 0 should disable, document this distinction clearly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/v1alpha1/operatorconfig_types.go` around lines 418 - 425, The accessor
GetAutoPauseTimeoutMinutes currently treats any non-positive
AutoPauseTimeoutMinutes as missing and returns DefaultAutoPauseTimeoutMinutes,
which prevents an explicit 0 from disabling auto-pause; update
GetAutoPauseTimeoutMinutes (in WorkspacesConfig) to first check for a non-nil
receiver and return c.AutoPauseTimeoutMinutes when it is explicitly set
(including 0 to indicate disabled), and only return
DefaultAutoPauseTimeoutMinutes when the config is nil or the field is unset
(e.g., negative/absent) so that 0 properly disables auto-pause cluster-wide.


// OperatorConfigSpec defines the desired state of OperatorConfig
type OperatorConfigSpec struct {
// OSBuilds defines the configuration for OS build operations
Expand Down
13 changes: 13 additions & 0 deletions api/v1alpha1/workspace_types.go
Original file line number Diff line number Diff line change
Expand Up @@ -64,6 +64,14 @@ type WorkspaceSpec struct {
// When true, the controller deletes the pod but preserves the PVC.
// +optional
Stopped bool `json:"stopped,omitempty"`

// AutoPauseTimeoutMinutes overrides the global auto-pause timeout for this workspace.
// nil = use global default from OperatorConfig (default: 30 minutes)
// 0 = disable auto-pause for this workspace
// >0 = custom timeout in minutes
// +kubebuilder:validation:Minimum=0
// +optional
AutoPauseTimeoutMinutes *int32 `json:"autoPauseTimeoutMinutes,omitempty"`
Comment thread
coderabbitai[bot] marked this conversation as resolved.
}

// WorkspaceStatus defines the observed state of a Workspace.
Expand All @@ -85,6 +93,11 @@ type WorkspaceStatus struct {
// Created lazily on first build referencing this workspace.
// +optional
BuildCachePVCName string `json:"buildCachePVCName,omitempty"`

// LastActivityTime is the last time activity was detected in the workspace pod.
// Used by the auto-pause controller to determine idle duration.
// +optional
LastActivityTime *metav1.Time `json:"lastActivityTime,omitempty"`
}

// +kubebuilder:object:root=true
Expand Down
11 changes: 10 additions & 1 deletion api/v1alpha1/zz_generated.deepcopy.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

15 changes: 15 additions & 0 deletions cmd/caib/workspace/workspace.go
Original file line number Diff line number Diff line change
Expand Up @@ -50,6 +50,9 @@ var (
// wait flag (shared by create and start)
waitForRunningFlag bool

// auto-pause flag
autoPauseTimeout int

// deploy flags
artifactMappings []string
)
Expand Down Expand Up @@ -126,6 +129,7 @@ Examples:
cmd.Flags().StringVar(&cpuRequest, "cpu", "", "CPU request/limit (e.g., \"1\", \"500m\")")
cmd.Flags().StringVar(&memoryRequest, "memory", "", "memory request/limit (e.g., \"2Gi\", \"512Mi\")")
cmd.Flags().BoolVar(&tmpfsBuildDir, "tmpfs", false, "mount a tmpfs volume at /tmp/build for faster compilation (uses RAM)")
cmd.Flags().IntVar(&autoPauseTimeout, "auto-pause-timeout", -1, "auto-pause timeout in minutes (0=disable, -1=use global default)")
cmd.Flags().BoolVarP(&waitForRunningFlag, "wait", "w", true, "wait for workspace to be running")

return cmd
Expand Down Expand Up @@ -290,6 +294,13 @@ func runCreate(_ *cobra.Command, args []string) {
Memory: memoryRequest,
TmpfsBuildDir: tmpfsBuildDir,
}
if autoPauseTimeout < -1 {
handleError(fmt.Errorf("--auto-pause-timeout must be >= -1"))
}
if autoPauseTimeout >= 0 {
v := int32(autoPauseTimeout)
req.AutoPauseTimeoutMinutes = &v
}
Comment thread
coderabbitai[bot] marked this conversation as resolved.

var resp *buildapitypes.WorkspaceResponse
err = caibcommon.ExecuteWithReauth(serverURL, &authToken, insecureSkipTLS, func(client *buildapiclient.Client) error {
Expand Down Expand Up @@ -381,6 +392,10 @@ func runShow(_ *cobra.Command, args []string) {
if ws.Age != "" {
fmt.Printf("Age: %s\n", ws.Age)
}
fmt.Printf("Auto-pause: %s\n", ws.AutoPauseTimeout)
if ws.LastActivity != "" {
fmt.Printf("Last active: %s\n", ws.LastActivity)
}
}

func runDelete(_ *cobra.Command, args []string) {
Expand Down
7 changes: 4 additions & 3 deletions cmd/main.go
Original file line number Diff line number Diff line change
Expand Up @@ -281,9 +281,10 @@ func main() {
}

workspaceReconciler := &workspace.Reconciler{
Client: mgr.GetClient(),
Scheme: mgr.GetScheme(),
Log: ctrl.Log.WithName("controllers").WithName("Workspace"),
Client: mgr.GetClient(),
Scheme: mgr.GetScheme(),
Log: ctrl.Log.WithName("controllers").WithName("Workspace"),
RestConfig: mgr.GetConfig(),
}
if err = workspaceReconciler.SetupWithManager(mgr); err != nil {
setupLog.Error(err, "unable to create controller", "controller", "Workspace")
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -823,6 +823,15 @@ spec:
workspaces:
description: Workspaces defines configuration for developer workspaces
properties:
autoPauseTimeoutMinutes:
description: |-
AutoPauseTimeoutMinutes is the cluster-wide default idle timeout in minutes before
a workspace is automatically paused. Overridden per-workspace via spec.autoPauseTimeoutMinutes.
Must be > 0. To disable auto-pause for a specific workspace, set its
spec.autoPauseTimeoutMinutes to 0.
Default: 30
format: int32
type: integer
buildCacheSize:
description: 'BuildCacheSize is the size of the PVC created for
build cache persistence (default: "20Gi")'
Expand Down
15 changes: 15 additions & 0 deletions config/crd/bases/automotive.sdv.cloud.redhat.com_workspaces.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -59,6 +59,15 @@ spec:
description: Architecture is the target architecture (e.g., "arm64",
"amd64")
type: string
autoPauseTimeoutMinutes:
description: |-
AutoPauseTimeoutMinutes overrides the global auto-pause timeout for this workspace.
nil = use global default from OperatorConfig (default: 30 minutes)
0 = disable auto-pause for this workspace
>0 = custom timeout in minutes
format: int32
minimum: 0
type: integer
clientConfigSecretRef:
description: |-
ClientConfigSecretRef is the name of the Secret containing the Jumpstarter client config
Expand Down Expand Up @@ -169,6 +178,12 @@ spec:
BuildCachePVCName is the name of the PVC used for build cache storage.
Created lazily on first build referencing this workspace.
type: string
lastActivityTime:
description: |-
LastActivityTime is the last time activity was detected in the workspace pod.
Used by the auto-pause controller to determine idle duration.
format: date-time
type: string
message:
description: Message provides additional detail about the current
phase
Expand Down
110 changes: 79 additions & 31 deletions internal/buildapi/workspace.go
Original file line number Diff line number Diff line change
Expand Up @@ -37,25 +37,28 @@ func shellQuote(s string) string {

// WorkspaceRequest is the payload to create a workspace.
type WorkspaceRequest struct {
Name string `json:"name"`
FromBuild string `json:"fromBuild,omitempty"` // ImageBuild name to extract lease from
Lease string `json:"lease,omitempty"` // Direct lease ID
Arch string `json:"architecture,omitempty"`
Image string `json:"toolchainImage,omitempty"`
ClientConfig string `json:"clientConfig,omitempty"` // Base64-encoded Jumpstarter client config
CPU string `json:"cpu,omitempty"` // CPU request (e.g., "1", "500m")
Memory string `json:"memory,omitempty"` // Memory request (e.g., "2Gi", "512Mi")
TmpfsBuildDir bool `json:"tmpfsBuildDir,omitempty"` // Mount tmpfs at /tmp/build for fast compilation
Name string `json:"name"`
FromBuild string `json:"fromBuild,omitempty"` // ImageBuild name to extract lease from
Lease string `json:"lease,omitempty"` // Direct lease ID
Arch string `json:"architecture,omitempty"`
Image string `json:"toolchainImage,omitempty"`
ClientConfig string `json:"clientConfig,omitempty"` // Base64-encoded Jumpstarter client config
CPU string `json:"cpu,omitempty"` // CPU request (e.g., "1", "500m")
Memory string `json:"memory,omitempty"` // Memory request (e.g., "2Gi", "512Mi")
TmpfsBuildDir bool `json:"tmpfsBuildDir,omitempty"` // Mount tmpfs at /tmp/build for fast compilation
AutoPauseTimeoutMinutes *int32 `json:"autoPauseTimeoutMinutes,omitempty"` // nil = use global default, 0 = disable
}

// WorkspaceResponse is returned by workspace operations.
type WorkspaceResponse struct {
Name string `json:"name"`
Phase string `json:"phase"`
Lease string `json:"lease,omitempty"`
Arch string `json:"architecture"`
PodName string `json:"podName,omitempty"`
Age string `json:"age,omitempty"`
Name string `json:"name"`
Phase string `json:"phase"`
Lease string `json:"lease,omitempty"`
Arch string `json:"architecture"`
PodName string `json:"podName,omitempty"`
Age string `json:"age,omitempty"`
AutoPauseTimeout string `json:"autoPauseTimeout,omitempty"` // e.g., "30m", "disabled"
LastActivity string `json:"lastActivity,omitempty"` // e.g., "2m ago", "just now"
}

// WorkspaceExecRequest is the payload to execute a command in a workspace.
Expand Down Expand Up @@ -283,16 +286,17 @@ func (a *APIServer) createWorkspace(c *gin.Context) {
Namespace: namespace,
},
Spec: automotivev1alpha1.WorkspaceSpec{
Architecture: arch,
Image: image,
LeaseID: leaseID,
Owner: requester,
ClientConfigSecretRef: jmpClientSecret,
PVCSize: pvcSize,
Resources: resources,
StorageClass: wsConfig.GetStorageClass(),
NodeSelector: wsConfig.GetNodeSelector(),
TmpfsBuildDir: req.TmpfsBuildDir,
Architecture: arch,
Image: image,
LeaseID: leaseID,
Owner: requester,
ClientConfigSecretRef: jmpClientSecret,
PVCSize: pvcSize,
Resources: resources,
StorageClass: wsConfig.GetStorageClass(),
NodeSelector: wsConfig.GetNodeSelector(),
TmpfsBuildDir: req.TmpfsBuildDir,
AutoPauseTimeoutMinutes: req.AutoPauseTimeoutMinutes,
},
}
if err := k8sClient.Create(c.Request.Context(), ws); err != nil {
Expand Down Expand Up @@ -459,6 +463,24 @@ func (a *APIServer) getOwnedWorkspace(c *gin.Context, name string) (*automotivev
return ws, nil
}

// touchWorkspaceActivity updates LastActivityTime on the workspace status.
// Called from handlers that represent actual workspace usage (exec, shell, sync, deploy)
// so the auto-pause controller knows the workspace is in use.
func (a *APIServer) touchWorkspaceActivity(c *gin.Context, ws *automotivev1alpha1.Workspace) {
if ws.Spec.Stopped {
return
}
k8sClient, err := getClientFromRequest(c)
if err != nil {
return // best-effort, don't fail the operation
}

now := metav1.Now()
patch := client.MergeFrom(ws.DeepCopy())
ws.Status.LastActivityTime = &now
_ = k8sClient.Status().Patch(c.Request.Context(), ws, patch)
}
Comment on lines +466 to +482

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Race condition between activity touch and auto-pause controller.

This function patches LastActivityTime using a stale ws object (fetched at handler entry). If called after the controller decides to auto-pause but before setStatus clears LastActivityTime, this patch may:

  1. Fail with a conflict error (silently ignored)
  2. Succeed and set activity time on a workspace that's already being stopped

While the best-effort approach is reasonable for activity tracking, consider re-fetching the workspace before patching, or at minimum checking that ws.Spec.Stopped is still false:

Suggested improvement
 func (a *APIServer) touchWorkspaceActivity(c *gin.Context, ws *automotivev1alpha1.Workspace) {
+	// Don't touch activity on workspaces being stopped
+	if ws.Spec.Stopped {
+		return
+	}
 	k8sClient, err := getClientFromRequest(c)
 	if err != nil {
 		return // best-effort, don't fail the operation
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/buildapi/workspace.go` around lines 466 - 479,
touchWorkspaceActivity currently patches ws.Status.LastActivityTime using the
original ws which can race with the auto-pause controller; change the function
to re-fetch the latest Workspace from the API (using k8sClient.Get) immediately
before creating the MergeFrom patch, check that the fresh object's Spec.Stopped
is false (abort if true), update the fresh object's Status.LastActivityTime and
then call k8sClient.Status().Patch with the MergeFrom of that fresh copy, and
handle/ignore conflict errors as best-effort (do not blindly patch the stale ws
or overwrite stopped state).


func (a *APIServer) syncWorkspace(c *gin.Context, name string) {
ws, err := a.getOwnedWorkspace(c, name)
if err != nil {
Expand All @@ -468,6 +490,7 @@ func (a *APIServer) syncWorkspace(c *gin.Context, name string) {
c.JSON(http.StatusConflict, gin.H{"error": fmt.Sprintf("workspace %q is not running (phase: %s)", name, ws.Status.Phase)})
return
}
a.touchWorkspaceActivity(c, ws)

namespace := ws.Namespace
podName := ws.Status.PodName
Expand Down Expand Up @@ -590,6 +613,7 @@ func (a *APIServer) execWorkspace(c *gin.Context, name string) {
c.JSON(http.StatusConflict, gin.H{"error": fmt.Sprintf("workspace %q is not running (phase: %s)", name, ws.Status.Phase)})
return
}
a.touchWorkspaceActivity(c, ws)

restCfg, err := getRESTConfigFromRequest(c)
if err != nil {
Expand Down Expand Up @@ -618,6 +642,7 @@ func (a *APIServer) shellWorkspace(c *gin.Context, name string) {
c.JSON(http.StatusConflict, gin.H{"error": fmt.Sprintf("workspace %q is not running (phase: %s)", name, ws.Status.Phase)})
return
}
a.touchWorkspaceActivity(c, ws)

namespace := ws.Namespace
podName := ws.Status.PodName
Expand Down Expand Up @@ -744,6 +769,7 @@ func (a *APIServer) deployWorkspace(c *gin.Context, name string) {
c.JSON(http.StatusConflict, gin.H{"error": fmt.Sprintf("workspace %q is not running (phase: %s)", name, ws.Status.Phase)})
return
}
a.touchWorkspaceActivity(c, ws)

if ws.Spec.LeaseID == "" {
c.JSON(http.StatusBadRequest, gin.H{"error": "no Jumpstarter lease associated with this workspace"})
Expand Down Expand Up @@ -914,13 +940,35 @@ func workspaceResponseFromCR(ws *automotivev1alpha1.Workspace) WorkspaceResponse
if !ws.CreationTimestamp.IsZero() {
age = time.Since(ws.CreationTimestamp.Time).Truncate(time.Second).String()
}
var autoPauseTimeout string
switch {
case ws.Spec.AutoPauseTimeoutMinutes == nil:
autoPauseTimeout = "default"
case *ws.Spec.AutoPauseTimeoutMinutes == 0:
autoPauseTimeout = "disabled"
default:
autoPauseTimeout = fmt.Sprintf("%dm", *ws.Spec.AutoPauseTimeoutMinutes)
Comment on lines +943 to +950

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don't render the compiled-in default as the effective timeout.

nil here means "inherit from OperatorConfig", but this branch always prints DefaultAutoPauseTimeoutMinutes. Any namespace with a custom global timeout — or a global disable — will return the wrong autoPauseTimeout in list/show responses.

Minimal safe fix
 	case ws.Spec.AutoPauseTimeoutMinutes == nil:
-		autoPauseTimeout = fmt.Sprintf("default (%dm)", automotivev1alpha1.DefaultAutoPauseTimeoutMinutes)
+		autoPauseTimeout = "default"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
var autoPauseTimeout string
switch {
case ws.Spec.AutoPauseTimeoutMinutes == nil:
autoPauseTimeout = fmt.Sprintf("default (%dm)", automotivev1alpha1.DefaultAutoPauseTimeoutMinutes)
case *ws.Spec.AutoPauseTimeoutMinutes == 0:
autoPauseTimeout = "disabled"
default:
autoPauseTimeout = fmt.Sprintf("%dm", *ws.Spec.AutoPauseTimeoutMinutes)
var autoPauseTimeout string
switch {
case ws.Spec.AutoPauseTimeoutMinutes == nil:
autoPauseTimeout = "default"
case *ws.Spec.AutoPauseTimeoutMinutes == 0:
autoPauseTimeout = "disabled"
default:
autoPauseTimeout = fmt.Sprintf("%dm", *ws.Spec.AutoPauseTimeoutMinutes)
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/buildapi/workspace.go` around lines 943 - 950, The code currently
prints the compiled DefaultAutoPauseTimeoutMinutes whenever
ws.Spec.AutoPauseTimeoutMinutes is nil, but nil means "inherit from
OperatorConfig"; update the nil branch to read the operator-level setting
instead: fetch the OperatorConfig's AutoPauseTimeoutMinutes (e.g. via your
reconciler/helper that exposes operatorConfig or a getOperatorConfig call), then
if that operator value is nil fall back to
automotivev1alpha1.DefaultAutoPauseTimeoutMinutes, if the operator value is 0
print "disabled", otherwise print the operator value with "%dm". Use the same
semantics for 0/ptr vs nil as with ws.Spec.AutoPauseTimeoutMinutes (reference
symbols: ws.Spec.AutoPauseTimeoutMinutes,
automotivev1alpha1.DefaultAutoPauseTimeoutMinutes, and
OperatorConfig.AutoPauseTimeoutMinutes or your reconciler method that returns
it).

}

lastActivity := ""
if ws.Status.LastActivityTime != nil {
elapsed := time.Since(ws.Status.LastActivityTime.Time)
if elapsed < time.Minute {
lastActivity = "just now"
} else {
lastActivity = elapsed.Truncate(time.Minute).String() + " ago"
}
}

return WorkspaceResponse{
Name: ws.Name,
Phase: phase,
Lease: ws.Spec.LeaseID,
Arch: ws.Spec.Architecture,
PodName: ws.Status.PodName,
Age: age,
Name: ws.Name,
Phase: phase,
Lease: ws.Spec.LeaseID,
Arch: ws.Spec.Architecture,
PodName: ws.Status.PodName,
Age: age,
AutoPauseTimeout: autoPauseTimeout,
LastActivity: lastActivity,
}
}

Expand Down
Loading
Loading