
t1165.2: Container pool manager for supervisor#2184

Merged
marcusquinn merged 1 commit into main from feature/t1165.2
Feb 23, 2026

Conversation


@marcusquinn marcusquinn commented Feb 23, 2026

Summary

  • Adds container pool manager module (container-pool.sh) to the supervisor for managing Docker/OrbStack container pools
  • Implements spawn/destroy lifecycle, health checks, round-robin dispatch selection, and per-container rate limit tracking
  • Integrates with dispatch.sh via opt-in SUPERVISOR_USE_CONTAINER_POOL=true environment variable

What's included

New files

  • .agents/scripts/supervisor/container-pool.sh — Pool manager module (849 lines)
  • tests/test-container-pool.sh — 45 unit tests covering all pool operations

Modified files

  • .agents/scripts/supervisor/database.sh — Schema migration for container_pool and container_dispatch_log tables
  • .agents/scripts/supervisor/dispatch.sh — Pool-aware dispatch hook (selects container via round-robin before dispatch)
  • .agents/scripts/supervisor-helper.sh — Module source + pool command routing

Features

  • Spawn/destroy: Docker containers with OAuth token injection (gopass or env var), volume mounts for agents
  • Health checks: Docker state verification + rate limit cooldown expiry detection
  • Round-robin dispatch: Selects healthy container with oldest last_dispatch_at timestamp
  • Rate limit tracking: Per-container cooldown with configurable duration, auto-triggered on rate_limited outcome
  • Idle cleanup: Destroys containers idle beyond timeout, respects CONTAINER_POOL_MIN
  • CLI: supervisor-helper.sh pool {spawn|destroy|list|status|health|select|rate-limit|cleanup|help}
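The round-robin rule above ("healthy container with the oldest last_dispatch_at wins, skipping rate-limited ones") can be sketched without Docker or SQLite. This is an illustrative stand-in, not the module's actual API: the array substitutes for the container_pool table and select_container is a hypothetical name.

```shell
#!/usr/bin/env bash
# Sketch of oldest-first round-robin selection. Each entry stands in
# for a container_pool row: "<container_id>|<status>|<last_dispatch_at>".
set -euo pipefail

pool=(
	"worker-a|healthy|2026-02-23T01:00:00Z"
	"worker-b|healthy|2026-02-22T23:55:00Z"
	"worker-c|rate_limited|2026-02-22T20:00:00Z"
)

select_container() {
	local best_id="" best_ts=""
	local entry id status ts
	for entry in "${pool[@]}"; do
		IFS='|' read -r id status ts <<<"$entry"
		[[ "$status" == "healthy" ]] || continue  # skip rate-limited/unhealthy
		# ISO-8601 timestamps sort lexicographically, so a string compare works
		if [[ -z "$best_ts" || "$ts" < "$best_ts" ]]; then
			best_id="$id" best_ts="$ts"
		fi
	done
	[[ -n "$best_id" ]] && printf '%s\n' "$best_id"
}

select_container  # worker-b: healthy, with the oldest dispatch timestamp
```

The real module expresses the same ordering in SQL (ORDER BY COALESCE(last_dispatch_at, …) ASC LIMIT 1), as the review excerpts below show.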

Quality

  • ShellCheck clean (zero violations)
  • 45/45 unit tests passing
  • All functions have explicit returns
  • local var="$1" pattern throughout
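The two conventions above (explicit returns and the local var="$1" pattern) can be illustrated with a minimal hypothetical helper, not taken from the module:

```shell
# Hypothetical helper showing both conventions: the positional argument
# is copied into a named local immediately, and every branch returns an
# explicit status so callers can rely on the exit code.
pool_is_known_status() {
	local status="$1"  # local var="$1" pattern

	case "$status" in
		healthy|unhealthy|rate_limited|stopped|failed)
			return 0
			;;
		*)
			return 1
			;;
	esac
}
```

Usage follows the normal shell idiom: `if pool_is_known_status "$s"; then …`.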

Ref #1764

Summary by CodeRabbit

Release Notes

  • New Features

    • Added container pool management functionality accessible via the pool command
    • Pool supports spawning and destroying containers with automatic lifecycle management
    • Built-in health checking and round-robin container selection for workload distribution
    • Rate limiting capabilities to control container usage
    • Pool introspection tools including status listing and statistics
  • Tests

    • Added comprehensive test coverage for pool management operations

Implements container pool management in the supervisor with:
- Spawn/destroy Docker containers with OAuth token injection
- Health checks (Docker state + rate limit cooldown expiry)
- Round-robin dispatch across healthy containers (oldest-first)
- Per-container rate limit tracking with configurable cooldown
- Idle container cleanup respecting pool minimum size
- CLI: supervisor-helper.sh pool {spawn|destroy|list|status|health|select|rate-limit|cleanup}
- DB schema: container_pool + container_dispatch_log tables
- Integration hook in dispatch.sh (opt-in via SUPERVISOR_USE_CONTAINER_POOL=true)
- 45/45 unit tests passing, ShellCheck clean
@gemini-code-assist

Summary of Changes

Hello @marcusquinn, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the supervisor's capabilities by introducing a sophisticated container pooling system. This system automates the management of worker containers, ensuring efficient resource utilization, load balancing through round-robin dispatch, and resilience against rate limits or unhealthy containers. It provides a scalable foundation for executing tasks in parallel, improving overall system performance and reliability.

Highlights

  • New Container Pool Manager: A new module, container-pool.sh, has been added to the supervisor to manage a pool of Docker/OrbStack containers for parallel task dispatch.
  • Container Lifecycle and Management: The manager implements full container lifecycle operations including spawning, destroying, health checks, round-robin dispatch selection, and per-container rate limit tracking with configurable cooldowns.
  • Database Integration: New database tables, container_pool and container_dispatch_log, have been introduced to store container registry information, health/rate-limit states, and dispatch history.
  • Dispatch System Integration: The dispatch.sh script has been updated to optionally integrate with the container pool, allowing tasks to be routed to available containers based on a round-robin strategy via the SUPERVISOR_USE_CONTAINER_POOL environment variable.
  • Comprehensive Testing and CLI: The new module includes 45 unit tests covering all pool operations and exposes a pool subcommand through supervisor-helper.sh for command-line management.
Changelog
  • .agents/scripts/supervisor-helper.sh
    • Added pool subcommand routing to integrate the new container pool manager.
    • Included the new container-pool.sh module.
  • .agents/scripts/supervisor/container-pool.sh
    • Introduced a new module for managing Docker/OrbStack container pools.
    • Defined schema for container_pool and container_dispatch_log tables.
    • Implemented functions for container spawning, destruction, health checking, selection, and rate limit management.
    • Provided CLI commands for interacting with the container pool.
  • .agents/scripts/supervisor/database.sh
    • Added migration logic to create the container_pool and container_dispatch_log tables if they do not exist.
    • Included a call to _create_container_pool_schema during database initialization.
  • .agents/scripts/supervisor/dispatch.sh
    • Integrated logic to select a container from the pool using pool_select_container for task dispatch when enabled.
    • Added functionality to record task dispatches to specific containers using pool_record_dispatch.
  • tests/test-container-pool.sh
    • Added a new suite of 45 unit tests for the container-pool.sh module.
    • Tests cover schema creation, container spawning and destruction, health checks, round-robin selection, rate limit tracking, idle cleanup, pool statistics, and CLI command routing.


coderabbitai bot commented Feb 23, 2026

Walkthrough

This PR introduces a comprehensive container pool management system to the supervisor framework. It adds a new pool subcommand with SQLite-backed lifecycle management, health monitoring, round-robin container selection with rate limiting, dispatch logging, and idle cleanup.

Changes

  • CLI Entry & Module Routing — .agents/scripts/supervisor-helper.sh
    Adds pool subcommand dispatch to cmd_pool and sources the new container-pool.sh module; updates help/usage text.
  • Container Pool Implementation — .agents/scripts/supervisor/container-pool.sh
    Introduces a comprehensive pool manager (~850 lines): container spawn/destroy, per-container and pool-wide health checks, round-robin selection with rate-limit awareness, dispatch logging/completion, rate-limit management, idle cleanup, pool statistics, and CLI routing via cmd_pool.
  • Database Integration — .agents/scripts/supervisor/database.sh
    Adds migration logic and an initialization hook to create the container pool schema (tables: container_pool, container_dispatch_log) with indices during init_db and migrations.
  • Dispatch Integration — .agents/scripts/supervisor/dispatch.sh
    Introduces an optional pool-based dispatch path: when SUPERVISOR_USE_CONTAINER_POOL is enabled, selects containers via pool_select_container and records dispatch events.
  • Test Suite — tests/test-container-pool.sh
    Comprehensive test suite (~583 lines) covering schema creation, spawn/destroy, health checks, round-robin selection, rate limiting, completion recording, statistics, cleanup, CLI routing, and Docker invocation verification.

Sequence Diagram

sequenceDiagram
    participant CLI as supervisor-helper.sh pool
    participant PM as pool_*() functions
    participant DB as SQLite Database
    participant Docker as Docker CLI
    
    CLI->>PM: cmd_pool spawn [args]
    PM->>DB: Insert container record (status=starting)
    PM->>Docker: docker run (local or remote)
    Docker-->>PM: Container ID
    PM->>DB: Update with container ID & image
    
    rect rgba(100, 150, 200, 0.5)
    Note over PM,DB: Health Check Loop
    PM->>Docker: docker inspect
    PM->>DB: Update status & health_check_ts
    end
    
    rect rgba(150, 200, 100, 0.5)
    Note over PM,DB: Dispatch Flow (when enabled)
    CLI->>PM: pool_select_container()
    PM->>DB: Query healthy containers (skip rate-limited)
    PM-->>CLI: Container (round-robin)
    CLI->>PM: pool_record_dispatch(container_id)
    PM->>DB: Insert dispatch_log entry
    PM->>DB: Increment dispatch_count
    end
    
    rect rgba(200, 100, 100, 0.5)
    Note over PM,DB: Completion & Rate Limiting
    CLI->>PM: pool_record_completion(container_id, outcome)
    PM->>DB: Update completion_ts
    alt Rate Limited Outcome
        PM->>PM: pool_mark_rate_limited()
        PM->>DB: Set rate_limit_until cooldown
    end
    end
    
    CLI->>PM: pool_destroy_idle()
    PM->>DB: Query idle containers
    PM->>Docker: docker stop/rm
    PM->>DB: Update status=stopped

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🚀 A pool of containers, so ready and wise,
Health checks keeping watch with supervisory eyes,
Round-robin dispatch with rate limits in place,
SQLite tracks every container's grace,
From spawn to idle cleanup, orchestrated with care! 🌊

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Description Check — ✅ Passed (check skipped; CodeRabbit's high-level summary is enabled)
  • Title Check — ✅ Passed (the title clearly summarizes the main change: adding a container pool manager module to the supervisor infrastructure)
  • Docstring Coverage — ✅ Passed (80.77% coverage; the required threshold is 80.00%)


@github-actions

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 40 code smells

[INFO] Recent monitoring activity:
Mon Feb 23 01:18:29 UTC 2026: Code review monitoring started
Mon Feb 23 01:18:29 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 40

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 40
  • VULNERABILITIES: 0

Generated on: Mon Feb 23 01:18:32 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This is an excellent pull request that adds a significant and well-tested container pool manager feature. The new module is well-structured, and the inclusion of a comprehensive test suite is commendable. My review includes a critical fix to prevent error suppression and a suggestion to optimize database queries for better performance, both aligned with established repository rules.

pool_container_id=$(pool_select_container 2>/dev/null) || pool_container_id=""
if [[ -n "$pool_container_id" ]]; then
log_info "Container pool: selected $pool_container_id for $task_id (t1165.2)"
pool_record_dispatch "$pool_container_id" "$task_id" 2>/dev/null || true


Severity: high

The use of 2>/dev/null here suppresses all error output from pool_record_dispatch. This is risky because the function performs database operations that could fail for various reasons (e.g., database locked, disk full, schema mismatch). Suppressing these errors can hide critical problems and make debugging difficult. The project's general rules advise against suppressing stderr in such cases, and specifically recommend against blanket suppression of command errors to ensure issues remain visible for debugging. While || true is used to prevent script exit, 2>/dev/null should be removed to allow potential errors to be logged.

Suggested change
pool_record_dispatch "$pool_container_id" "$task_id" 2>/dev/null || true
pool_record_dispatch "$pool_container_id" "$task_id" || true
References
  1. Avoid using '2>/dev/null' for blanket suppression of command errors in shell scripts to ensure that authentication, syntax, or system issues remain visible for debugging.
  2. In shell scripts with 'set -e' enabled, use '|| true' to prevent the script from exiting when a command like 'jq' fails on an optional lookup. Do not suppress stderr with '2>/dev/null' so that actual syntax or system errors remain visible for debugging.

Comment on lines +759 to +765
local total healthy unhealthy rate_limited stopped failed
total=$(db "$SUPERVISOR_DB" "SELECT COUNT(*) FROM container_pool;" 2>/dev/null || echo "0")
healthy=$(db "$SUPERVISOR_DB" "SELECT COUNT(*) FROM container_pool WHERE status='healthy';" 2>/dev/null || echo "0")
unhealthy=$(db "$SUPERVISOR_DB" "SELECT COUNT(*) FROM container_pool WHERE status='unhealthy';" 2>/dev/null || echo "0")
rate_limited=$(db "$SUPERVISOR_DB" "SELECT COUNT(*) FROM container_pool WHERE status='rate_limited';" 2>/dev/null || echo "0")
stopped=$(db "$SUPERVISOR_DB" "SELECT COUNT(*) FROM container_pool WHERE status='stopped';" 2>/dev/null || echo "0")
failed=$(db "$SUPERVISOR_DB" "SELECT COUNT(*) FROM container_pool WHERE status='failed';" 2>/dev/null || echo "0")


Severity: medium

These six separate database queries to get status counts can be consolidated into a single, more efficient query using conditional aggregation. This reduces the overhead of multiple sqlite3 process invocations and database accesses, making the function faster and more atomic. This aligns with the rule to efficiently fetch multiple aggregate statistics from SQLite using a single query that returns a delimited string. Additionally, the 2>/dev/null in the suggested code should be removed to avoid blanket suppression of errors, ensuring that potential issues like database locks or syntax errors remain visible for debugging, as per repository guidelines. The || echo fallback is sufficient to provide default values without hiding errors.

Suggested change
local total healthy unhealthy rate_limited stopped failed
total=$(db "$SUPERVISOR_DB" "SELECT COUNT(*) FROM container_pool;" 2>/dev/null || echo "0")
healthy=$(db "$SUPERVISOR_DB" "SELECT COUNT(*) FROM container_pool WHERE status='healthy';" 2>/dev/null || echo "0")
unhealthy=$(db "$SUPERVISOR_DB" "SELECT COUNT(*) FROM container_pool WHERE status='unhealthy';" 2>/dev/null || echo "0")
rate_limited=$(db "$SUPERVISOR_DB" "SELECT COUNT(*) FROM container_pool WHERE status='rate_limited';" 2>/dev/null || echo "0")
stopped=$(db "$SUPERVISOR_DB" "SELECT COUNT(*) FROM container_pool WHERE status='stopped';" 2>/dev/null || echo "0")
failed=$(db "$SUPERVISOR_DB" "SELECT COUNT(*) FROM container_pool WHERE status='failed';" 2>/dev/null || echo "0")
local total healthy unhealthy rate_limited stopped failed
local stats_line
stats_line=$(db -separator ' ' "$SUPERVISOR_DB" "
SELECT
COALESCE(COUNT(*), 0),
COALESCE(SUM(CASE WHEN status = 'healthy' THEN 1 ELSE 0 END), 0),
COALESCE(SUM(CASE WHEN status = 'unhealthy' THEN 1 ELSE 0 END), 0),
COALESCE(SUM(CASE WHEN status = 'rate_limited' THEN 1 ELSE 0 END), 0),
COALESCE(SUM(CASE WHEN status = 'stopped' THEN 1 ELSE 0 END), 0),
COALESCE(SUM(CASE WHEN status = 'failed' THEN 1 ELSE 0 END), 0)
FROM container_pool;
" || echo "0 0 0 0 0 0")
read -r total healthy unhealthy rate_limited stopped failed <<<"$stats_line"
References
  1. To efficiently fetch multiple aggregate statistics from SQLite in a shell script, use a single query with subselects that returns a delimited string, and parse it using IFS='|' read.
  2. Avoid using '2>/dev/null' for blanket suppression of command errors in shell scripts to ensure that authentication, syntax, or system issues remain visible for debugging.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.agents/scripts/supervisor/container-pool.sh:
- Around line 182-197: The indirect expansion ${!token_ref} can fail under set
-u when token_ref contains characters invalid for an env var name (e.g. paths
like "foo/bar"); before falling back to the env-var lookup in the token
resolution block, validate token_ref is a legal shell identifier (e.g. matches
/^[a-zA-Z_][a-zA-Z0-9_]*$/) and only then perform the indirect expansion to set
token_value, otherwise skip that branch (and keep the existing warning path).
Update the code around token_ref/token_value/docker_args in the token injection
logic (the block that attempts gopass then env var) to perform this identifier
check and avoid indirect expansion errors.
- Around line 483-490: The SELECT that assigns selected in the container-pool
script must exclude containers with active/incomplete dispatches: update the SQL
inside the db "$SUPERVISOR_DB" call (the query that selects from container_pool)
to add an exclusion such as AND NOT EXISTS (SELECT 1 FROM dispatch WHERE
dispatch.container_id = container_pool.id AND dispatch.completed_at IS NULL)
(or, if your schema uses a status column, AND NOT EXISTS (SELECT 1 FROM dispatch
WHERE dispatch.container_id = container_pool.id AND dispatch.status NOT IN
('succeeded','failed','cancelled'))). This change should be made where selected
is set so the round-robin never picks a container that currently has an
unfinished dispatch.
- Around line 567-577: pool_mark_rate_limited currently interpolates the
cooldown directly into the SQL timestamp expression; validate and sanitize the
cooldown before interpolation by ensuring it's a non-negative integer (e.g.
check with a digit-only regex or coerce with printf '%d' and fallback to the
default CONTAINER_POOL_RATE_LIMIT_COOLDOWN if invalid), optionally clamp to a
reasonable max, then use that sanitized numeric variable in the SQL (leave
sql_escape for container_id unchanged). Update the pool_mark_rate_limited
function to perform this validation/fallback and reference the sanitized
variable (e.g. cooldown_seconds) in the strftime('+${cooldown_seconds} seconds')
fragment.

In @.agents/scripts/supervisor/database.sh:
- Around line 767-776: ensure_db and init_db call the helper
_create_container_pool_schema which is defined in container-pool.sh but
database.sh doesn’t source that file; update database.sh to ensure the function
is available before calling it by adding a defensive check in ensure_db/init_db:
test whether _create_container_pool_schema is defined (e.g., via declare -f or
type) and if not, source container-pool.sh (or abort with a clear error), or
alternatively source container-pool.sh near the top of database.sh to guarantee
availability; reference ensure_db, init_db, and _create_container_pool_schema
when making the change.

In @.agents/scripts/supervisor/dispatch.sh:
- Around line 3222-3234: The pool selection is never wired into routing—after
getting pool_container_id you must set and persist the task's dispatch target
and short-circuit to a pool-specific dispatch branch so the selected container
is actually used. Concretely: when pool_select_container returns a non-empty
pool_container_id, set dispatch_target="pool:$pool_container_id", persist that
value to whatever task metadata/store the dispatcher later reads (so downstream
code sees it), and route the task through the pool-specific dispatch path that
sends the job to that container (and keep calling pool_record_dispatch to update
metrics). Alternatively, only invoke pool_select_container when the subsequent
routing supports a "pool:" dispatch_target; ensure any routing logic (the parts
that read dispatch_target) recognizes the "pool:<id>" prefix and sends the task
to that container.

Comment on lines +182 to +197
# Inject OAuth token if provided
if [[ -n "$token_ref" ]]; then
local token_value=""
# Try gopass first, then env var
if command -v gopass &>/dev/null; then
token_value=$(gopass show "$token_ref" 2>/dev/null || echo "")
fi
if [[ -z "$token_value" ]]; then
# Try as env var name
token_value="${!token_ref:-}"
fi
if [[ -n "$token_value" ]]; then
docker_args+=("-e" "CLAUDE_CODE_OAUTH_TOKEN=$token_value")
else
log_warn "OAuth token ref '$token_ref' could not be resolved — container may lack auth"
fi

⚠️ Potential issue | 🟠 Major

Guard env-var fallback for non‑identifier token refs.

If gopass isn’t available and --token-ref is a path (e.g., foo/bar), ${!token_ref} triggers an “invalid indirect expansion” under set -u and aborts pool_spawn. Only attempt indirect expansion for valid env var names.

Suggested fix
-		if [[ -z "$token_value" ]]; then
-			# Try as env var name
-			token_value="${!token_ref:-}"
-		fi
+		if [[ -z "$token_value" ]]; then
+			# Try as env var name only when token_ref is a valid identifier
+			if [[ "$token_ref" =~ ^[A-Za-z_][A-Za-z0-9_]*$ ]]; then
+				token_value="${!token_ref:-}"
+			fi
+		fi
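The identifier guard suggested above can be exercised in isolation. The sketch below is standalone and hypothetical — resolve_token_env is not the module's actual function — but it demonstrates why the regex check prevents the indirect-expansion failure under set -u when the ref is a gopass-style path:

```shell
# Guarded env-var fallback: only attempt indirect expansion when the
# ref is a legal shell identifier, so a path-like ref such as
# "tokens/claude" can never trigger a bad-substitution error.
set -u

resolve_token_env() {
	local token_ref="$1"
	local token_value=""
	if [[ "$token_ref" =~ ^[A-Za-z_][A-Za-z0-9_]*$ ]]; then
		token_value="${!token_ref:-}"  # safe: token_ref is a valid identifier
	fi
	printf '%s\n' "$token_value"
}

export MY_OAUTH_TOKEN="tok-123"
resolve_token_env "MY_OAUTH_TOKEN"  # tok-123
resolve_token_env "tokens/claude"   # empty: path-style refs are skipped safely
```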

Comment on lines +483 to +490
local selected
selected=$(db "$SUPERVISOR_DB" "
SELECT id FROM container_pool
WHERE status = 'healthy'
AND (rate_limit_until IS NULL OR rate_limit_until <= strftime('%Y-%m-%dT%H:%M:%SZ','now'))
$host_clause
ORDER BY COALESCE(last_dispatch_at, '1970-01-01T00:00:00Z') ASC
LIMIT 1;

⚠️ Potential issue | 🟠 Major

Exclude containers with active dispatches from selection.

Round‑robin should not pick a container that already has an incomplete dispatch. Otherwise a busy container can receive multiple tasks.

Suggested fix
 	selected=$(db "$SUPERVISOR_DB" "
 		SELECT id FROM container_pool
 		WHERE status = 'healthy'
 		  AND (rate_limit_until IS NULL OR rate_limit_until <= strftime('%Y-%m-%dT%H:%M:%SZ','now'))
 		  $host_clause
+		  AND NOT EXISTS (
+		      SELECT 1 FROM container_dispatch_log cdl
+		      WHERE cdl.container_id = container_pool.id AND cdl.completed_at IS NULL
+		  )
 		ORDER BY COALESCE(last_dispatch_at, '1970-01-01T00:00:00Z') ASC
 		LIMIT 1;
 	" 2>/dev/null || echo "")

Comment on lines +567 to +577
pool_mark_rate_limited() {
local container_id="$1"
local cooldown="${2:-$CONTAINER_POOL_RATE_LIMIT_COOLDOWN}"

db "$SUPERVISOR_DB" "
UPDATE container_pool
SET status = 'rate_limited',
rate_limit_until = strftime('%Y-%m-%dT%H:%M:%SZ','now','+${cooldown} seconds'),
rate_limit_count = rate_limit_count + 1,
updated_at = strftime('%Y-%m-%dT%H:%M:%SZ','now')
WHERE id = '$(sql_escape "$container_id")';

⚠️ Potential issue | 🟠 Major

Validate cooldown to prevent SQL injection / malformed timestamps.

cooldown is interpolated directly into SQL; a non‑numeric value can break the query or inject SQL. Validate as an integer and fall back to the default.

Suggested fix
 pool_mark_rate_limited() {
 	local container_id="$1"
 	local cooldown="${2:-$CONTAINER_POOL_RATE_LIMIT_COOLDOWN}"
+
+	if ! [[ "$cooldown" =~ ^[0-9]+$ ]]; then
+		log_warn "Invalid cooldown '$cooldown' — falling back to default"
+		cooldown="$CONTAINER_POOL_RATE_LIMIT_COOLDOWN"
+	fi
 
 	db "$SUPERVISOR_DB" "
 		UPDATE container_pool

Comment on lines +767 to +776

```shell
	# Migrate: create container_pool tables if missing (t1165.2)
	# Container pool manager — spawn/destroy containers, health checks,
	# round-robin dispatch, per-container rate limit tracking.
	local has_container_pool
	has_container_pool=$(db "$SUPERVISOR_DB" "SELECT count(*) FROM sqlite_master WHERE type='table' AND name='container_pool';" 2>/dev/null || echo "0")
	if [[ "$has_container_pool" -eq 0 ]]; then
		log_info "Creating container_pool tables (t1165.2)..."
		_create_container_pool_schema
		log_success "Created container_pool and container_dispatch_log tables (t1165.2)"
	fi
```

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

```shell
#!/bin/bash
# Find all scripts that source database.sh
echo "=== Scripts sourcing database.sh ==="
rg -n "source .*database\.sh|^\s*\. .*database\.sh" -g '*.sh' --no-heading

echo -e "\n=== Scripts sourcing container-pool.sh ==="
rg -n "source .*container-pool\.sh|^\s*\. .*container-pool\.sh" -g '*.sh' --no-heading

echo -e "\n=== Entry points (main scripts in supervisor/) ==="
fd -t f -e sh "^[^.]" .agents/scripts/supervisor/ | head -20

echo -e "\n=== Check supervisor/ directory structure ==="
ls -la .agents/scripts/supervisor/ | grep -E "\.sh$"
```

Repository: marcusquinn/aidevops

Length of output: 3856


🏁 Script executed:

```shell
#!/bin/bash
echo "=== Check backfill-blocked-comments.sh calls to ensure_db/init_db ==="
rg -n "ensure_db|init_db" .agents/scripts/supervisor/backfill-blocked-comments.sh

echo -e "\n=== Check test-dual-cli-e2e.sh calls to ensure_db/init_db (sample) ==="
rg -n "ensure_db|init_db" tests/test-dual-cli-e2e.sh | head -10

echo -e "\n=== Check test-dispatch-claude-cli.sh calls to ensure_db/init_db (sample) ==="
rg -n "ensure_db|init_db" tests/test-dispatch-claude-cli.sh | head -10

echo -e "\n=== Check if _create_container_pool_schema is conditionally defined ==="
rg -n "_create_container_pool_schema" .agents/scripts/supervisor/database.sh

echo -e "\n=== Check sourcing order in supervisor-helper.sh ==="
sed -n '210,245p' .agents/scripts/supervisor-helper.sh
```

Repository: marcusquinn/aidevops

Length of output: 2290


🏁 Script executed:

```shell
#!/bin/bash
echo "=== Check for module-level calls to ensure_db/init_db in database.sh ==="
# Look for calls outside of function definitions
rg -n "ensure_db|init_db" .agents/scripts/supervisor/database.sh | head -20

echo -e "\n=== Look at the actual function definitions and calls in database.sh ==="
rg -B2 -A5 "^ensure_db\(\)|^init_db\(\)" .agents/scripts/supervisor/database.sh | head -40

echo -e "\n=== Check if backfill-blocked-comments.sh actually needs container-pool ==="
cat .agents/scripts/supervisor/backfill-blocked-comments.sh
```

Repository: marcusquinn/aidevops

Length of output: 4504


🏁 Script executed:

```shell
#!/bin/bash
echo "=== Context around lines 941 and 973 (ensure_db calls) ==="
sed -n '935,945p' .agents/scripts/supervisor/database.sh
echo "---"
sed -n '968,978p' .agents/scripts/supervisor/database.sh

echo -e "\n=== Check what function/block contains these calls ==="
sed -n '900,950p' .agents/scripts/supervisor/database.sh | cat -n
```

Repository: marcusquinn/aidevops

Length of output: 2950


🏁 Script executed:

```shell
#!/bin/bash
echo "=== Check end of database.sh for module-level invocations ==="
tail -50 .agents/scripts/supervisor/database.sh

echo -e "\n=== Check if database.sh sources container-pool.sh internally ==="
head -50 .agents/scripts/supervisor/database.sh | grep -i "source.*container-pool"
```

Repository: marcusquinn/aidevops

Length of output: 1368


Potential scoping issue: `_create_container_pool_schema` is called by `ensure_db`/`init_db` but may not be defined at call time.

`ensure_db` (lines 767-776) and `init_db` (line 930) invoke `_create_container_pool_schema`, which is defined in container-pool.sh. However, database.sh does not source container-pool.sh internally. While supervisor-helper.sh correctly sources container-pool.sh after database.sh, direct consumers of database.sh (such as backfill-blocked-comments.sh or test utilities) may source only database.sh without container-pool.sh. This creates a fragile dependency: if any code path manually sources database.sh and later calls `ensure_db` or `init_db`, it will fail with an undefined-function error.

Recommendation: either (1) source container-pool.sh at the start of database.sh to guarantee the function is available, (2) move `_create_container_pool_schema` into database.sh, or (3) add a defensive check in `ensure_db`/`init_db` that validates the function exists before calling it.
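A minimal sketch of option (3), as it might look near the top of database.sh. The guard name `_require_container_pool_schema_fn` and the `SUPERVISOR_SCRIPT_DIR` variable are hypothetical, not names from the PR:

```shell
#!/usr/bin/env bash
# Hypothetical sketch: guard before calling _create_container_pool_schema.
# Assumes container-pool.sh sits in the same directory as database.sh.

SUPERVISOR_SCRIPT_DIR="${SUPERVISOR_SCRIPT_DIR:-$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)}"

_require_container_pool_schema_fn() {
	# declare -F exits 0 only when the function is already defined
	if ! declare -F _create_container_pool_schema >/dev/null; then
		if [[ -f "$SUPERVISOR_SCRIPT_DIR/container-pool.sh" ]]; then
			# shellcheck source=/dev/null
			source "$SUPERVISOR_SCRIPT_DIR/container-pool.sh"
		else
			echo "ERROR: _create_container_pool_schema unavailable; source container-pool.sh first" >&2
			return 1
		fi
	fi
	return 0
}

# ensure_db/init_db would then call:
#   _require_container_pool_schema_fn || return 1
#   _create_container_pool_schema
```

This keeps database.sh usable on its own while still failing loudly if the pool module is genuinely missing.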

🤖 Prompt for AI Agents

```text
Verify each finding against the current code and only fix it if needed.

In @.agents/scripts/supervisor/database.sh around lines 767 - 776, ensure_db and
init_db call the helper _create_container_pool_schema which is defined in
container-pool.sh but database.sh doesn’t source that file; update database.sh
to ensure the function is available before calling it by adding a defensive
check in ensure_db/init_db: test whether _create_container_pool_schema is
defined (e.g., via declare -f or type) and if not, source container-pool.sh (or
abort with a clear error), or alternatively source container-pool.sh near the
top of database.sh to guarantee availability; reference ensure_db, init_db, and
_create_container_pool_schema when making the change.
```

Comment on lines +3222 to +3234

```shell
	# t1165.2: Container pool dispatch — if pool has healthy containers,
	# select one via round-robin and record the dispatch. The container ID
	# is stored in dispatch_target as "pool:<container_id>" for tracking.
	# Pool dispatch is opt-in: only used when SUPERVISOR_USE_CONTAINER_POOL=true
	# or when the task has dispatch_target=pool.
	local pool_container_id=""
	if [[ "${SUPERVISOR_USE_CONTAINER_POOL:-false}" == "true" ]]; then
		pool_container_id=$(pool_select_container 2>/dev/null) || pool_container_id=""
		if [[ -n "$pool_container_id" ]]; then
			log_info "Container pool: selected $pool_container_id for $task_id (t1165.2)"
			pool_record_dispatch "$pool_container_id" "$task_id" 2>/dev/null || true
		fi
	fi
```

⚠️ Potential issue | 🟠 Major

Pool selection is recorded but never used for routing.

`pool_select_container` sets `pool_container_id` and updates pool stats, yet the dispatch still follows the normal local/remote path and never uses the selected container. This makes pool metrics incorrect, and the comment about `dispatch_target=pool` isn't honored. Either wire the selected container into a pool-specific dispatch branch (and persist it on the task), or gate selection so it only runs when an actual pool dispatch path exists.
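A hedged sketch of the first option. `dispatch_to_container` and `dispatch_normal` are hypothetical placeholders for a pool-specific path and the existing local/remote path, and the `tasks` table/column names are assumptions, not confirmed schema from this PR:

```shell
#!/usr/bin/env bash
# Hypothetical sketch: make the selected container actually drive routing.
# pool_select_container, pool_record_dispatch, db, and sql_escape are assumed
# to be the helpers from container-pool.sh / database.sh.

route_with_pool() {
	local task_id="$1"
	local pool_container_id=""

	if [[ "${SUPERVISOR_USE_CONTAINER_POOL:-false}" == "true" ]]; then
		pool_container_id=$(pool_select_container 2>/dev/null) || pool_container_id=""
	fi

	if [[ -n "$pool_container_id" ]]; then
		# Persist the target so downstream readers of dispatch_target see it
		db "$SUPERVISOR_DB" "UPDATE tasks
			SET dispatch_target = 'pool:$(sql_escape "$pool_container_id")'
			WHERE id = '$(sql_escape "$task_id")';"
		pool_record_dispatch "$pool_container_id" "$task_id" || true
		dispatch_to_container "$pool_container_id" "$task_id"
		return $?
	fi

	dispatch_normal "$task_id"  # existing local/remote path
}
```

With this shape, pool metrics only increment when the task genuinely goes to the selected container, and any code reading `dispatch_target` can recognize the `pool:<id>` prefix.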

🤖 Prompt for AI Agents

```text
Verify each finding against the current code and only fix it if needed.

In @.agents/scripts/supervisor/dispatch.sh around lines 3222 - 3234, The pool
selection is never wired into routing—after getting pool_container_id you must
set and persist the task's dispatch target and short-circuit to a pool-specific
dispatch branch so the selected container is actually used. Concretely: when
pool_select_container returns a non-empty pool_container_id, set
dispatch_target="pool:$pool_container_id", persist that value to whatever task
metadata/store the dispatcher later reads (so downstream code sees it), and
route the task through the pool-specific dispatch path that sends the job to
that container (and keep calling pool_record_dispatch to update metrics).
Alternatively, only invoke pool_select_container when the subsequent routing
supports a "pool:" dispatch_target; ensure any routing logic (the parts that
read dispatch_target) recognizes the "pool:<id>" prefix and sends the task to
that container.
```
