t1423: Add priority-class worker reservations for per-repo concurrency fairness #3966
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -41,10 +41,29 @@ MAX_WORKERS=$(cat ~/.aidevops/logs/pulse-max-workers 2>/dev/null || echo 4) | |
| # Count running workers (only .opencode binaries, not node launchers) | ||
| WORKER_COUNT=$(ps axo command | grep '/full-loop' | grep '\.opencode' | grep -v grep | wc -l | tr -d ' ') | ||
| AVAILABLE=$((MAX_WORKERS - WORKER_COUNT)) | ||
|
|
||
| # Priority-class allocations (t1423) — read from pre-fetched state | ||
| # The "Priority-Class Worker Allocations" section in the pre-fetched state | ||
| # shows PRODUCT_MIN and TOOLING_MAX. Read these values: | ||
| PRODUCT_MIN=$(grep '^PRODUCT_MIN=' ~/.aidevops/logs/pulse-priority-allocations 2>/dev/null | cut -d= -f2 || echo 0) | ||
| TOOLING_MAX=$(grep '^TOOLING_MAX=' ~/.aidevops/logs/pulse-priority-allocations 2>/dev/null | cut -d= -f2 || echo "$MAX_WORKERS") | ||
| ``` | ||
|
|
||
| If `AVAILABLE <= 0`: you can still merge ready PRs, but don't dispatch new workers. | ||
|
|
||
| ### Priority-class enforcement (t1423) | ||
|
|
||
| Worker slots are partitioned between **product** repos (`"priority": "product"` in repos.json) and **tooling** repos (`"priority": "tooling"`). Product repos get a guaranteed minimum share (default 60%) to prevent tooling hygiene from starving user-facing work. | ||
|
|
||
| **Before dispatching each worker, apply this check:** | ||
|
|
||
| 1. Determine the target repo's priority class (from the pre-fetched state repo header or repos.json). | ||
| 2. Count running workers per class: scan the Active Workers section — match each worker's `--dir` path to a repo in repos.json to determine its class. | ||
| 3. **If dispatching a tooling worker:** check whether product-class workers are using fewer than `PRODUCT_MIN` slots. If `product_active < PRODUCT_MIN` AND product repos have pending work (open issues or failing PRs), the remaining product slots are **reserved** — skip the tooling dispatch and look for product work instead. | ||
| 4. **If dispatching a product worker:** always proceed — product has no ceiling (only a floor). | ||
| 5. **Exemptions:** Merges (priority 1) and CI-fix dispatches (priority 2) are exempt from class checks — they always proceed regardless of class. | ||
| 6. **Soft reservation:** When product repos have no pending work (no open issues, no failing-CI PRs, no orphaned PRs), their reserved slots become available for tooling. The reservation protects product work when it exists, not when it doesn't. | ||
|
Comment on lines +62 to +65
Contributor
Reserve product slots for any dispatchable product work, not just issues/failing PRs. This check treats product repos as idle when they only have mission features, salvage work, review-fix work, or approved debt tasks pending. In that case tooling can still consume the reserved pool, which defeats the fairness guarantee for product repos.
Suggested fix:
-3. **If dispatching a tooling worker:** check whether product-class workers are using fewer than `PRODUCT_MIN` slots. If `product_active < PRODUCT_MIN` AND product repos have pending work (open issues or failing PRs), the remaining product slots are **reserved** — skip the tooling dispatch and look for product work instead.
+3. **If dispatching a tooling worker:** check whether product-class workers are using fewer than `PRODUCT_MIN` slots. If `product_active < PRODUCT_MIN` AND product repos have any dispatchable pending work (for example: open issues, failing/review-fix PRs, orphaned/salvage PRs, active mission features, or approved debt tasks), the remaining product slots are **reserved** — skip the tooling dispatch and look for product work instead.
@@
-6. **Soft reservation:** When product repos have no pending work (no open issues, no failing-CI PRs, no orphaned PRs), their reserved slots become available for tooling. The reservation protects product work when it exists, not when it doesn't.
+6. **Soft reservation:** When product repos have no dispatchable pending work, their reserved slots become available for tooling. The reservation protects product work when it exists, not when it doesn't.
LanguageTool [style] ~63: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym. (ENGLISH_WORD_REPEAT_BEGINNING_RULE)
||
|
|
||
| ## Step 2: Use Pre-Fetched State | ||
|
|
||
| **The wrapper has ALREADY fetched open PRs and issues for all pulse-enabled repos.** The data is in your prompt above (between `--- PRE-FETCHED STATE ---` markers). Do NOT re-fetch with `gh pr list` or `gh issue list` — that wastes time and was the root cause of the "only processes first repo" bug (the agent would spend all its context analyzing the first repo's fetch results and never reach the others). | ||
|
|
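The six-step check under "Priority-class enforcement (t1423)" can be sketched as a small shell predicate. This is a minimal illustration, not code from the PR: `may_dispatch` and its arguments are hypothetical names, the caller is assumed to have already counted running product workers and to know whether product repos have dispatchable work, and the `AVAILABLE > 0` check is assumed to happen separately.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): decide whether a dispatch may
# proceed under the priority-class rules. Returns 0 to dispatch, 1 to skip.
#   $1 class            "product" or "tooling"
#   $2 product_active   running product-class workers
#   $3 product_min      PRODUCT_MIN from the allocation file
#   $4 product_pending  1 if product repos have dispatchable work, else 0
#   $5 exempt           1 for merges and CI-fix dispatches, else 0 (optional)
may_dispatch() {
	local class="$1" product_active="$2" product_min="$3"
	local product_pending="$4" exempt="${5:-0}"

	# Rule 5: merges and CI-fix dispatches bypass class checks.
	[[ "$exempt" -eq 1 ]] && return 0
	# Rule 4: product has a floor, not a ceiling.
	[[ "$class" == "product" ]] && return 0
	# Rule 6: the reservation is soft and lapses when product has no work.
	[[ "$product_pending" -eq 0 ]] && return 0
	# Rule 3: product is below its floor and has work, so the slot is reserved.
	[[ "$product_active" -lt "$product_min" ]] && return 1
	return 0
}

# Example: 2 product workers running, floor of 5, product work pending.
if may_dispatch tooling 2 5 1; then echo "dispatch"; else echo "skip"; fi   # prints "skip"
```

The sketch follows the literal wording of rule 3, which blocks every tooling dispatch while product is below its floor. A variant that instead compares running tooling workers against `TOOLING_MAX` would let tooling use its own share in that situation; which reading is intended is not settled by the text above.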
@@ -610,7 +629,7 @@ batch-strategy-helper.sh validate --tasks "$TASKS_JSON" | |
| 2. PRs with failing CI or review feedback → fix (uses a slot, but closer to done than new issues) | ||
| 3. Issues labelled `priority:high` or `bug` | ||
| 4. Active mission features (keeps multi-day projects moving — see Step 3.5) | ||
| 5. Product repos (`"priority": "product"` in repos.json) over tooling | ||
| 5. Product repos (`"priority": "product"` in repos.json) over tooling — **enforced by priority-class reservations (t1423)**. Product repos have `PRODUCT_MIN` reserved slots; tooling cannot consume them when product work is pending. See "Priority-class enforcement" in Step 1. | ||
| 6. Smaller/simpler tasks over large ones (faster throughput) | ||
| 7. `quality-debt` issues (unactioned review feedback from merged PRs) | ||
| 8. `simplification-debt` issues (human-approved simplification opportunities) | ||
|
|
@@ -1027,7 +1046,7 @@ Output a brief summary of what you did (past tense), then exit. | |
| 3. **NEVER close an issue without a comment.** The comment must explain why and link to the PR(s) or evidence. Silent closes are audit failures. | ||
| 4. **NEVER use `claude` CLI.** Always `opencode run`. | ||
| 5. **NEVER include private repo names** in public issue titles/bodies/comments. | ||
| 6. **NEVER exceed MAX_WORKERS.** Count before dispatching. | ||
| 6. **NEVER exceed MAX_WORKERS or violate priority-class reservations.** Count before dispatching. Check class allocations (Step 1) — tooling workers must not consume product-reserved slots when product work is pending. | ||
|
Contributor
Mirror the Step 1 exemptions in the hard rule. Step 1 explicitly allows merges and priority-2 CI-fix dispatches to bypass class reservations, but this hard rule reads like an absolute prohibition. Because the hard-rules section has stronger wording, the agent can end up blocking the exempt cases you intended to keep moving.
Suggested fix:
-6. **NEVER exceed MAX_WORKERS or violate priority-class reservations.** Count before dispatching. Check class allocations (Step 1) — tooling workers must not consume product-reserved slots when product work is pending.
+6. **NEVER exceed MAX_WORKERS.** For normal dispatches, do not violate priority-class reservations. Count before dispatching. Check class allocations (Step 1) — tooling workers must not consume product-reserved slots when product work is pending. **Exemption:** merges and priority-2 CI-fix dispatches still proceed as described in Step 1.
LanguageTool [style] ~1049: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym. (ENGLISH_WORD_REPEAT_BEGINNING_RULE)
||
| 7. **Do your job completely, then exit.** Don't loop or re-analyze — one pass through all repos, act on everything, exit. | ||
| 8. **NEVER create "pulse summary" or "supervisor log" issues.** The pulse runs every 2 minutes — creating an issue per cycle produces hundreds of spam issues per day. Your output text IS the log (it's captured by the wrapper to `~/.aidevops/logs/pulse.log`). The audit trail lives in PR/issue comments on the items you acted on, not in separate summary issues. | ||
| 9. **NEVER create an issue if one already exists for the same task ID.** Before `gh issue create`, check `gh issue list --repo <slug> --search "tNNN" --state all` to see if an issue with that task ID prefix already exists. If it does (open or closed), use the existing one — don't create a duplicate. This applies to both issue-sync-helper and manual issue creation. | ||
|
|
||
| Original file line number | Diff line number | Diff line change | ||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -63,6 +63,7 @@ RAM_RESERVE_MB="${RAM_RESERVE_MB:-8192}" # 8 GB reserved for OS | |||||||||||||||
| MAX_WORKERS_CAP="${MAX_WORKERS_CAP:-8}" # Hard ceiling regardless of RAM | ||||||||||||||||
| QUALITY_SWEEP_INTERVAL="${QUALITY_SWEEP_INTERVAL:-86400}" # 24 hours between sweeps | ||||||||||||||||
| DAILY_PR_CAP="${DAILY_PR_CAP:-5}" # Max PRs created per repo per day (GH#3821) | ||||||||||||||||
| PRODUCT_RESERVATION_PCT="${PRODUCT_RESERVATION_PCT:-60}" # % of worker slots reserved for product repos (t1423) | ||||||||||||||||
|
|
||||||||||||||||
| # Process guard limits (t1398) | ||||||||||||||||
| CHILD_RSS_LIMIT_KB="${CHILD_RSS_LIMIT_KB:-2097152}" # 2 GB default — kill child if RSS exceeds this | ||||||||||||||||
|
|
@@ -82,6 +83,7 @@ RAM_RESERVE_MB=$(_validate_int RAM_RESERVE_MB "$RAM_RESERVE_MB" 8192) | |||||||||||||||
| MAX_WORKERS_CAP=$(_validate_int MAX_WORKERS_CAP "$MAX_WORKERS_CAP" 8) | ||||||||||||||||
| QUALITY_SWEEP_INTERVAL=$(_validate_int QUALITY_SWEEP_INTERVAL "$QUALITY_SWEEP_INTERVAL" 86400) | ||||||||||||||||
| DAILY_PR_CAP=$(_validate_int DAILY_PR_CAP "$DAILY_PR_CAP" 5 1) | ||||||||||||||||
| PRODUCT_RESERVATION_PCT=$(_validate_int PRODUCT_RESERVATION_PCT "$PRODUCT_RESERVATION_PCT" 60 0) | ||||||||||||||||
| CHILD_RSS_LIMIT_KB=$(_validate_int CHILD_RSS_LIMIT_KB "$CHILD_RSS_LIMIT_KB" 2097152 1) | ||||||||||||||||
| CHILD_RUNTIME_LIMIT=$(_validate_int CHILD_RUNTIME_LIMIT "$CHILD_RUNTIME_LIMIT" 1800 1) | ||||||||||||||||
| SHELLCHECK_RSS_LIMIT_KB=$(_validate_int SHELLCHECK_RSS_LIMIT_KB "$SHELLCHECK_RSS_LIMIT_KB" 1048576 1) | ||||||||||||||||
|
|
@@ -312,6 +314,9 @@ prefetch_state() { | |||||||||||||||
| # Append repo hygiene data for LLM triage (t1417) | ||||||||||||||||
| prefetch_hygiene >>"$STATE_FILE" | ||||||||||||||||
|
|
||||||||||||||||
| # Append priority-class worker allocations (t1423) | ||||||||||||||||
| _append_priority_allocations >>"$STATE_FILE" | ||||||||||||||||
|
|
||||||||||||||||
| # Export PULSE_SCOPE_REPOS — comma-separated list of repo slugs that | ||||||||||||||||
| # workers are allowed to create PRs/branches on (t1405, GH#2928). | ||||||||||||||||
| # Workers CAN file issues on any repo (cross-repo self-improvement), | ||||||||||||||||
|
|
@@ -716,6 +721,54 @@ prefetch_active_workers() { | |||||||||||||||
| return 0 | ||||||||||||||||
| } | ||||||||||||||||
|
|
||||||||||||||||
| ####################################### | ||||||||||||||||
| # Append priority-class worker allocations to state file (t1423) | ||||||||||||||||
| # | ||||||||||||||||
| # Reads the allocation file written by calculate_priority_allocations() | ||||||||||||||||
| # and formats it as a section the pulse agent can act on. | ||||||||||||||||
| # | ||||||||||||||||
| # The pulse agent uses this to enforce soft reservations: product repos | ||||||||||||||||
| # get a guaranteed minimum share of worker slots, tooling gets the rest. | ||||||||||||||||
| # When one class has no pending work, the other can use freed slots. | ||||||||||||||||
| # | ||||||||||||||||
| # Output: allocation summary to stdout (appended to STATE_FILE by caller) | ||||||||||||||||
| ####################################### | ||||||||||||||||
| _append_priority_allocations() { | ||||||||||||||||
| local alloc_file="${HOME}/.aidevops/logs/pulse-priority-allocations" | ||||||||||||||||
|
|
||||||||||||||||
| echo "" | ||||||||||||||||
| echo "# Priority-Class Worker Allocations (t1423)" | ||||||||||||||||
| echo "" | ||||||||||||||||
|
|
||||||||||||||||
| if [[ ! -f "$alloc_file" ]]; then | ||||||||||||||||
| echo "- Allocation data not available — using flat pool (no reservations)" | ||||||||||||||||
| echo "" | ||||||||||||||||
| return 0 | ||||||||||||||||
| fi | ||||||||||||||||
|
|
||||||||||||||||
| # Read allocation values | ||||||||||||||||
| local max_workers product_repos tooling_repos product_min tooling_max reservation_pct | ||||||||||||||||
| max_workers=$(grep '^MAX_WORKERS=' "$alloc_file" | cut -d= -f2) || max_workers=4 | ||||||||||||||||
| product_repos=$(grep '^PRODUCT_REPOS=' "$alloc_file" | cut -d= -f2) || product_repos=0 | ||||||||||||||||
| tooling_repos=$(grep '^TOOLING_REPOS=' "$alloc_file" | cut -d= -f2) || tooling_repos=0 | ||||||||||||||||
| product_min=$(grep '^PRODUCT_MIN=' "$alloc_file" | cut -d= -f2) || product_min=0 | ||||||||||||||||
| tooling_max=$(grep '^TOOLING_MAX=' "$alloc_file" | cut -d= -f2) || tooling_max=0 | ||||||||||||||||
| reservation_pct=$(grep '^PRODUCT_RESERVATION_PCT=' "$alloc_file" | cut -d= -f2) || reservation_pct=60 | ||||||||||||||||
|
|
||||||||||||||||
| echo "Worker pool: **${max_workers}** total slots" | ||||||||||||||||
| echo "Product repos (${product_repos}): **${product_min}** reserved slots (${reservation_pct}% minimum)" | ||||||||||||||||
| echo "Tooling repos (${tooling_repos}): **${tooling_max}** slots (remainder)" | ||||||||||||||||
| echo "" | ||||||||||||||||
| echo "**Enforcement rules:**" | ||||||||||||||||
| echo "- Before dispatching a tooling-repo worker, check: are product-repo workers using fewer than ${product_min} slots? If yes, the remaining product slots are reserved — do NOT fill them with tooling work." | ||||||||||||||||
| echo "- If product repos have no pending work (no open issues, no failing PRs), their reserved slots become available for tooling." | ||||||||||||||||
| echo "- If all ${max_workers} slots are needed for product work, tooling gets 0 (product reservation is a minimum, not a maximum)." | ||||||||||||||||
| echo "- Merges (priority 1) and CI fixes (priority 2) are exempt — they always proceed regardless of class." | ||||||||||||||||
| echo "" | ||||||||||||||||
|
|
||||||||||||||||
| return 0 | ||||||||||||||||
| } | ||||||||||||||||
|
|
||||||||||||||||
| ####################################### | ||||||||||||||||
| # Pre-fetch repo hygiene data for LLM triage (t1417) | ||||||||||||||||
| # | ||||||||||||||||
|
|
@@ -2948,6 +3001,7 @@ main() { | |||||||||||||||
| cleanup_worktrees | ||||||||||||||||
| cleanup_stashes | ||||||||||||||||
| calculate_max_workers | ||||||||||||||||
| calculate_priority_allocations | ||||||||||||||||
| check_session_count >/dev/null | ||||||||||||||||
|
|
||||||||||||||||
| # Run housekeeping BEFORE the pulse — these are shell-level operations | ||||||||||||||||
|
|
@@ -3091,6 +3145,91 @@ calculate_max_workers() { | |||||||||||||||
| return 0 | ||||||||||||||||
| } | ||||||||||||||||
|
|
||||||||||||||||
| ####################################### | ||||||||||||||||
| # Calculate priority-class worker allocations (t1423) | ||||||||||||||||
| # | ||||||||||||||||
| # Reads repos.json to count product vs tooling repos, then computes | ||||||||||||||||
| # per-class slot reservations based on PRODUCT_RESERVATION_PCT. | ||||||||||||||||
| # | ||||||||||||||||
| # Product repos get a guaranteed minimum share of worker slots. | ||||||||||||||||
| # Tooling repos get the remainder. When one class has no pending work, | ||||||||||||||||
| # the other class can use the freed slots (soft reservation). | ||||||||||||||||
| # | ||||||||||||||||
| # Output: writes allocation data to pulse-priority-allocations file | ||||||||||||||||
| # and appends a summary section to STATE_FILE for the pulse agent. | ||||||||||||||||
| # | ||||||||||||||||
| # Depends on: calculate_max_workers() having run first (reads pulse-max-workers) | ||||||||||||||||
| ####################################### | ||||||||||||||||
| calculate_priority_allocations() { | ||||||||||||||||
| local repos_json="${REPOS_JSON}" | ||||||||||||||||
| local max_workers_file="${HOME}/.aidevops/logs/pulse-max-workers" | ||||||||||||||||
| local alloc_file="${HOME}/.aidevops/logs/pulse-priority-allocations" | ||||||||||||||||
|
|
||||||||||||||||
| if [[ ! -f "$repos_json" ]] || ! command -v jq &>/dev/null; then | ||||||||||||||||
| echo "[pulse-wrapper] repos.json or jq not available — skipping priority allocations" >>"$LOGFILE" | ||||||||||||||||
| return 0 | ||||||||||||||||
|
Comment on lines +3168 to +3170
Contributor
Delete stale allocation state on the skip path. If …
Suggested fix:
 if [[ ! -f "$repos_json" ]] || ! command -v jq &>/dev/null; then
+ rm -f "$alloc_file"
 echo "[pulse-wrapper] repos.json or jq not available — skipping priority allocations" >>"$LOGFILE"
 return 0
 fi
||||||||||||||||
| fi | ||||||||||||||||
|
|
||||||||||||||||
| local max_workers | ||||||||||||||||
| max_workers=$(cat "$max_workers_file" 2>/dev/null || echo 4) | ||||||||||||||||
| [[ "$max_workers" =~ ^[0-9]+$ ]] || max_workers=4 | ||||||||||||||||
|
|
||||||||||||||||
| # Count pulse-enabled repos by priority class (single jq pass) | ||||||||||||||||
| local product_repos tooling_repos | ||||||||||||||||
| read -r product_repos tooling_repos < <(jq -r ' | ||||||||||||||||
| .initialized_repos | | ||||||||||||||||
| map(select(.pulse == true and (.local_only // false) == false and .slug != "")) | | ||||||||||||||||
| [ | ||||||||||||||||
| (map(select(.priority == "product")) | length), | ||||||||||||||||
| (map(select(.priority == "tooling")) | length) | ||||||||||||||||
| ] | @tsv | ||||||||||||||||
| ' "$repos_json" 2>/dev/null) || true | ||||||||||||||||
| product_repos=${product_repos:-0} | ||||||||||||||||
| tooling_repos=${tooling_repos:-0} | ||||||||||||||||
| [[ "$product_repos" =~ ^[0-9]+$ ]] || product_repos=0 | ||||||||||||||||
| [[ "$tooling_repos" =~ ^[0-9]+$ ]] || tooling_repos=0 | ||||||||||||||||
|
|
||||||||||||||||
| # Calculate reservations | ||||||||||||||||
| # product_min = ceil(max_workers * PRODUCT_RESERVATION_PCT / 100) | ||||||||||||||||
| # Using integer arithmetic: ceil(a/b) = (a + b - 1) / b | ||||||||||||||||
| local product_min tooling_max | ||||||||||||||||
| if [[ "$product_repos" -eq 0 ]]; then | ||||||||||||||||
| # No product repos — all slots available for tooling | ||||||||||||||||
| product_min=0 | ||||||||||||||||
| tooling_max="$max_workers" | ||||||||||||||||
| elif [[ "$tooling_repos" -eq 0 ]]; then | ||||||||||||||||
| # No tooling repos — all slots available for product | ||||||||||||||||
| product_min="$max_workers" | ||||||||||||||||
| tooling_max=0 | ||||||||||||||||
| else | ||||||||||||||||
| product_min=$(((max_workers * PRODUCT_RESERVATION_PCT + 99) / 100)) | ||||||||||||||||
| # Ensure product_min doesn't exceed max_workers | ||||||||||||||||
| if [[ "$product_min" -gt "$max_workers" ]]; then | ||||||||||||||||
| product_min="$max_workers" | ||||||||||||||||
| fi | ||||||||||||||||
| # Ensure at least 1 slot for tooling when tooling repos exist | ||||||||||||||||
| # but only when there are multiple slots to distribute (with 1 slot, | ||||||||||||||||
| # product keeps it — the reservation is a minimum guarantee) | ||||||||||||||||
| if [[ "$max_workers" -gt 1 && "$product_min" -ge "$max_workers" && "$tooling_repos" -gt 0 ]]; then | ||||||||||||||||
| product_min=$((max_workers - 1)) | ||||||||||||||||
| fi | ||||||||||||||||
| tooling_max=$((max_workers - product_min)) | ||||||||||||||||
| fi | ||||||||||||||||
|
|
||||||||||||||||
| # Write allocation file (key=value, readable by pulse.md) | ||||||||||||||||
| { | ||||||||||||||||
| echo "MAX_WORKERS=${max_workers}" | ||||||||||||||||
| echo "PRODUCT_REPOS=${product_repos}" | ||||||||||||||||
| echo "TOOLING_REPOS=${tooling_repos}" | ||||||||||||||||
| echo "PRODUCT_MIN=${product_min}" | ||||||||||||||||
| echo "TOOLING_MAX=${tooling_max}" | ||||||||||||||||
| echo "PRODUCT_RESERVATION_PCT=${PRODUCT_RESERVATION_PCT}" | ||||||||||||||||
| } >"$alloc_file" | ||||||||||||||||
|
|
||||||||||||||||
| echo "[pulse-wrapper] Priority allocations: product_min=${product_min}, tooling_max=${tooling_max} (${product_repos} product, ${tooling_repos} tooling repos, ${max_workers} total slots)" >>"$LOGFILE" | ||||||||||||||||
| return 0 | ||||||||||||||||
| } | ||||||||||||||||
|
|
||||||||||||||||
| # Only run main when executed directly, not when sourced. | ||||||||||||||||
| # The pulse agent sources this file to access helper functions | ||||||||||||||||
| # (check_external_contributor_pr, check_permission_failure_pr) | ||||||||||||||||
|
|
||||||||||||||||
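The reservation arithmetic in `calculate_priority_allocations()` can be tried in isolation. The following is a standalone restatement for experimentation, assuming the same inputs the function derives from `pulse-max-workers` and repos.json; `compute_alloc` is a hypothetical name, and the five calls correspond to the cases named in the acceptance criteria.

```shell
#!/usr/bin/env bash
# Hypothetical standalone restatement of the reservation arithmetic.
# Prints "product_min tooling_max".
#   $1 max_workers  $2 reservation percent  $3 product repos  $4 tooling repos
compute_alloc() {
	local max_workers="$1" pct="$2" product_repos="$3" tooling_repos="$4"
	local product_min tooling_max

	if [[ "$product_repos" -eq 0 ]]; then
		# No product repos: nothing to reserve.
		product_min=0
	elif [[ "$tooling_repos" -eq 0 ]]; then
		# No tooling repos: every slot belongs to product.
		product_min="$max_workers"
	else
		# Integer ceiling: ceil(a / 100) == (a + 99) / 100
		product_min=$(((max_workers * pct + 99) / 100))
		if [[ "$product_min" -gt "$max_workers" ]]; then
			product_min="$max_workers"
		fi
		# Leave one slot for tooling when the pool has more than one slot.
		if [[ "$max_workers" -gt 1 && "$product_min" -ge "$max_workers" ]]; then
			product_min=$((max_workers - 1))
		fi
	fi
	tooling_max=$((max_workers - product_min))
	echo "$product_min $tooling_max"
}

compute_alloc 8 60 4 4   # normal case       -> 5 3
compute_alloc 2 60 4 4   # small pool        -> 1 1
compute_alloc 1 60 4 4   # single worker     -> 1 0
compute_alloc 4 60 4 0   # no tooling repos  -> 4 0
compute_alloc 4 60 0 4   # no product repos  -> 0 4
```

Note the small-pool case: with 2 slots, the ceiling gives 2, and the "at least 1 slot for tooling" adjustment then lowers the product floor to 1, which is 50% of the pool rather than the configured 60%.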
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,34 @@ | ||
| # t1423: Priority-class worker reservations for per-repo concurrency fairness | ||
|
|
||
| ## Session Origin | ||
|
|
||
| Interactive session, 2026-03-09. User asked whether workers still share concurrency across all repos.json. Confirmed yes — global pool with no per-class partitioning. User chose option 2 (priority-class reservations) over per-repo min/max or status quo. | ||
|
|
||
| ## What | ||
|
|
||
| Add priority-class worker slot reservations to the pulse supervisor. Product repos (`"priority": "product"` in repos.json) get a guaranteed minimum share of worker slots (default 60%). Tooling repos get the remainder. Soft reservation — when one class has no pending work, the other can use freed slots. | ||
|
|
||
| ## Why | ||
|
|
||
| Without reservations, tooling hygiene work (quality-debt, simplification-debt, CI fixes) can consume all worker slots before product repos' new features get dispatched. The existing priority order in pulse.md (item 5: "product over tooling") is LLM guidance, not enforcement — a busy tooling repo with many failing-CI PRs (priority 1-2) consumes all slots before product repos' lower-priority issues get a chance. | ||
|
|
||
| ## How | ||
|
|
||
| 1. **pulse-wrapper.sh**: Add `PRODUCT_RESERVATION_PCT` config (default 60%), `calculate_priority_allocations()` function that reads repos.json, counts product vs tooling repos, computes `PRODUCT_MIN` and `TOOLING_MAX`, writes to `~/.aidevops/logs/pulse-priority-allocations`. | ||
| 2. **pulse-wrapper.sh**: Add `_append_priority_allocations()` to format allocation data for the STATE_FILE. | ||
| 3. **pulse.md**: Update Step 1 to read allocation file and enforce class reservations before dispatch. Update priority order item 5 to reference enforcement. Update Hard Rule 6. | ||
|
|
||
| ## Acceptance Criteria | ||
|
|
||
| - [ ] `calculate_priority_allocations()` correctly computes allocations for: normal case, small pool, 1 worker, no tooling, no product repos | ||
| - [ ] Allocation data appears in pulse state file | ||
| - [ ] pulse.md Step 1 includes class enforcement guidance | ||
| - [ ] ShellCheck clean (SC1091 only) | ||
| - [ ] All existing pulse-wrapper tests still pass | ||
|
|
||
| ## Context | ||
|
|
||
| - 8 pulse-enabled repos: 4 product (cloudron-netbird-app, turbostarter-plus, awardsapp, essentials.com), 4 tooling (aidevops, aidevops.sh, quickfile-mcp, aidevops-cloudron-app) | ||
| - Current MAX_WORKERS is RAM-based: `(free_mb - 8GB) / 1GB`, capped at 8 | ||
| - DAILY_PR_CAP=5 per repo already prevents PR flood, but doesn't prevent worker slot starvation | ||
| - Quality-debt cap (30%) and simplification-debt cap (10%) are global against MAX_WORKERS |
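For orientation, the RAM-based formula quoted in the Context list can be written out as follows. This is only a sketch of the formula as stated in the brief, not the actual `calculate_max_workers()` body, which is not shown in this diff. `max_workers_for` is a hypothetical name, the defaults mirror `RAM_RESERVE_MB` (8192) and `MAX_WORKERS_CAP` (8), and the floor of 1 worker is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of "(free_mb - 8GB) / 1GB, capped at 8".
#   $1 free_mb     free RAM in MB
#   $2 reserve_mb  RAM kept for the OS (default 8192, as RAM_RESERVE_MB)
#   $3 cap         hard ceiling (default 8, as MAX_WORKERS_CAP)
max_workers_for() {
	local free_mb="$1" reserve_mb="${2:-8192}" cap="${3:-8}"
	local n=$(((free_mb - reserve_mb) / 1024))

	# ASSUMPTION: keep at least one worker; the real floor is not shown here.
	if [[ "$n" -lt 1 ]]; then n=1; fi
	# Apply the hard ceiling regardless of available RAM.
	if [[ "$n" -gt "$cap" ]]; then n="$cap"; fi
	echo "$n"
}

max_workers_for 12288   # (12288 - 8192) / 1024 = 4
max_workers_for 32768   # 24 before the cap, 8 after
```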
Review comment: In alignment with repository guidelines to avoid blanket error suppression, it's better to remove `2>/dev/null`. This change will make potential issues like file-not-found or permission errors visible in the logs, which is valuable for debugging, while the `||` construct will still correctly handle default values when the file is missing or a key isn't found.
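One caveat on relying on `||` for defaults: in a pipeline such as `grep … | cut -d= -f2 || echo 0`, the exit status is that of `cut`, which succeeds even on empty input. Unless `pipefail` is enabled, the fallback therefore does not run when the file or key is missing, and the variable ends up empty. The sketch below applies the default explicitly instead. `read_alloc` is a hypothetical helper, not part of the PR, and it assumes allocation values are non-negative integers, as in the file written by `calculate_priority_allocations()`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR).
# Usage: read_alloc KEY DEFAULT [FILE]
read_alloc() {
	local key="$1" default="$2"
	local file="${3:-${HOME}/.aidevops/logs/pulse-priority-allocations}"
	local value=""

	if [[ -r "$file" ]]; then
		# Take the first matching line and strip the "KEY=" prefix.
		value=$(grep "^${key}=" "$file" | head -n 1 | cut -d= -f2) || value=""
	fi
	# Fall back when the key is missing or the value is not an integer.
	[[ "$value" =~ ^[0-9]+$ ]] || value="$default"
	echo "$value"
}

PRODUCT_MIN=$(read_alloc PRODUCT_MIN 0)
TOOLING_MAX=$(read_alloc TOOLING_MAX "${MAX_WORKERS:-4}")
```

Because the file is tested with `-r` before `grep` runs, no `2>/dev/null` is needed for the missing-file case, and other errors remain visible in the logs.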