
t1449: reallocate pulse slots when product repos are capped #4186

Merged
marcusquinn merged 1 commit into main from
bugfix/t1449-fill-to-cap-redistribution
Mar 12, 2026
Conversation

@marcusquinn
Owner

@marcusquinn marcusquinn commented Mar 12, 2026

Summary

  • make pulse reservations cap-aware by counting dispatchable product repos (excluding daily-PR-capped repos) before computing product/tooling slot targets
  • update pulse allocation guidance to enforce soft reservations and explicit no-idle behavior when runnable scoped work exists
  • extend pulse worker policy with a coaching-before-kill step and a fill-to-cap post-condition each cycle

Why

Pulse was underutilizing concurrency when product repos hit PR caps, leaving slots idle despite large system/tooling backlog. This change removes that blocker and requires redistribution to keep workers productive.
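
A minimal sketch of the redistribution described above, assuming a percentage-based product reservation. The function name, its parameters, and the round-up rule are illustrative assumptions, not taken from pulse-wrapper.sh; only the idea that the product reservation drops to zero when no product repo is dispatchable comes from this PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split worker slots between product and tooling work.
# Usage: compute_slot_targets MAX_WORKERS PRODUCT_MIN_PCT DISPATCHABLE_PRODUCT_REPOS
# Prints "product=<n> tooling=<n>".
compute_slot_targets() {
    local max_workers="$1" product_min_pct="$2" dispatchable="$3"
    local product=0 tooling

    if (( dispatchable > 0 )); then
        # Reserve a share for product work, rounded up (assumed rule) so a
        # non-zero percentage yields at least one slot.
        product=$(( (max_workers * product_min_pct + 99) / 100 ))
        if (( product > max_workers )); then
            product=$max_workers
        fi
    fi

    # Anything not reserved for product is available to tooling, so no slot
    # stays idle when every product repo is capped.
    tooling=$(( max_workers - product ))
    printf 'product=%d tooling=%d\n' "$product" "$tooling"
}

compute_slot_targets 6 60 3   # prints: product=4 tooling=2
compute_slot_targets 6 60 0   # prints: product=0 tooling=6
```

The second call shows the case this PR targets: with zero dispatchable product repos, all six slots go to tooling work instead of sitting idle.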

Closes #4185

Summary by CodeRabbit

  • Documentation

    • Added coaching intervention guidance for handling stuck workers before termination.
    • Clarified product and tooling capacity reservation policies.
  • Improvements

    • Extended idle worker detection timeout to reduce false positives during triage.
    • Improved work allocation between product and tooling tasks with enhanced dispatchability tracking.

@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves the Pulse system's resource utilization and worker management. It addresses a key inefficiency where worker slots remained idle when product repositories hit their daily PR caps, despite available system or tooling work. By making reservation calculations dynamic based on actual dispatchability and implementing a more flexible allocation strategy, Pulse can now intelligently reallocate resources, ensuring workers remain productive and the overall backlog is processed more efficiently. The changes also refine worker termination policies to include a coaching phase, aiming to resolve issues before outright killing a worker.

Highlights

  • Cap-Aware Pulse Reservations: Pulse now considers daily PR caps when calculating product repository dispatchability, ensuring slots are not left idle due to artificial limits.
  • Flexible Slot Allocation: Allocation guidance has been updated to treat reservations as soft targets, promoting immediate reallocation of unused slots to other work classes when candidates are available.
  • Enhanced Worker Policy: Introduced a "coaching-before-kill" step for stalled workers and a "fill-to-cap" post-condition to maximize worker utilization by preventing idle slots.

Changelog

  • .agents/scripts/commands/pulse.md
    • Added guidance for a "coaching-before-kill" intervention for stalled workers.
    • Introduced a "fill-to-cap" post-condition to ensure maximum worker utilization.
    • Updated reservation enforcement rules to clarify soft targets and reallocation.
  • .agents/scripts/pulse-wrapper.sh
    • Increased PULSE_IDLE_TIMEOUT from 5 to 10 minutes (300 to 600 seconds) to reduce false positives.
    • Implemented logic to dynamically calculate dispatchable product repositories, accounting for daily PR caps.
    • Modified allocation output to display dispatchable product repository count.
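
As a rough illustration of what the longer timeout changes, here is a hedged sketch of an idle check. Only the name PULSE_IDLE_TIMEOUT and its 600-second value come from this PR; the function, its arguments, and the way activity is timestamped are assumptions for the example.

```shell
#!/usr/bin/env bash
# Default taken from the PR; can still be overridden from the environment.
PULSE_IDLE_TIMEOUT="${PULSE_IDLE_TIMEOUT:-600}"

# Hypothetical helper.
# Usage: worker_is_idle LAST_ACTIVITY_EPOCH NOW_EPOCH
# Succeeds (exit status 0) when the worker has been silent for longer than
# PULSE_IDLE_TIMEOUT seconds.
worker_is_idle() {
    local last_activity="$1" now="$2"
    (( now - last_activity > PULSE_IDLE_TIMEOUT ))
}

now=$(date +%s)
# A worker silent for 400 seconds exceeded the old 300-second limit but is
# still inside the new 600-second window.
if worker_is_idle "$(( now - 400 ))" "$now"; then
    echo "idle"
else
    echo "still within timeout"
fi
```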

@coderabbitai
Contributor

coderabbitai bot commented Mar 12, 2026

Caution

Review failed

Pull request was closed or merged during review

Walkthrough

This PR enhances the pulse dispatch system with smarter worker allocation and improved lifecycle management. It adds coaching-intervention guidance before killing stuck workers, enforces fill-to-cap behavior for active slots, introduces dispatchable product repo tracking accounting for daily PR caps, and increases the idle timeout threshold to reduce false positives.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Pulse Dispatch Guidance (.agents/scripts/commands/pulse.md) | Added coaching-intervention guidance before killing thrashing workers (read transcript, attempt targeted coaching, re-dispatch with narrower scope, only kill if retry fails). Added fill-to-cap post-condition checks to prevent idle slots when capacity remains. Clarified product/tooling slot reservations and reallocation during PR caps. |
| Pulse Wrapper Allocation Logic (.agents/scripts/pulse-wrapper.sh) | Extended PULSE_IDLE_TIMEOUT from 300 to 600 seconds to reduce false positives. Introduced dispatchable_product_repos metric that filters product repos by daily PR cap availability. Updated priority allocation to dynamically shift slots from product to tooling when dispatchable repos are exhausted. Enhanced output to expose dispatchability status and updated DISPATCHABLE_PRODUCT_REPOS state propagation. |
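
The coaching-before-kill order summarized in the table can be sketched as a small decision helper. This is an assumption-laden illustration: the function name and its yes/no arguments are hypothetical, and the actual policy lives as prose guidance in pulse.md rather than as code.

```shell
#!/usr/bin/env bash
# Hypothetical helper, meant to be called only for a worker that is
# currently stuck or thrashing.
# Usage: next_worker_action COACHING_ATTEMPTED REDISPATCH_ATTEMPTED
# Each argument is "yes" or "no". Prints: coach | redispatch | kill
next_worker_action() {
    local coaching_attempted="$1" redispatch_attempted="$2"

    if [[ "$coaching_attempted" != "yes" ]]; then
        # First step: read the transcript and send targeted coaching.
        echo "coach"
    elif [[ "$redispatch_attempted" != "yes" ]]; then
        # Second step: retry the task with a narrower scope.
        echo "redispatch"
    else
        # Last resort: coaching and the narrowed retry both failed.
        echo "kill"
    fi
}

next_worker_action no no     # prints: coach
next_worker_action yes no    # prints: redispatch
next_worker_action yes yes   # prints: kill
```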

Sequence Diagram(s)

sequenceDiagram
    participant Pulse as Pulse Wrapper
    participant PrCap as Daily PR Cap Check
    participant Alloc as Priority Allocator
    participant Dispatch as Worker Dispatcher
    participant Workers as Active Workers

    Pulse->>PrCap: Calculate dispatchable_product_repos<br/>(filter by cap availability)
    PrCap-->>Pulse: dispatchable_count
    
    Pulse->>Alloc: compute_priority_allocations<br/>(with dispatchable_count)
    
    alt dispatchable_product_repos > 0
        Alloc->>Alloc: Allocate slots by product_min ratio
    else dispatchable_product_repos == 0
        Alloc->>Alloc: Shift all slots to tooling<br/>(product_min = 0)
    end
    
    Alloc-->>Pulse: allocation {product, tooling}
    Pulse->>Dispatch: Fill to cap<br/>(dispatch until MAX_WORKERS or no candidates)
    Dispatch->>Workers: Dispatch product/tooling<br/>per allocation
    Workers-->>Dispatch: Checkpoint reached
    Dispatch-->>Pulse: Updated active_workers
    
    alt active_workers < MAX_WORKERS
        Pulse-->>Dispatch: Continue fill-to-cap
    else active_workers == MAX_WORKERS
        Pulse-->>Pulse: Pulse cycle complete
    end
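
The fill-to-cap step at the end of the diagram can be sketched roughly as follows. The function name, argument order, and the echo standing in for a real dispatch are assumptions for illustration; the diagram only establishes that dispatch continues until MAX_WORKERS is reached or no candidates remain.

```shell
#!/usr/bin/env bash
# Hypothetical helper.
# Usage: fill_to_cap MAX_WORKERS ACTIVE_WORKERS CANDIDATE...
# Dispatches candidates in order until the cap is reached or the candidate
# list is exhausted, then reports the resulting active worker count.
fill_to_cap() {
    local max_workers="$1" active="$2"
    shift 2
    local candidate

    for candidate in "$@"; do
        if (( active >= max_workers )); then
            break
        fi
        echo "dispatch $candidate"   # stand-in for the real dispatch call
        active=$(( active + 1 ))
    done

    echo "active=$active"
}

fill_to_cap 4 2 issue-a issue-b issue-c
# prints:
#   dispatch issue-a
#   dispatch issue-b
#   active=4
```

When the candidate list runs out before the cap is reached, the loop simply ends with fewer active workers, which matches the "no candidates" exit in the diagram.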

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly Related PRs

Suggested Labels

enhancement

Poem

🤖 Coaching before we kill, slots we smartly fill
No worker left behind by cap-aware design
When product repos rest, tooling gets the test
Soft-allocation flows where dynamism grows 🚀

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title precisely reflects the main objective: reallocating pulse worker slots when product repositories hit daily PR caps, which is the core focus of the changeset. |
| Linked Issues check | ✅ Passed | The PR implements all coding requirements from #4185: cap-aware soft reservations with slot reallocation, fill-to-cap enforcement, extended idle timeout, and coaching-before-kill worker policy. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to #4185 objectives: pulse wrapper metric calculations, allocation logic, worker policy guidance, and documentation updates; no extraneous modifications detected. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


@github-actions

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 402 code smells

[INFO] Recent monitoring activity:
Thu Mar 12 00:12:50 UTC 2026: Code review monitoring started
Thu Mar 12 00:12:50 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 402

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 402
  • VULNERABILITIES: 0

Generated on: Thu Mar 12 00:12:53 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

@sonarqubecloud

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a significant improvement to the pulse worker allocation logic by making it aware of daily PR caps on product repositories. This allows for more efficient use of available worker slots by reallocating them to tooling/system work when product repos are blocked. The accompanying documentation changes in pulse.md are clear and accurately reflect the new policies. My main suggestion is to refactor the logic for counting daily PRs in pulse-wrapper.sh to be more efficient by using a single gh search prs API call instead of making a separate call for each repository in a loop, ensuring all relevant PRs are retrieved by using the --paginate flag.

Comment on lines +1891 to +1904
if [[ "$product_repos" -gt 0 && "$DAILY_PR_CAP" -gt 0 ]]; then
    while IFS= read -r slug; do
        [[ -n "$slug" ]] || continue
        local pr_json daily_pr_count
        pr_json=$(gh pr list --repo "$slug" --state open --json createdAt --limit 100 2>/dev/null) || pr_json="[]"
        daily_pr_count=$(echo "$pr_json" | jq --arg today "$today_utc" '[.[] | select(.createdAt | startswith($today))] | length' 2>/dev/null) || daily_pr_count=0
        [[ "$daily_pr_count" =~ ^[0-9]+$ ]] || daily_pr_count=0
        if [[ "$daily_pr_count" -lt "$DAILY_PR_CAP" ]]; then
            dispatchable_product_repos=$((dispatchable_product_repos + 1))
        fi
    done < <(jq -r '.initialized_repos[] | select(.pulse == true and (.local_only // false) == false and .slug != "" and .priority == "product") | .slug' "$repos_json" 2>/dev/null)
else
    dispatchable_product_repos="$product_repos"
fi


medium

This loop makes a gh pr list API call for every product repository, which can be inefficient and slow if there are many repositories. You can significantly improve performance by fetching the PR counts for all relevant repositories in a single API call using gh search prs. Additionally, to ensure all relevant PRs are retrieved, the --paginate flag should be used as per repository guidelines for gh commands.

Here's a suggested refactoring that first collects all product repo slugs, then uses gh search prs with --paginate to get all open PRs created today for those repos, and finally processes the results locally to count dispatchable repos. This reduces N API calls to just one and ensures completeness.

if [[ "$product_repos" -gt 0 && "$DAILY_PR_CAP" -gt 0 ]]; then
    local product_repo_slugs_str
    product_repo_slugs_str=$(jq -r '.initialized_repos[] | select(.pulse == true and (.local_only // false) == false and .slug != "" and .priority == "product") | .slug' "$repos_json" 2>/dev/null)

    if [[ -n "$product_repo_slugs_str" ]]; then
        local search_args=()
        while IFS= read -r slug; do
            [[ -n "$slug" ]] && search_args+=(--repo "$slug")
        done <<< "$product_repo_slugs_str"

        # Get daily PR counts for all product repos in a single, more efficient API call
        local pr_counts_json
        pr_counts_json=$(gh search prs --created ">=$today_utc" --state open "${search_args[@]}" --json repo --paginate | jq 'group_by(.repo.nameWithOwner) | map({(.[0].repo.nameWithOwner): length}) | add' 2>/dev/null) || pr_counts_json="{}"

        # Count dispatchable repos by checking the fetched counts
        while IFS= read -r slug; do
            [[ -n "$slug" ]] || continue
            local daily_pr_count
            daily_pr_count=$(echo "$pr_counts_json" | jq -r --arg slug "$slug" '.[$slug] // 0')
            [[ "$daily_pr_count" =~ ^[0-9]+$ ]] || daily_pr_count=0
            if [[ "$daily_pr_count" -lt "$DAILY_PR_CAP" ]]; then
                dispatchable_product_repos=$((dispatchable_product_repos + 1))
            fi
        done <<< "$product_repo_slugs_str"
    fi
else
    dispatchable_product_repos="$product_repos"
fi
References
  1. When fetching a list of items from the GitHub API with the gh command, use the --paginate flag to ensure all items are retrieved, not just the first page.
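
The local counting step of this suggestion can be sketched without any API call. The example below assumes the PR list has already been fetched and reduced to one repo slug per line; the function name and the sample slugs are hypothetical, and it needs bash 4 or later for associative arrays. How the slugs are fetched is left out on purpose, since the exact gh search prs flags and JSON field names are worth checking against the CLI's own help before relying on them.

```shell
#!/usr/bin/env bash
# Hypothetical helper.
# Usage: count_dispatchable_from_prs CAP REPO_SLUG... < slugs_of_todays_prs
# stdin: one repo slug per PR created today (one line per PR).
# Prints how many of the listed repos are still below the daily PR cap.
count_dispatchable_from_prs() {
    local cap="$1"
    shift
    local -A pr_counts=()
    local slug dispatchable=0

    # Tally PRs per repo in a single pass over the fetched list.
    while IFS= read -r slug; do
        [[ -n "$slug" ]] || continue
        pr_counts["$slug"]=$(( ${pr_counts["$slug"]:-0} + 1 ))
    done

    # A repo with no PRs today has no entry and counts as zero.
    for slug in "$@"; do
        if (( ${pr_counts["$slug"]:-0} < cap )); then
            dispatchable=$(( dispatchable + 1 ))
        fi
    done

    printf '%d\n' "$dispatchable"
}

# org/app already has 2 PRs today (capped at 2); org/site has 1; org/docs has 0.
printf '%s\n' org/app org/app org/site |
    count_dispatchable_from_prs 2 org/app org/site org/docs   # prints: 2
```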

@marcusquinn marcusquinn merged commit cdcd6af into main Mar 12, 2026
30 of 31 checks passed
@marcusquinn marcusquinn deleted the bugfix/t1449-fill-to-cap-redistribution branch March 12, 2026 00:15
Development

Successfully merging this pull request may close these issues.

t1449: Pulse: enforce fill-to-cap with cap-aware slot redistribution