fix: bypass rate-limit-only Gate cancellations - proceed with work #702

Merged: stranske merged 7 commits into main from fix/bypass-rate-limit-gate on Jan 9, 2026

@stranske
Owner

@stranske stranske commented Jan 9, 2026

Source: Issue #696

Automated Status Summary

Scope

Part of Phase 3 workflow rollout validation per langchain-post-code-rollout.md

Context for Agent

Design Decisions & Constraints

  • Create a decomposable issue with labels/milestone (The agent cannot modify repository settings, which may include labels and milestones. | Provide a predefined issue template with labels/milestones that can be used.)
  • DCPT03 preserves metadata on children (DCPT03 | List specific metadata that should be preserved.)
  • Bullets used for non-tasks in the 'Acceptance Criteria' section should be formatted as checkboxes.
  • The issue is generally well-structured but requires more clarity in tasks and acceptance criteria. Additionally, some tasks may be blocked due to agent limitations.

Related Issues/PRs

References

Blockers & Dependencies

  • The issue is generally well-structured but requires more clarity in tasks and acceptance criteria. Additionally, some tasks may be blocked due to agent limitations.

Tasks

  • Create a large issue with 8+ subtasks (e.g., 'Build user dashboard with auth, profile, settings, notifications, themes, export, import, admin') in the test repo.
  • Create an atomic issue (e.g., 'Fix null check in parser') in the test repo.
  • Create a decomposable issue with labels/milestone in the test repo.

Acceptance criteria

  • DCPT01 creates linked child issues.
  • DCPT02 leaves atomic issues alone.
  • DCPT03 preserves metadata on children.

Test Repo

  • Run tests in Manager-Database or another consumer repo.

Head SHA: 9c87503
Latest Runs: ✅ success — Gate
Required: gate: ✅ success

| Workflow / Job | Result | Logs |
| Agents PR meta manager | ❔ in progress | View run |
| CI Autofix Loop | ✅ success | View run |
| Gate | ✅ success | View run |
| Health 40 Sweep | ✅ success | View run |
| Health 44 Gate Branch Protection | ✅ success | View run |
| Health 45 Agents Guard | ✅ success | View run |
| Health 50 Security Scan | ✅ success | View run |
| Keepalive E2E | ❔ startup failure | View run |
| Maint 52 Validate Workflows | ✅ success | View run |
| PR 11 - Minimal invariant CI | ✅ success | View run |
| Selftest CI | ✅ success | View run |
| Validate Sync Manifest | ✅ success | View run |

…ately

Rate limits are infrastructure noise, not code quality issues. When Gate
is cancelled only due to API rate limits (not actual test failures),
the keepalive loop should proceed with work immediately rather than
deferring or waiting.

This change:
- Detects when Gate cancellation was due to rate limits only
- Immediately continues with 'run' action instead of 'defer'
- Sets reason as 'bypass-rate-limit-gate' for tracking
- Preserves the defer fallback only for non-rate-limit cancellations
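
A minimal sketch of the decision these bullets describe, under stated assumptions: the function name `decideOnCancelledGate` and its input shape are hypothetical, and the `'gate-cancelled'` fallback reason is a placeholder. Only the `'run'` and `'defer'` actions and the `'bypass-rate-limit-gate'` reason come from this commit message.

```javascript
// Hypothetical sketch of the keepalive decision for a cancelled Gate run.
// Not the repository's implementation; names are illustrative only.
function decideOnCancelledGate({ gateRateLimit, tasksRemaining }) {
  if (gateRateLimit && tasksRemaining) {
    // Cancellation was caused by rate limits only: keep working.
    return { action: 'run', reason: 'bypass-rate-limit-gate' };
  }
  // Any other cancellation keeps the defer fallback.
  // 'gate-cancelled' is a placeholder reason, not taken from the source.
  return { action: 'defer', reason: 'gate-cancelled' };
}

// Rate-limit-only cancellation with open tasks proceeds immediately.
console.log(decideOnCancelledGate({ gateRateLimit: true, tasksRemaining: true }));
// A cancellation for any other cause still defers.
console.log(decideOnCancelledGate({ gateRateLimit: false, tasksRemaining: true }));
```

Returning a small `{ action, reason }` object keeps the tracking reason next to the action, which is how the commit describes the two values being set together.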

This prevents PRs from getting stuck in 'defer' state waiting for
scheduled retry workflows when the underlying issue is just
temporary rate limiting from GitHub APIs.

Affected PRs (examples):
- #696, #698, #699 were stuck with 'gate-cancelled-rate-limit-transient'
Copilot AI review requested due to automatic review settings January 9, 2026 16:08
@github-actions
Contributor

github-actions bot commented Jan 9, 2026

Automated Status Summary

Head SHA: bc44e0e
Latest Runs: ⏳ pending — Gate
Required contexts: Gate / gate, Health 45 Agents Guard / Enforce agents workflow protections
Required: core tests (3.11): ⏳ pending, core tests (3.12): ⏳ pending, docker smoke: ⏳ pending, gate: ⏳ pending

| Workflow / Job | Result | Logs |
| (no jobs reported) | ⏳ pending | |

Coverage Overview

  • Coverage history entries: 1

Coverage Trend

| Metric | Value |
| Current | 92.21% |
| Baseline | 85.00% |
| Delta | +7.21% |
| Minimum | 70.00% |
| Status | ✅ Pass |

Top Coverage Hotspots (lowest coverage)

| File | Coverage | Missing |
| scripts/workflow_health_check.py | 62.6% | 28 |
| scripts/classify_test_failures.py | 62.9% | 37 |
| scripts/ledger_validate.py | 65.3% | 63 |
| scripts/mypy_return_autofix.py | 82.6% | 11 |
| scripts/ledger_migrate_base.py | 85.5% | 13 |
| scripts/fix_cosmetic_aggregate.py | 92.3% | 1 |
| scripts/coverage_history_append.py | 92.8% | 2 |
| scripts/workflow_validator.py | 93.3% | 4 |
| scripts/update_autofix_expectations.py | 93.9% | 1 |
| scripts/pr_metrics_tracker.py | 95.7% | 3 |
| scripts/generate_residual_trend.py | 96.6% | 1 |
| scripts/build_autofix_pr_comment.py | 97.0% | 2 |
| scripts/aggregate_agent_metrics.py | 97.2% | 0 |
| scripts/fix_numpy_asserts.py | 98.1% | 0 |
| scripts/sync_test_dependencies.py | 98.3% | 1 |

Updated automatically; will refresh on subsequent CI/Docker completions.


Keepalive checklist

Scope

Part of Phase 3 workflow rollout validation per langchain-post-code-rollout.md

Context for Agent

Design Decisions & Constraints

  • Create a decomposable issue with labels/milestone (The agent cannot modify repository settings, which may include labels and milestones. | Provide a predefined issue template with labels/milestones that can be used.)
  • DCPT03 preserves metadata on children (DCPT03 | List specific metadata that should be preserved.)
  • Bullets used for non-tasks in the 'Acceptance Criteria' section should be formatted as checkboxes.
  • The issue is generally well-structured but requires more clarity in tasks and acceptance criteria. Additionally, some tasks may be blocked due to agent limitations.

Related Issues/PRs

References

Blockers & Dependencies

  • The issue is generally well-structured but requires more clarity in tasks and acceptance criteria. Additionally, some tasks may be blocked due to agent limitations.

Tasks

  • Create a large issue with 8+ subtasks (e.g., 'Build user dashboard with auth, profile, settings, notifications, themes, export, import, admin') in the test repo.
  • Create an atomic issue (e.g., 'Fix null check in parser') in the test repo.
  • Create a decomposable issue with labels/milestone in the test repo.

Acceptance criteria

  • DCPT01 creates linked child issues.
  • DCPT02 leaves atomic issues alone.
  • DCPT03 preserves metadata on children.

Test Repo

  • Run tests in Manager-Database or another consumer repo.

@github-actions
Contributor

github-actions bot commented Jan 9, 2026

🤖 Keepalive Loop Status

PR #702 | Agent: Codex | Iteration 0/5

Current State

| Metric | Value |
| Iteration progress | [----------] 0/5 |
| Action | wait (missing-agent-label) |
| Disposition | skipped (transient) |
| Gate | success |
| Tasks | 17/17 complete |
| Keepalive | ❌ disabled |
| Autofix | ❌ disabled |

🔍 Failure Classification

| Error type | infrastructure |
| Error category | resource |
| Suggested recovery | Confirm the referenced resource exists (repo, PR, branch, workflow, or file). |

Contributor

Copilot AI left a comment

Pull request overview

This PR modifies the keepalive loop to automatically bypass Gate workflow cancellations that are caused solely by GitHub API rate limits, allowing work to continue immediately instead of deferring to a scheduled retry. The change treats rate limits as temporary infrastructure constraints rather than code quality issues.

  • Adds automatic bypass logic for rate-limit-only Gate cancellations
  • Introduces new action reason bypass-rate-limit-gate for tracking
  • Updates decision flow to prioritize rate limit bypass over the existing forceRetry mechanism

Comment on lines +1134 to 1141
if (gateRateLimit && tasksRemaining) {
  action = 'run';
  reason = 'bypass-rate-limit-gate';
  if (core) core.info('Gate cancelled due to rate limits only - proceeding with work (rate limits are not code quality issues)');
} else if (forceRetry && tasksRemaining) {
  action = 'run';
  reason = 'force-retry-cancelled';
  if (core) core.info(`Force retry enabled: bypassing cancelled gate (rate_limit=${gateRateLimit})`);

Copilot AI Jan 9, 2026

The new rate limit bypass logic takes precedence over the forceRetry flag. When both gateRateLimit and forceRetry are true, this code will return action='run' with reason='bypass-rate-limit-gate' instead of reason='force-retry-cancelled'. This changes the behavior for the existing test case at lines 493-517 in keepalive-loop.test.js, which expects reason='force-retry-cancelled' when forceRetry is enabled with a rate-limited cancellation.

Consider checking forceRetry first (swap the order of these two conditions) to preserve the existing forceRetry behavior and maintain backward compatibility with the existing test expectations.

Suggested change
- if (gateRateLimit && tasksRemaining) {
-   action = 'run';
-   reason = 'bypass-rate-limit-gate';
-   if (core) core.info('Gate cancelled due to rate limits only - proceeding with work (rate limits are not code quality issues)');
- } else if (forceRetry && tasksRemaining) {
-   action = 'run';
-   reason = 'force-retry-cancelled';
-   if (core) core.info(`Force retry enabled: bypassing cancelled gate (rate_limit=${gateRateLimit})`);
+ if (forceRetry && tasksRemaining) {
+   action = 'run';
+   reason = 'force-retry-cancelled';
+   if (core) core.info(`Force retry enabled: bypassing cancelled gate (rate_limit=${gateRateLimit})`);
+ } else if (gateRateLimit && tasksRemaining) {
+   action = 'run';
+   reason = 'bypass-rate-limit-gate';
+   if (core) core.info('Gate cancelled due to rate limits only - proceeding with work (rate limits are not code quality issues)');

Comment on lines +1134 to +1137
if (gateRateLimit && tasksRemaining) {
  action = 'run';
  reason = 'bypass-rate-limit-gate';
  if (core) core.info('Gate cancelled due to rate limits only - proceeding with work (rate limits are not code quality issues)');

Copilot AI Jan 9, 2026

This change will cause existing tests to fail. The tests at lines 423-469 in keepalive-loop.test.js expect action='defer' and reason='gate-cancelled-rate-limit' when rate limit cancellations are detected. With this new code, when tasksRemaining is true, the action will be 'run' and reason will be 'bypass-rate-limit-gate' instead. The tests need to be updated to reflect this new behavior, or new tests should be added to verify the bypass logic works as intended.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…704)

The reusable CI workflow had a bug where it assumed dev tools (black,
ruff, mypy, pytest, etc.) were included in consumer repos' lock files.
This caused CI failures with 'black: command not found' errors.

Root cause: When has_lock_file=true, the workflow only recorded tools
as 'from lock' for reporting but didn't actually install them. Consumer
repos' lock files only contain runtime dependencies, not dev tools.

This fix:
- Always installs dev tools (black, ruff, mypy, pytest, etc.)
- Removes the has_lock_file conditional for tool installation
- Lock files still work for runtime dependencies
- Affects all 4 CI jobs: lint-format, lint-ruff, typecheck-mypy, tests

Impact: Fixes CI failures in Travel-Plan-Permission, Template,
trip-planner, Collab-Admin and all other consumer repos with lock files.

Tests now expect action='run' with reason='bypass-rate-limit-gate' instead
of action='defer' with reason='gate-cancelled-rate-limit'.

Rate limits are infrastructure noise, not code quality issues. Work
should proceed automatically when Gate cancellation is due to rate limits.

Rate limit bypass takes precedence over forceRetry since:
1. Rate limit bypass is automatic infrastructure handling
2. forceRetry is still honored for non-rate-limit cases (cancelled, failed)
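
The ordering chosen here can be illustrated with a small sketch. The variable names and the two `'run'` reason strings mirror the reviewed snippet; the wrapper function `resolveCancelledGate` and the `'gate-cancelled'` fallback reason are assumptions made for illustration, not code from the repository.

```javascript
// Hypothetical sketch: the rate-limit check runs before the forceRetry check.
function resolveCancelledGate({ gateRateLimit, forceRetry, tasksRemaining }) {
  if (gateRateLimit && tasksRemaining) {
    // Automatic infrastructure handling wins when both flags are set.
    return { action: 'run', reason: 'bypass-rate-limit-gate' };
  }
  if (forceRetry && tasksRemaining) {
    // forceRetry still applies when the cancellation was not rate-limit-only.
    return { action: 'run', reason: 'force-retry-cancelled' };
  }
  // Placeholder fallback reason, not taken from the source.
  return { action: 'defer', reason: 'gate-cancelled' };
}

// Print the outcome for every flag combination while tasks remain.
for (const gateRateLimit of [true, false]) {
  for (const forceRetry of [true, false]) {
    const { action, reason } = resolveCancelledGate({ gateRateLimit, forceRetry, tasksRemaining: true });
    console.log(`rateLimit=${gateRateLimit} forceRetry=${forceRetry} -> ${action} (${reason})`);
  }
}
```

With both flags set, the result carries `'bypass-rate-limit-gate'`, which is why a test that previously expected `'force-retry-cancelled'` for a rate-limited cancellation needs its expectation updated.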
@stranske stranske temporarily deployed to agent-high-privilege January 9, 2026 17:19 — with GitHub Actions Inactive
Aligns with JS test updates - rate limits are infrastructure noise
that should be bypassed immediately rather than causing deferrals.
@stranske stranske temporarily deployed to agent-high-privilege January 9, 2026 17:23 — with GitHub Actions Inactive
@github-actions github-actions bot added the autofix label (Opt-in automated formatting & lint remediation) Jan 9, 2026
@github-actions
Contributor

github-actions bot commented Jan 9, 2026

Status | ✅ no new diagnostics
History points | 1
Timestamp | 2026-01-09 17:24:04 UTC
Report artifact | autofix-report-pr-702
Remaining | 0
New | 0
No additional artifacts

@stranske stranske merged commit 374ce2f into main Jan 9, 2026
128 checks passed
@stranske stranske deleted the fix/bypass-rate-limit-gate branch January 9, 2026 17:25