Skip to content

test: replace test placeholders with real subsystem wiring#1845

Merged
Aureliolo merged 5 commits into
mainfrom
chore/issue-1837-test-placeholders
May 9, 2026
Merged

test: replace test placeholders with real subsystem wiring#1845
Aureliolo merged 5 commits into
mainfrom
chore/issue-1837-test-placeholders

Conversation

@Aureliolo
Copy link
Copy Markdown
Owner

Closes #1837.

Two tests carried pytest.skip / model_copy workarounds with TODO comments. Both now exercise their real subsystems.

What changed

Sub-task A — real LLM provider e2e (tests/e2e/test_single_agent_e2e.py)

TestRealLLMIntegration.test_real_provider_text_completion no longer unconditionally calls pytest.skip("Real LLM provider integration not yet wired"). It now constructs a real LiteLLMDriver with a minimal ProviderConfig and runs AgentEngine.run end-to-end against the configured model.

Env-var gates (skip if any missing, CI-safe by default):

Var Purpose
REAL_LLM_TEST=1 Class-level master switch (preserved)
REAL_LLM_MODEL Model id (e.g. llama3.2:1b, gpt-5-mini)
REAL_LLM_PROVIDER LiteLLM routing key (openai, anthropic, ollama, …)
REAL_LLM_API_KEY Provider API key — required for hosted providers
REAL_LLM_BASE_URL Endpoint URL — required for local providers (Ollama, LM Studio, vLLM)

At least one of REAL_LLM_API_KEY or REAL_LLM_BASE_URL must be set; both unset still skips cleanly. Empty-string env vars are normalised to None so ProviderConfig.NotBlankStr validation accepts them.

Sub-task B — ApprovalStore wiring in meta-cycle test (tests/integration/meta/test_meta_cycle.py)

test_proposal_rollout_succeeds no longer mutates the proposal via proposal.model_copy(update={"status": APPROVED, ...}) in isolation. The approval round-trips through the real ApprovalStore:

  1. run_cycle triggers ApprovalGateGuard.evaluate, which registers an ApprovalItem with deterministic id uuid5(NAMESPACE_URL, "proposal:<id>").
  2. The test fetches that item via approval_store.get(approval_id) and asserts is not None + status == PENDING.
  3. The test approves via approval_store.save_if_pending(...) — same first-writer-wins path as synthorg.api.controllers.approvals._decide and synthorg.meta.mcp.handlers.approvals._decide.
  4. The proposal handed to execute_rollout carries decision metadata (decided_at / decided_by / decision_reason) sourced from the store-resident ApprovalItem, not from a free-standing test mutation.

A regression in the guard (different id derivation, failure to register, store concurrency bug) now surfaces as a failed assert item is not None rather than a silently-bypassed flow. Uses in-memory ApprovalStore() (no persistence wiring) — persistence parity is covered by tests/conformance/persistence/.

Test plan

  • uv run ruff check src/ tests/ --fix — clean
  • uv run ruff format src/ tests/ — clean (3689 unchanged)
  • uv run mypy src/ tests/Success: no issues found in 3689 source files
  • uv run python -m pytest tests/integration/meta/ tests/e2e/ tests/unit/api/ -n 03253 passed, 4 skipped (one xdist scheduler crash on the full unit suite was unrelated worker churn — affected-module pre-push gate stayed clean)
  • Real LLM smoke test ran end-to-end against a local Ollama with llama3.2:1b (REAL_LLM_TEST=1, REAL_LLM_PROVIDER=ollama, REAL_LLM_BASE_URL=http://localhost:11434, no API key) — passed in 31s on the first run, 10s on the warm re-run after a docstring tweak.
  • CI-safe skip path verified: with all env vars unset and just REAL_LLM_TEST=1 set, the test skips with the precise method-level reason instead of running.

Review coverage

  • Code-reviewer pass on the initial two commits returned no findings.
  • Code-reviewer pass on the local-provider extension (commit 3) flagged a misleading docstring claim ("API key wins") — fixed in commit 4 with an accurate description (api_key for auth, base_url as endpoint, both forwarded together).

Commits

  1. 137ff3f — wire real LiteLLMDriver in single-agent e2e test
  2. 5694937 — route meta-cycle proposal approval through real ApprovalStore
  3. 8bcd1b5 — support local LLM providers (api_key optional, REAL_LLM_BASE_URL)
  4. ca6ab9a — clarify auth-config docstring + comment in real-LLM test

Aureliolo added 4 commits May 9, 2026 21:05
Make REAL_LLM_API_KEY optional so the test can target local
providers (Ollama, LM Studio, vLLM) reached via REAL_LLM_BASE_URL.
At least one of the two must be set; both unset still skips
cleanly with a documented reason.
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 9, 2026

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 9, 2026

Review Change Stack
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 776c5367-f059-4021-b102-289c13441d94

📥 Commits

Reviewing files that changed from the base of the PR and between ca6ab9a and e64d1df.

📒 Files selected for processing (2)
  • tests/e2e/test_single_agent_e2e.py
  • tests/integration/meta/test_meta_cycle.py
📜 Recent review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Analyze (python)
  • GitHub Check: Analyze (javascript-typescript)
🧰 Additional context used
📓 Path-based instructions (2)
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Test markers: use @pytest.mark.{unit,integration,e2e,slow}. Async tests use auto. Global timeout 30s. Coverage 80% minimum.
xdist: use -n 8 --dist=loadfile (auto-applied via pyproject addopts; loadfile prevents 3.14+ Windows ProactorEventLoop leak).
Windows: unit tests use WindowsSelectorEventLoopPolicy (3.14 IOCP teardown race). Subprocess tests override back.
Mock-spec: every Mock declares spec=ConcreteClass; baseline at scripts/mock_spec_baseline.txt.
FakeClock: import from tests._shared.fake_clock; inject via clock= parameter.
Hypothesis: 10 deterministic CI examples; failures are real bugs (fix + add @example(...)). Never skip/xfail for flaky tests; fix fundamentally.
Use asyncio.Event().wait() instead of sleep(large) to avoid flaky tests.

Files:

  • tests/integration/meta/test_meta_cycle.py
  • tests/e2e/test_single_agent_e2e.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/integration/meta/test_meta_cycle.py
  • tests/e2e/test_single_agent_e2e.py
{src,tests}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Vendor-agnostic naming: NEVER use real vendor names in project code/tests. Use example-provider, test-provider, example-{large,medium,small}-001. Allowed only in .claude/, third-party imports, providers/presets.py, web/public/provider-logos/.

Files:

  • tests/integration/meta/test_meta_cycle.py
  • tests/e2e/test_single_agent_e2e.py
🧠 Learnings (1)
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

  • tests/integration/meta/test_meta_cycle.py
  • tests/e2e/test_single_agent_e2e.py
🔇 Additional comments (6)
tests/e2e/test_single_agent_e2e.py (3)

17-22: LGTM!

Import additions are appropriate for constructing the real provider configuration and driver.


386-392: LGTM!

Docstring correctly uses vendor-agnostic phrasing ("configured provider driver", "local / self-hosted providers"), addressing the prior review feedback.


395-461: LGTM!

The test implementation correctly:

  • Normalizes empty-string env vars to None (lines 412-413).
  • Requires at least one of api_key or base_url (lines 414-418).
  • Constructs a minimal ProviderConfig and LiteLLMDriver.
  • Uses defensive assertions (>= 0 for cost to accommodate zero-cost local providers).
  • Ties cost record count to actual turns completed.

The vendor-agnostic wording in the docstring and skip messages is compliant with coding guidelines.

tests/integration/meta/test_meta_cycle.py (3)

7-8: LGTM!

Import additions support deterministic approval ID derivation (uuid5, NAMESPACE_URL) and typed approval handling (ApprovalStatus, NotBlankStr).

Also applies to: 12-13


164-195: LGTM!

The approval flow correctly:

  • Uses clock.now() from the shared FakeClock for deterministic timestamps.
  • Exercises save_if_pending (the same first-writer-wins path as API/MCP handlers).
  • Sources decision metadata from the store-resident ApprovalItem for the proposal.
  • Clearly documents why the proposal model_copy is mechanical (no adapter exists).

This ensures regressions in the store's concurrency model or approval registration surface as explicit test failures.


150-156: The uuid5 derivation pattern is correct and matches the guard implementation.

The test's uuid5(NAMESPACE_URL, f"proposal:{proposal.id}") is equivalent to the guard's uuid5(NAMESPACE_URL, f"proposal:{str(proposal.id)}") (guard line 101 converts proposal.id to string). Both produce identical hashes, so the test will catch any regression in the guard's approval ID generation as intended.


Walkthrough

Two integration test placeholders are converted to real subsystem flows. The e2e test's real LLM provider integration now reads provider configuration from environment variables, instantiates LiteLLMDriver, runs AgentEngine, and validates cost tracking and termination behavior. The meta-cycle test's approval flow is rewired to use the real ApprovalStore: it computes a deterministic approval id, fetches the pending approval item, approves it via save_if_pending, and sources the decision metadata from the stored approval before executing the rollout, eliminating the prior direct proposal mutation approach.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the primary change: replacing test placeholders with real subsystem wiring, which matches the changeset's core objective.
Description check ✅ Passed The description is comprehensive and directly related to the changeset, detailing the two sub-tasks, environment variables, test execution results, and commit structure.
Linked Issues check ✅ Passed All objectives from issue #1837 are met: the real-LLM provider e2e test now exercises the actual provider integration path with environment-variable gating [#1837], and the meta-cycle test routes approval through the real ApprovalStore instead of isolated model_copy mutation [#1837].
Out of Scope Changes check ✅ Passed All changes are scoped to the two target test files and directly address the removal of placeholders and TODO comments specified in issue #1837; no extraneous changes detected.
Docstring Coverage ✅ Passed Docstring coverage is 80.00% which is sufficient. The required threshold is 40.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements end-to-end smoke tests for real LLM providers using LiteLLMDriver and enhances integration tests for the SelfImprovementService by integrating a real ApprovalStore workflow. Feedback focuses on improving test consistency by sharing a FakeClock instance between components to ensure synchronized time-dependent logic and timestamps.

Comment on lines 131 to 140
approval_store = ApprovalStore()
svc = SelfImprovementService(
config=SelfImprovementConfig(
enabled=True,
config_tuning_enabled=True,
),
clock=FakeClock(),
snapshot_builder=snapshot_builder,
approval_store=ApprovalStore(),
approval_store=approval_store,
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The ApprovalStore and SelfImprovementService are initialized with different clock sources. While SelfImprovementService is explicitly given a FakeClock, ApprovalStore defaults to SystemClock. For integration tests using a fake clock, it is safer and more consistent to share the same clock instance across all components that support it. This ensures that any time-dependent logic (like expiration checks in the store) remains synchronized with the test's simulated time.

Suggested change
approval_store = ApprovalStore()
svc = SelfImprovementService(
config=SelfImprovementConfig(
enabled=True,
config_tuning_enabled=True,
),
clock=FakeClock(),
snapshot_builder=snapshot_builder,
approval_store=ApprovalStore(),
approval_store=approval_store,
)
clock = FakeClock()
approval_store = ApprovalStore(clock=clock)
svc = SelfImprovementService(
config=SelfImprovementConfig(
enabled=True,
config_tuning_enabled=True,
),
clock=clock,
snapshot_builder=snapshot_builder,
approval_store=approval_store,
)

# first-writer-wins path the API and MCP approval handlers
# take, so a regression in the store's concurrency model also
# surfaces here.
decided_at = datetime.now(UTC)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Since the test is being updated to use a shared clock variable (per the suggestion for lines 131-140), the decision timestamp should also be sourced from that clock to maintain consistency within the test scenario.

Suggested change
decided_at = datetime.now(UTC)
decided_at = clock.now()

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/e2e/test_single_agent_e2e.py`:
- Around line 386-392: The test docstrings/messages currently use vendor-branded
names ("LiteLLM", "Ollama", "LM Studio", "vLLM"); replace those with
vendor-agnostic phrases such as "configured provider driver", "local provider
endpoint", or "local/self-hosted provider" in the text that explains
REAL_LLM_MODEL, REAL_LLM_PROVIDER, REAL_LLM_API_KEY, and REAL_LLM_BASE_URL and
in the other occurrences near those same docstrings (the nearby strings
referenced in the comment). Ensure the meaning stays the same (auth via api_key
and endpoint via base_url) while removing specific vendor names.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 6e660204-71a0-4911-8075-1b3897b9daa9

📥 Commits

Reviewing files that changed from the base of the PR and between 099b871 and ca6ab9a.

📒 Files selected for processing (2)
  • tests/e2e/test_single_agent_e2e.py
  • tests/integration/meta/test_meta_cycle.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Type Check
  • GitHub Check: Test (Python 3.14)
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Analyze (go)
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (2)
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Test markers: use @pytest.mark.{unit,integration,e2e,slow}. Async tests use auto. Global timeout 30s. Coverage 80% minimum.
xdist: use -n 8 --dist=loadfile (auto-applied via pyproject addopts; loadfile prevents 3.14+ Windows ProactorEventLoop leak).
Windows: unit tests use WindowsSelectorEventLoopPolicy (3.14 IOCP teardown race). Subprocess tests override back.
Mock-spec: every Mock declares spec=ConcreteClass; baseline at scripts/mock_spec_baseline.txt.
FakeClock: import from tests._shared.fake_clock; inject via clock= parameter.
Hypothesis: 10 deterministic CI examples; failures are real bugs (fix + add @example(...)). Never skip/xfail for flaky tests; fix fundamentally.
Use asyncio.Event().wait() instead of sleep(large) to avoid flaky tests.

Files:

  • tests/e2e/test_single_agent_e2e.py
  • tests/integration/meta/test_meta_cycle.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/e2e/test_single_agent_e2e.py
  • tests/integration/meta/test_meta_cycle.py
{src,tests}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Vendor-agnostic naming: NEVER use real vendor names in project code/tests. Use example-provider, test-provider, example-{large,medium,small}-001. Allowed only in .claude/, third-party imports, providers/presets.py, web/public/provider-logos/.

Files:

  • tests/e2e/test_single_agent_e2e.py
  • tests/integration/meta/test_meta_cycle.py
🧠 Learnings (1)
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

  • tests/e2e/test_single_agent_e2e.py
  • tests/integration/meta/test_meta_cycle.py
🔇 Additional comments (1)
tests/integration/meta/test_meta_cycle.py (1)

131-195: Nice integration hardening of the approval path.

This now validates the real ApprovalStore lifecycle (register → fetch pending → save_if_pending → rollout with persisted decision metadata), which gives strong regression signal for guard/store coupling.

Comment thread tests/e2e/test_single_agent_e2e.py Outdated
@codecov
Copy link
Copy Markdown

codecov Bot commented May 9, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 84.78%. Comparing base (099b871) to head (e64d1df).
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1845      +/-   ##
==========================================
- Coverage   84.78%   84.78%   -0.01%     
==========================================
  Files        1800     1800              
  Lines      104864   104864              
  Branches     9202     9202              
==========================================
- Hits        88910    88904       -6     
- Misses      13716    13722       +6     
  Partials     2238     2238              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

- tests/integration/meta/test_meta_cycle.py: share a single FakeClock
  between ApprovalStore and SelfImprovementService and source
  decided_at from clock.now() so time-dependent paths in the store
  (expiration, decision timestamps) stay in lockstep with the
  service's simulated time (Gemini #3213807073, #3213807075).
- tests/e2e/test_single_agent_e2e.py: drop vendor-branded names
  (LiteLLM, Ollama, LM Studio, vLLM) from docstrings and skip
  messages in TestRealLLMIntegration; use vendor-agnostic phrasing
  (configured provider driver, local / self-hosted providers,
  provider routing key) per CLAUDE.md vendor-agnostic naming
  (CodeRabbit #3213808360). The LiteLLMDriver class identifier is
  retained: it is the project-internal wrapper around the
  third-party litellm package, which CLAUDE.md exempts under
  third-party imports.
@Aureliolo Aureliolo merged commit ddbb666 into main May 9, 2026
43 checks passed
@Aureliolo Aureliolo deleted the chore/issue-1837-test-placeholders branch May 9, 2026 20:23
Aureliolo added a commit that referenced this pull request May 9, 2026
## Summary

`scripts/check_setting_to_startup_trace.py` froze 8 ghost-wired
settings: 7 × `backup.*` and `security.timeout_check_interval_seconds`.
Both owning services were either factory-gated to `None` in default
config (`BackupService`) or bootstrapped with a hardcoded interval and
never re-read at runtime (`ApprovalTimeoutScheduler`). The registered
settings had no live consumer at boot.

This branch wires both services through the canonical DB > env > YAML >
default chain and regenerates the baseline to header-only.

## Changes

**Phase 1 — `BackupService`**
- Drop the `if not backup_config.enabled: return None` early return in
`build_backup_service`. The factory always returns a real
`BackupService`. `BackupService.start()` already gates internally on
`self._config.enabled`, so a deployment with `backup.enabled=false`
still avoids scheduler ticks. Adds an INFO log on the dormant path so
operators can confirm the deliberate state from the boot log.

**Phase 2 — `ApprovalTimeoutScheduler`**
- New `_apply_security_timeout_interval` helper in
`lifecycle_helpers.py`, called after `_apply_bridge_config`, resolves
`security.timeout_check_interval_seconds` via `ConfigResolver` and calls
`scheduler.reschedule(...)`.
- New `SecurityTimeoutSettingsSubscriber` (mirrors
`BackupSettingsSubscriber._reschedule`) handles live operator changes
via the dispatcher.
- `_build_settings_dispatcher` + `BuildDispatcherFn` Protocol +
`auto_wire_settings` extended with the new optional
`approval_timeout_scheduler` parameter; scheduler construction in
`app.py` moved earlier so the dispatcher can wire the matching
subscriber against the same instance.

**Phase 3 — Baseline**
- `scripts/setting_to_startup_trace_baseline.txt` regenerated to
header-only (`uv run python scripts/check_setting_to_startup_trace.py
--update-baseline`).
- Smoke test `test_real_repo_violations_match_expected` updated to
assert empty-violation set as the new ground truth.
- `scripts/loop_bound_init_baseline.txt` line 74 → 75 to track the
shifted import in `backup/service.py`.

**Pre-PR review fixes (commit 4)**
- Drop now-stale ghost-service examples in
`docs/reference/configuration-precedence.md`; replace with a generic
"fixing a ghost-wired service" guide referencing both as end-to-end
examples.
- Rewrite two app.py comments to explain the current architectural
reason instead of framing the change relative to the prior site
(CLAUDE.md no-migration-framing).
- Bind exception in `build_backup_service` failure path; log via
`safe_error_description` + `error_type` per SEC-1 redaction.
- Bind float-parse failure in `SecurityTimeoutSettingsSubscriber`; add
`error_type` so operators can distinguish a non-numeric value from a
None value.
- Document `BACKUP_SCHEDULER_STARTED` as the started-branch's observable
signal in `BackupService.start`'s docstring.

## Test Plan

- 16 new tests across 4 files: factory (3), helper (5), subscriber (8);
plus updated lint smoke test.
- Full unit suite: 29,463 passed (50 skipped Windows / opt-in markers).
- Focused integration: 55 passed (3 skipped — Docker unavailable).
- Gates clean: `check_setting_to_startup_trace.py`,
`check_no_loop_bound_init.py`, `check_logger_exception_str_exc.py`,
`check_no_magic_numbers.py`, `check_mock_spec.py`.
- ruff + mypy strict + ruff format: clean.

## Review coverage

`/pre-pr-review` ran 17 specialized agents in parallel (code-reviewer,
python-reviewer, pr-test-analyzer, silent-failure-hunter,
type-design-analyzer, logging-audit, resilience-audit,
conventions-enforcer, security-reviewer, test-quality-reviewer,
async-concurrency-reviewer, comment-analyzer, docs-consistency,
comment-quality-rot, api-contract-drift, plus 5 mini-pass agents from
the codebase audit: missing-logger, missing-event-constants,
missing-state-transition-log, unwired-settings, race-conditions). 6
valid findings retained and fixed; 1 false-positive cluster (PEP 758
`except A, B:` flagged as Python 2 syntax) discarded with rationale
(CLAUDE.md explicitly permits this form).

## Notes

- No PR-tracked GitHub issue (per /pre-pr-review prompt: user chose to
proceed without).
- Rebased onto `origin/main` to pull in `#1845` (unrelated test
refactor) before pushing.
Aureliolo pushed a commit that referenced this pull request May 10, 2026
<!-- HIGHLIGHTS_START -->
## Highlights

> _AI-generated summary (model: `openai/gpt-4.1-mini` via GitHub
Models). Commit-based changelog below._

### What you'll notice
- Improved error logging and Prometheus instrumentation provide better
system monitoring.
- Eliminated race conditions in CI tagging for more reliable development
releases.
- Fixed critical configuration access and kill-switch bugs to enhance
system stability.
- Enhanced client experience with retry-after headers and better
websocket reconnect behavior.

### What's new
- Introduced composite indexes and cursor pagination for faster data
queries.
- Added server-sent events rate limiting and Ollama input sanitization
for improved security.

### Under the hood
- Centralized workflow error mappings to standardize error handling.
- Refactored API lifecycle fallback to use a configuration snapshot for
consistency.
- Tightened startup settings baseline and reduced controller error
baseline to zero.
- Replaced flaky contributor-assistant GitHub action with a custom
stable step.
- Consolidated Renovate dependency groups to avoid update conflicts.
- Upgraded in-toto-golang dependency to fix security vulnerabilities and
dropped unnecessary CVE waivers.
- Extensive lock file maintenance and multiple infrastructure and Python
dependency updates.

<!-- HIGHLIGHTS_END -->

:robot: I have created a release *beep* *boop*
---


##
[0.8.2](v0.8.1...v0.8.2)
(2026-05-10)


### Features

* close audit gaps in error logging and Prometheus instrumentation
([#1821](#1821))
([ef00fdc](ef00fdc))


### Bug Fixes

* **ci:** eliminate dev-release tag-vs-downstream race + CI hygiene
audit ([#1827](#1827))
([b7b9a59](b7b9a59))
* **config:** close 6 settings reachability + kill-switch gaps
([#1798](#1798))
([410cb3b](410cb3b))
* correctness / safety fixes from 2026-05-05 audit (Wave 28)
([#1823](#1823))
([d01e624](d01e624))


### Performance

* composite indexes + cursor pagination + clock seam + SSE rate-limit +
Ollama sanitization + retry-after web client + WS reconnect jitter
([#1822](#1822))
([d1faf86](d1faf86))


### Refactoring

* **api:** move activities lifecycle-cap fallback to ApiBridgeConfig
snapshot ([#1840](#1840))
([7a56e9c](7a56e9c))
* centralise workflow error mapping and shared error codes
([#1778](#1778) sub-tasks A
+ E) ([#1843](#1843))
([11132cd](11132cd))
* drive controller-error baseline to zero
([#1778](#1778) sub-task A
tail) ([#1846](#1846))
([e96ae20](e96ae20))
* slim CLAUDE.md, port pr-review-toolkit agents, sync .opencode parity
([#1833](#1833))
([e6372b8](e6372b8))
* tighten settings → startup-trace baseline (8 → 0)
([#1847](#1847))
([3376ee2](3376ee2))


### Documentation

* fix CLAUDE.md inaccuracies and drop drift-prone counts
([#1844](#1844))
([371925f](371925f))


### Tests

* replace test placeholders with real subsystem wiring
([#1845](#1845))
([ddbb666](ddbb666))


### CI/CD

* **cla:** replace flaky contributor-assistant action with custom
read-path step
([#1819](#1819))
([11aeafe](11aeafe))
* tidy dev-release notes + stagger renovate lockfile day
([#1824](#1824))
([ec746a9](ec746a9))


### Maintenance

* cleanup roundup, sub-tasks a/c/d/g/h/j/l/m of
[#1781](#1781)
([#1838](#1838))
([099b871](099b871))
* close remaining 5 sub-tasks of
[#1781](#1781) (b/e/f/i/k)
([#1852](#1852))
([59cf0b2](59cf0b2))
* collapse Renovate dep groups into Python / Web / Infrastructure to
remove cross-PR overlap
([#1813](#1813))
([4cbd857](4cbd857))
* **deps,security:** bump in-toto-golang v0.11.0 + drop two patched CVE
waivers ([#1851](#1851))
([0b8b5bb](0b8b5bb))
* disable Renovate vulnerabilityAlerts so security flows into normal
updates ([#1834](#1834))
([6b7d15f](6b7d15f))
* Lock file maintenance
([#1820](#1820))
([ccbad73](ccbad73))
* Lock file maintenance
([#1842](#1842))
([13b68a5](13b68a5))
* Lock file maintenance
([#1853](#1853))
([db6650b](db6650b))
* Update dhi.io/nats:2.14-debian13 Docker digest to eb768bf
([#1841](#1841))
([37f84fc](37f84fc))
* Update Infrastructure dependencies
([#1815](#1815))
([75b12fe](75b12fe))
* Update Infrastructure dependencies
([#1831](#1831))
([3f3c50b](3f3c50b))
* Update Python dependencies
([#1817](#1817))
([e11332f](e11332f))
* Update Python dependencies
([#1832](#1832))
([4515c8e](4515c8e))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: synthorg-repo-bot[bot] <279117679+synthorg-repo-bot[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[CLEANUP] Wire test placeholders: real-LLM provider + ApprovalStore

1 participant