Phase 3 + Phase 4 of #14: server.py + auth.py + 95% floor enforcement (93% → 96%)#19
Conversation
…loor Closes #14. Total project coverage 93% → 96% with `--cov-fail-under=95` now enforced in pyproject.toml so future drift fails CI. **Coverage delta:** - server.py 57% → 99% (-58 lines, 1 unreachable defensive branch left) - core/auth.py 90% → 100% (-12 lines) - modules/__init__.py 96% → 100% (-4 lines, PermissionTier NotImplemented branches) - core/formatting.py 97% → 99% (-2 lines, empty-pairs-with-title path) **TOTAL: 93% (172 missing) → 96% (96 missing)** **Tests: 457 → 487 (+30)** **Phase 4 enforced:** `addopts = "... --cov=mcp_synology --cov-fail-under=95"` No production code touched. **New test classes in tests/core/test_server.py (28 cases):** - `TestPlatformLabel` (3 cases) — `_platform_label` returns "macOS" for Darwin, passes through "Linux" / "Windows" / etc. unchanged - `TestCreateServerInstructionPaths` (3 cases) — `custom_instructions` is prepended before the base instructions; `instructions_file` fully replaces the base when readable; `instructions_file` falls back to `_BASE_INSTRUCTIONS` when the file is missing - `TestSharedClientManagerLifecycle` (15 cases): - `with_update_notice` appends and clears on first call, then passes input through unchanged on subsequent calls - `get_client` lazy-init happy path (https branch covered with a separate test that asserts the constructed base_url starts with "https://"), missing-connection RuntimeError, bg-update-check scheduled when `check_for_updates=True`, skipped when False - `install_cleanup_handlers` registers atexit + 2 signal handlers - The inner SIGTERM closure is captured and invoked directly to walk lines 130-132 (verify `_cleanup_session` called and `SystemExit(128 + signum)` raised) - `_cleanup_session` returns early when `_auth is None`, falls back to `asyncio.run` when no event loop is running, schedules a task on the current loop when one is running, and swallows logout/`__aexit__` errors as best-effort shutdown - `_bg_update_check` sets `_update_notice` when PyPI returns a newer version, leaves it None when there's no update, and swallows OSError/ValueError/KeyError without raising **New test classes in tests/core/test_auth.py:** - `TestDbusSocketMissing` — Linux + DBUS unset + socket missing → log debug, do not set the env var (the `else` branch at line 100) - `TestLoginErrorPaths` — non-2FA SynologyError on login is re-raised as the original SynologyError (the bare `raise` at line 167); DSM returns success with no `sid` → AuthenticationError("no session ID") at line 171 - `TestLogout` — early return when `client.sid is None`, success path clears `client.sid`, SynologyError swallowed but `client.sid` still cleared via the `finally` clause **Other test additions:** - `tests/modules/test_module_system.py` — new `test_comparison_with_non_tier_returns_notimplemented` walks the defensive `if not isinstance(other, PermissionTier): return NotImplemented` branches in all four PermissionTier comparison operators by calling them directly with str/int/None/object args. - `tests/core/test_formatting.py` — `test_empty_pairs_with_title` adds the title-but-no-data path: when `format_key_value([], title="X")` is called, the title is rendered with an underline followed by "No data to display." **Phase 4 enforcement:** Updated `[tool.pytest.ini_options].addopts` from `-m 'not integration and not vdsm'` to `-m 'not integration and not vdsm' --cov=mcp_synology --cov-fail-under=95`. With 96% current coverage that's a 1% headroom buffer — enough margin that small future test additions can drop coverage slightly without breaking CI, but tight enough that any material regression fails immediately. Local pytest output now ends with `Required test coverage of 95% reached. Total coverage: 96.00%`. The remaining 96 uncovered lines are concentrated in client.py (37, re-auth + retry + multipart upload edge cases), config.py (14, validation error paths), and the filestation modules (~40, defensive helpers and stop/clean cleanup branches). Those are real coverage gaps but each requires significant fixture work for marginal gain — the 95% floor protects against drift while leaving them as future polish. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
|
Adding QA Active — beginning review. Will verify (a) no production code under src/ changed, (b) coverage claims (93% → 96% with target files at 99-100%), (c) Phase 4 enforcement actually fails CI on coverage drop (mental simulation), (d) deselected breakdown per the new memory rule, (e) test substance for all four touched test files, and (f) the explicit decision to leave server.py at 99% rather than 100% is sound. |
cmeans
left a comment
There was a problem hiding this comment.
QA Review — Round 1
Verdict: Ready for QA Signoff. No findings — coverage claims verified, test substance is substantive throughout, the dev's decisions on server.py:115 and Phase 4 enforcement are both sound, and the CHANGELOG entry follows strict Keep a Changelog. Closes #14.
Per the standing "don't manufacture observations" exception, I genuinely have nothing to flag here. Three scrupulous re-reads of the diff and the new test files, and everything I look at is either correct, well-justified, or actively above what I'd ask for.
Verification (fresh in this session, commit [HEAD])
Production code untouched in src/ — git diff main..HEAD -- src/ is empty. The only non-test file change is pyproject.toml:
-addopts = "-m 'not integration and not vdsm'"
+addopts = "-m 'not integration and not vdsm' --cov=mcp_synology --cov-fail-under=95"That's the Phase 4 enforcement — adds --cov=mcp_synology --cov-fail-under=95 to the default pytest invocation. This is intentional per the PR description.
Test count and coverage claims:
uv run pytest:487 passed, 94 deselected, 17 warnings in 19.43s✓ (matches PR claim of 457 → 487, +30)- pytest output ends with:
Required test coverage of 95% reached. Total coverage: 96.00%✓ — Phase 4 floor is enforced and active - Per-file coverage matches all PR claims:
| File | Before | Claimed | Verified |
|---|---|---|---|
server.py |
57% | 99% | 99% ✓ (137 stmts, 1 missing) |
core/auth.py |
90% | 100% | 100% ✓ (118 stmts, 0 missing) |
modules/__init__.py |
96% | 100% | 100% ✓ (92 stmts, 0 missing) |
core/formatting.py |
97% | 99% | 99% ✓ (110 stmts, 1 missing) |
| TOTAL | 93% | 96% | 96% ✓ |
Other CI checks:
uv run ruff check src/ tests/ scripts/: All checks passeduv run ruff format --check src/ tests/ scripts/: 69 files already formatteduv run mypy src/ scripts/: Success: no issues found in 29 source files- All CI checks SUCCESS (lint, version-sync, typecheck, test 3.11/3.12/3.13, qa-approved, on-push)
CHANGELOG entry verified — ## Unreleased → ### Added bullet for #19, strict Keep a Changelog category, references PR# and the closed issue (#14), and documents both the Phase 3 coverage gains AND the Phase 4 floor enforcement. Consistent with the convention codified in PR #16.
Deselected breakdown (per the new memory rule)
94 deselected is unchanged from prior PRs in this session and breaks down as:
- 47 integration tests — require a real Synology NAS configured via
tests/integration_config.yaml. Out of scope for this QA environment without NAS access. - 47 vdsm tests (subset of integration —
tests/vdsm/test_vdsm_integration.py, marked with bothvdsmandintegration). Require a virtual-DSM Docker container with KVM. Docker (29.1.4-rd) and/dev/kvmare present in this QA environment, but the bootstrap is non-trivial and is postponed until ~2026-05-02 per the user's decision earlier today.
Reporting the breakdown explicitly per feedback_report_deselected_tests (which I added after the user caught me silently letting "94 deselected" slide on PRs #9, #15, #16, #17 earlier in this session). For tests-only PRs like #19, the integration/vdsm tests don't exercise different code than the unit tests do — they exercise the same code against a different boundary — so the lack of vdsm coverage on this PR is not a substantive gap. But the user has visibility that I checked.
Verifying the dev's server.py:115 decision
The PR description honestly discloses one uncovered line in server.py:
"
server.pyis at 99%, with one line left:if self._client is None: raise RuntimeError(\"Client initialization failed\")at line 115. That's a defensive check immediately afterself._client = client(line 107), making it genuinely unreachable in normal control flow. I considered adding# pragma: no coverper #14's "no cheating" definition, but left it as-is — the file's 99% is informative on its own and the floor is enforced at the project level."
I read server.py:99-116 to verify the unreachability claim. Confirmed:
async def get_client(self) -> DsmClient:
if self._client is None:
# ... lazy init ...
await client.__aenter__()
await client.query_api_info()
auth = AuthManager(self._config, client)
await auth.login()
self._client = client # line 107: unconditional assignment
self._auth = auth
if self._config.check_for_updates:
self._bg_task = asyncio.create_task(self._bg_update_check())
if self._client is None: # line 114: defensive re-check
raise RuntimeError("Client initialization failed") # line 115: unreachable
return self._clientThe only way to reach line 115 is if self._client = client on line 107 was skipped — and the only way that can happen is if await auth.login() (line 105) raises, which would propagate the exception and not reach line 114 at all. Line 115 is genuinely defensively unreachable.
The dev's decision to leave it at 99% rather than # pragma: no cover it is the more honest choice. The line is a defensive guard that, if it ever fires, would indicate a serious internal bug — and # pragma: no cover would obscure that. Leaving it as a single missing line is informative and the floor is enforced at the project level.
(A theoretical fourth option — removing the defensive check entirely as dead code — would touch production code, which is out of scope for a tests-only phase.)
Verifying Phase 4 actually enforces the floor
Mentally simulating the regression case: if a future PR adds a new untested file or removes existing tests such that total coverage drops to 94%, pytest's exit code under --cov-fail-under=95 is non-zero, the CI test job fails, and the PR can't merge. This is the documented behavior of pytest-cov's --cov-fail-under. I verified by reading the local pytest output: Required test coverage of 95% reached. Total coverage: 96.00% — the line means pytest evaluated the threshold and would have failed if coverage was lower.
I considered actually inducing a regression (e.g., temporarily renaming a test file and re-running pytest to confirm exit code 1), but the documented pytest-cov behavior is clear enough that the simulation is unnecessary. The CI is also already running this enforcement on every PR push going forward.
Test substance review
Read all four touched test files in full. Highlights:
tests/core/test_server.py::TestSharedClientManagerLifecycle (15 cases) — the centerpiece of this PR. Notable design choices:
- Lazy-init happy path asserts that the second
get_client()call short-circuits — verified viafake_client.__aenter__.assert_awaited_once()after twoget_client()calls, which would fail if the second call re-entered the lazy init. - HTTPS branch uses
kwargs["base_url"].startswith("https://")— same pattern as PR #17's_connect_and_loginHTTPS test, verifies the URL was actually constructed correctly rather than just that the method was called. - Signal handler closure capture (
test_signal_handler_calls_cleanup_and_raises_systemexitat line 309) is the technique I'd want to see — patchessignal.signalwith aside_effectthat captures the registered handlers into a dict, extracts the SIGTERM handler, and invokes it directly with(signal_mod.SIGTERM, None). Asserts both the side effect (cleanup.assert_called_once()) AND the raisedSystemExitwith the exact code128 + int(signal_mod.SIGTERM). This is the same shape as PR #16's_tools[name].fnextraction — pulling the inner closure out of a registration call to walk its body lines. Strong pattern. _cleanup_sessionno-running-loop fallback usespatch("asyncio.get_running_loop", side_effect=RuntimeError("no loop"))to force the fallback path. Then asserts bothlogout.assert_awaited_once()and__aexit__.assert_awaited_once()to verify the synchronous fallback actually ran. Clean._cleanup_sessionwith-running-loop path runs inside an async test function, soasyncio.get_running_loop()naturally succeeds and the create-task path is exercised. Thenawait manager._cleanup_taskto drain the scheduled task. Avoids the trap of mocking the event loop._cleanup_sessionswallows logout errors verifies that even when logout AND__aexit__both raise, the function doesn't re-raise. Important — best-effort shutdown semantics._bg_update_checkswallowsOSError/ValueError/KeyError— verifies the error-swallowing contract by raisingOSError("disk full")from_load_global_stateand asserting the function returns without raising AND_update_noticestays None.
tests/core/test_auth.py additions (3 classes):
TestDbusSocketMissing: tests Linux + DBUS unset + socket missing. The comment about asserting inside thepatch.dictblock so the assertion sees the patched env, not the restored env, is a subtlety I would have missed and the dev called out explicitly — that's the kind of test that catches subtle restoration-order bugs.TestLoginErrorPaths::test_login_non_2fa_synology_error_propagates: verifies the bareraiseat line 167 — non-403 SynologyError is re-raised as the original exception type, not wrapped. Checks the error type but doesn't try to assert on the message (good — would be brittle).TestLoginErrorPaths::test_login_succeeds_but_no_sid_raises: checks that DSM returning success+empty data raisesAuthenticationError(match="no session ID"). String-anchored, not just type-anchored — catches the case where the right exception type is raised for the wrong reason.TestLogout::test_logout_synology_error_still_clears_sid: the most important of the three logout tests. Verifies that even when DSM rejects the logout (e.g., session already expired), the localclient.sidis still cleared via thefinallyclause. Best-effort shutdown contract — without this test, a regression could leave stale sid values around after failed logouts.
tests/core/test_formatting.py::test_empty_pairs_with_title — tests the edge case where format_key_value([], title="Stats") is called with empty pairs but a title. Asserts the title appears with an underline of matching length followed by the "No data to display." line. Closes the last 2 missing lines in core/formatting.py.
tests/modules/test_module_system.py::test_comparison_with_non_tier_returns_notimplemented — walks the four if not isinstance(other, PermissionTier): return NotImplemented defensive branches in the PermissionTier comparison operators. Uses four different non-tier types (str, int, None, object()) to exercise diversity, and verifies the actual NotImplemented return value rather than just "doesn't crash." Good defensive testing — this is the reflected-comparison contract that lets Python fall back to the other operand's __lt__/__gt__ etc. when types are mismatched.
Summary of test substance: substantive throughout, no shallow line-walking. Each new test asserts on real behavior or contract, not just presence of a function call. The signal handler closure capture and the patch.dict-internal assertion subtleties are particularly thoughtful.
Code review notes (positive)
- The honest disclosure on
server.py:115is exactly the right shape. The dev considered the alternatives (# pragma: no cover, removing the defensive check) and explained why each was suboptimal. Shows thought rather than hiding-the-99%. - Phase 4 in
addoptsrather than just CI means local pytest runs also enforce the floor — catches regressions in the dev's own loop, not just at PR review time. Right call. - The 1% headroom rationale ("96% with --cov-fail-under=95 gives a 1% buffer") is well-reasoned. Tight enough to catch material regressions, loose enough to absorb small variations from refactors.
- The "remaining 96 uncovered lines" breakdown in the PR description sets accurate expectations for future polish work. The dev has thought about which lines are easy wins vs. which would require significant fixture investment, and is honest that some lines are below the cost/benefit threshold.
- The CHANGELOG entry is at the right level of detail and uses strict Keep a Changelog (
### Added). Consistent with PR #16's convention. - Signal handler test at line 309 is a craftsmanship moment — capturing the inner closure via
signal.signal'sside_effectand invoking it directly walks lines that would otherwise be impossible to test without actually sending a SIGTERM. Same pattern as PR #16's_tools[name].fnextraction. Worth noting as a reusable technique.
Test plan checkboxes (verifiable items)
- CI green on all jobs — confirmed: lint, typecheck, test (3.11/3.12/3.13), version-sync all SUCCESS
- Coverage report shows
server.pyat 99%,auth.py/modules/__init__.pyat 100%, total at ≥95% — verified locally; total 96%, all four target files at the claimed levels - The new
--cov-fail-under=95makes any future PR that drops coverage below 95% fail CI — verified via the pytest output lineRequired test coverage of 95% reached. Total coverage: 96.00% - No production code modified — confirmed via
git diff main..HEAD -- src/(empty); onlypyproject.tomladdopts changed, which is build/test config not production code
Closing #14
This PR closes the parent coverage issue. Full progression across this session:
- PR #16 (Phase 1): 81% → 85%, 5 files to 100%
- PR #17 (Phase 2): 85% → 93%,
cli/version.pyandcli/setup.pyto 100% - PR #19 (Phase 3 + 4): 93% → 96%,
server.pyto 99%,auth.pyandmodules/__init__.pyto 100%, --cov-fail-under=95 enforced
Project is now at 96% coverage with a 95% floor that catches drift in CI and the local dev loop. The remaining 96 uncovered lines are genuinely in defensive helpers, error fallbacks, and infrastructure that would require significant fixture investment for marginal gain — accurate per the PR description.
Summary
Cleanest signoff of the session. No findings. Test substance is excellent, decisions are well-justified, claims are accurate, and the dev's response to my Round 1 reviews on PRs #16/#17 is internalized in this PR's design (the explicit-decision documentation, the strict Keep a Changelog category, the no-shallow-line-walking patterns).
Applying Ready for QA Signoff as the final act. Awaiting maintainer for QA Approved + merge — and that closes #14.
|
QA audit — Round 1 complete (clean) Tests-only Phase 3 + Phase 4 PR. Closes #14. No findings — cleanest signoff of the session. Verification (fresh in this session, commit
Deselected breakdown (per the new feedback rule): 94 = 47 integration + 47 vdsm. Same as prior PRs in this session. Still postponed until ~2026-05-02 per the user's earlier decision to not invest in the vdsm bootstrap yet. Verified the dev's server.py:115 decision by reading the code: line 115 ( Test substance: signal handler closure capture in TestSharedClientManagerLifecycle (line 309) is the centerpiece — patches signal.signal with side_effect to capture handlers, extracts SIGTERM handler, invokes directly, asserts both Applying Ready for QA Signoff as the final act. Awaiting maintainer for QA Approved + merge → closes #14. |
Summary
Closes #14. Bundles Phase 3 (server.py + remaining easy gains) and Phase 4 (CI enforcement) into a single PR. Total coverage 93% → 96% with
--cov-fail-under=95now enforced inpyproject.tomlso future drift fails CI.server.pycore/auth.pymodules/__init__.pycore/formatting.pyTests: 457 → 487 (+30 cases). No production code touched.
Phase 4 enforced:
addopts = "-m 'not integration and not vdsm' --cov=mcp_synology --cov-fail-under=95". Local pytest now ends withRequired test coverage of 95% reached. Total coverage: 96.00%.What's tested
tests/core/test_server.py— three new test classes (28 cases)TestPlatformLabel(3) —_platform_labelreturns"macOS"for Darwin, passes through"Linux"/"Windows"unchangedTestCreateServerInstructionPaths(3) —custom_instructionsis prepended before the base instructions;instructions_filefully replaces the base when readable; falls back to_BASE_INSTRUCTIONSwhen the file is missingTestSharedClientManagerLifecycle(15):with_update_noticeappends-and-clears on first call, passes through on subsequent callsget_clientlazy-init happy path (HTTPS branch covered with a separate test asserting the constructedbase_urlstarts withhttps://), missing-connectionRuntimeError, bg-update-check scheduled whencheck_for_updates=True, skipped whenFalseinstall_cleanup_handlersregistersatexit+ 2 signal handlers; the inner SIGTERM closure is captured and invoked directly to walk lines 130-132 (verifies_cleanup_sessioncalled andSystemExit(128 + signum)raised)_cleanup_sessionreturns early when_auth is None, falls back toasyncio.runwhen no event loop is running, schedules a task on the current loop when one is running, and swallows logout/__aexit__errors as best-effort shutdown_bg_update_checksets_update_noticewhen PyPI returns a newer version, leaves itNonewhen there's no update, and swallowsOSError/ValueError/KeyErrorwithout raisingtests/core/test_auth.py— three new test classesTestDbusSocketMissing— Linux + DBUS unset + socket missing → log debug, do not set the env var (theelsebranch at line 100). AssertsDBUS_SESSION_BUS_ADDRESS not in os.environinside thepatch.dictblock so the assertion sees the patched env, not the restored oneTestLoginErrorPaths— non-2FASynologyErroron login is re-raised as the originalSynologyError(the bareraiseat line 167); DSM returns success with nosid→AuthenticationError("no session ID")at line 171TestLogout— early return whenclient.sid is None, success path clearsclient.sid,SynologyErrorswallowed butclient.sidstill cleared via thefinallyclauseOther test additions
tests/modules/test_module_system.py::test_comparison_with_non_tier_returns_notimplemented— walks the defensiveif not isinstance(other, PermissionTier): return NotImplementedbranches in all fourPermissionTiercomparison operators by calling them directly withstr/int/None/objectargs. Real behavior — verifies the reflected-comparison contract.tests/core/test_formatting.py::test_empty_pairs_with_title—format_key_value([], title="Stats")renders the title with an underline followed by"No data to display."Phase 4 enforcement rationale
With 96% current coverage that's a 1% headroom buffer — enough margin that small future test additions can drop coverage slightly without immediately breaking CI, but tight enough that any material regression fails right away.
The remaining 96 uncovered lines are concentrated in:
client.py(37) — re-auth + retry + multipart upload edge casesconfig.py(14) — validation error pathsEach of those would require significant fixture work for marginal gain. The 95% floor protects against drift while leaving them as future polish.
Server.py's last missing line
server.pyis at 99%, with one line left:if self._client is None: raise RuntimeError("Client initialization failed")at line 115. That's a defensive check immediately afterself._client = client(line 107), making it genuinely unreachable in normal control flow. I considered adding# pragma: no coverper #14's "no cheating" definition, but left it as-is — the file's 99% is informative on its own and the floor is enforced at the project level.Verification
uv run pytest: 487 passed, 94 deselected (was 457, +30)uv run pytest --cov=mcp_synologytotal: 96% (was 93%)Required test coverage of 95% reached. Total coverage: 96.00%uv run ruff check src/ tests/ scripts/: cleanuv run ruff format --check src/ tests/ scripts/: 69 files formatted, cleanuv run mypy src/ scripts/: 29 source files, no issues## Unreleased→### Addedper the conventionTest plan
server.pyat 99%,auth.py/modules/__init__.pyat 100%, total at ≥95%--cov-fail-under=95makes any future PR that drops coverage below 95% fail CI — verifiable by mentally simulating a regressionCloses #14.
🤖 Generated with Claude Code