Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,8 @@

### Added

- **ADR 0001 — Per-Client DSM Sessions and the Streamable HTTP Roadmap** (#PR_PLACEHOLDER) — closes #47. The 2026-04-16 project-wide review flagged the largest spec-vs-code gap: `docs/specs/architecture.md` documented a `session_key` parameter on `AuthManager.get_session()` for per-MCP-client DSM sessions under Streamable HTTP, but the live signature was `get_session() -> str` with no key and no plumbing. Three options were considered — ship the multi-session path now, retract the spec, or formally defer with the helper restructured. This PR records the deferral (Option 3) and closes the spec drift. New `docs/adr/0001-per-client-dsm-sessions.md` captures the question, options-with-tradeoffs, decision, consequences, and five concrete revisit triggers (Streamable HTTP gaining a concrete plan, multi-tenant deployment use case, subagent isolation request, tool-surface roughly doubling, single-shared-session pain showing up under real load). The ADR also inventories what the next implementation step looks like when a trigger fires, so the deferred work has a clear bounded shape rather than an open-ended re-design. Spec changes: `docs/specs/architecture.md` "Auth Manager" example reverted to the live `get_session()` signature with a forward-pointer to the planned section + ADR; the dedicated section renamed "Future" → "Planned: Per-Client Sessions (Streamable HTTP)" with an explicit deferral-not-oversight statement and a back-reference to the ADR.

- **publish.yml: gate PyPI publish on `server.json` registry-schema validation** (#89) — closes #44. Adds a `validate-server-json` job that runs before `publish-pypi` on every release tag, mirroring the same check that runs on every PR via `ci.yml`. Without this gate, a malformed `server.json` (new required field in a future registry schema, type change, etc.) would PyPI-publish cleanly and then fail at the registry leg — leaving a discoverable PyPI release that isn't in the MCP registry, with no way to roll back PyPI short of yanking. v0.5.1 hit the registry-leg-fails-after-PyPI scenario for a *different* reason (mcp-publisher OIDC audience mismatch, fixed in #79), so the failure mode isn't theoretical. The new job uses the same `./.github/actions/install-mcp-publisher` composite that `publish-registry` uses, so the validating publisher version always matches the publishing one. Job runs in parallel with `build` (no dep on it), and `publish-pypi` now lists both as `needs:` so a validation failure stops the entire pipeline before any external side-effect. The optional `publish-manifest` artifact named in #44 is deferred — it's a tooling-version reconciliation aid that's only useful if a CI-PR / release-tag mismatch *does* fire, and is straightforward to add later when there's a concrete failure to debug.

- **`additional` parameter exposed on `list_shares`, `list_files`, and `search_files` MCP tools** (#87) — closes #41. The underlying handlers in `modules/filestation/listing.py` and `modules/filestation/search.py` already accepted an `additional: list[str] | None` parameter, but the FastMCP tool registrations in `modules/filestation/__init__.py` never surfaced it — callers couldn't request DSM metadata fields beyond the per-tool defaults. Same class of spec-vs-tool drift as #33 (which added `mtime_from`/`mtime_to` to `search_files`). Now the three tool registrations accept `additional: list[str] | None = None`, with a docstring per tool listing the supported field set; the underlying defaults (`["real_path", "size", "owner", "perm"]` for shares, `["size", "time"]` for files and search) are unchanged so existing callers see no behavior difference. New `validate_additional()` helper in `modules/filestation/helpers.py` rejects unknown values up-front against a union whitelist (`real_path`, `size`, `owner`, `time`, `perm`, `type`, `mount_point_type`, `volume_status`) — DSM silently ignores unknown values (the field just doesn't appear in the response), which would have made typos like `"sze"` invisible; validating up-front surfaces the typo as a clear ToolError naming the bad value and listing the supported set. The whitelist is a union across List + Search APIs; tighter per-API gating isn't worth the maintenance cost since DSM ignores values that don't apply (e.g. `mount_point_type` on a non-share path) without erroring. New tests: 12 cases in `TestValidateAdditional` (7 valid/empty accepted, 4 unknown rejected with clear errors, 1 verifying the tool name appears in the error envelope) plus six round-trip + reject-unknown tests across `TestListShares`, `TestListFiles`, and `TestSearchFiles`. The Search round-trip captures the `list` (poll) request, not `start` — DSM Search returns full file metadata at poll time, so `additional` is sent there. 599 unit tests pass at 96.25% coverage.
Expand Down
76 changes: 76 additions & 0 deletions docs/adr/0001-per-client-dsm-sessions.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,76 @@
# ADR 0001 — Per-Client DSM Sessions and the Streamable HTTP Roadmap

- **Status:** Accepted (deferred-stub) — 2026-05-04
- **Issue:** [#47](https://github.com/cmeans/mcp-synology/issues/47)
- **Implementing PR:** _filled in on merge_

## Context

`docs/specs/architecture.md` documents a future per-MCP-client DSM session model: each MCP client gets its own DSM session, scoped by a `session_key` parameter threaded through `AuthManager.get_session(session_key)`. The current implementation does not support this — `get_session()` takes no key, and the session name is built once at construction time and reused for the life of the `AuthManager`.

So long as the server runs under stdio (one OS process per MCP client connection), the gap is invisible: one process = one client = one DSM session. But under the planned Streamable HTTP transport, one server process would serve many MCP clients concurrently. They would all share the same `AuthManager` and therefore the same DSM session, meaning operations from different clients would interleave on the NAS as if they were one user, and DSM session expiry from one client's idleness would log out everyone else.

The 2026-04-16 project-wide review flagged this as the single biggest spec-vs-code gap. It is not a bug — there are no Streamable HTTP users today. It is an **unmade decision** that has been documented as future. The choice was either to ship the multi-session path now, retract the spec, or formally defer with the helper restructured so the eventual change is bounded. This ADR records the deferral.

## Options Considered

### Option 1 — Ship the multi-session path now

Add `session_key` to `get_session()`, change `_build_session_name()` to derive per key, plumb the key through every tool handler down to the auth layer, decide how clients identify themselves, and stand up a session pool with lifecycle (idle GC, max-concurrent cap, per-key re-auth coordination).

**Pros:** Streamable HTTP enablement becomes a transport-config change, not a refactor. Refactor cost grows with the tool surface — doing it now is bounded; doing it later, after #48/#49/#50 add modules, costs more. Forces design clarity on session-pool questions while context is fresh.

**Cons:** Real refactor for a feature with zero current users. Maintenance tax — every future module author has to remember to thread `session_key`, and a single forgotten call site silently regresses to the shared-session model. YAGNI risk if the trigger never fires.

### Option 2 — Retract the spec

Delete the per-client-sessions section. Stop documenting a direction that isn't being built. When/if the trigger lands, re-derive the design.

**Pros:** Honest about current state. Smallest change. No code maintenance tax.

**Cons:** Throws away the design thinking that's already in the spec. The next person to address Streamable HTTP has to reconstruct from scratch. Loses an architectural signpost.

### Option 3 — Deferred stub + spec reframe (chosen)

Don't change runtime behavior. Make the private session-name helper (`_build_session_name`) accept an optional `session_key: str | None = None` parameter so the future change is a one-liner at the helper. Keep `get_session()` single-session — no public-API surface change today. Reframe the spec section from "Future" to "Planned for Streamable HTTP enablement" and link it back to this ADR. Record the question, options, decision, and revisit triggers here.

**Pros:** Spec and code agree (within the deferral framing). Helper is structurally ready — when the trigger fires, the change is to add a `session_key` parameter to `get_session`, wire a session pool keyed on it, and call `self._build_session_name(session_key=key)`. Design intent preserved. No maintenance tax (no new public API to keep stable). No behavior change.

**Cons:** Half-measure: the spec says "planned" but there is no scheduled implementation date and no acceptance criteria beyond the trigger conditions below. Future implementer still has to design the session-pool lifecycle; this ADR records that the design is deferred, not solved.

## Decision

**Option 3.** Refactor the helper, reframe the spec, leave the public surface and runtime behavior unchanged.

Concretely:

1. `AuthManager._build_session_name(session_key: str | None = None) -> str` accepts an optional key. With `session_key=None` it behaves as today (`MCPSynology_{instance_id}_{uuid8}`). With a non-None key it returns `MCPSynology_{instance_id}_{session_key}`. The constructor still calls `_build_session_name()` with no key, so every existing code path behaves identically.
2. `AuthManager.get_session()` signature unchanged. Single-session.
3. The "Future: Per-Client Sessions" section in `docs/specs/architecture.md` is renamed "Planned: Per-Client Sessions" and links back to this ADR. The Auth Manager interface example is updated to match the current `get_session()` signature, with a sentence pointing at the planned section + this ADR.
4. This ADR records the question, options, decision, and revisit triggers.

## Consequences

- **No production behavior change.** Every existing test passes unchanged. The only signature change is the addition of an optional argument with a default that preserves prior behavior.
- **Spec drift closed.** The spec no longer documents an interface the code lacks. Future readers see "planned + see ADR" instead of "future + signature in code that doesn't exist".
- **The next implementation step is bounded.** When a revisit trigger fires (see below), the work is: (a) add `session_key: str | None = None` to `get_session()`, (b) replace `self._session_name`/`self._client.sid` with a keyed pool, (c) thread the key through callers that need it, (d) decide and document the session-pool lifecycle (idle GC, max-concurrent cap, per-key re-auth, callback dispatch). The helper change is already done.
- **Per-client sessions remain unavailable today.** Any caller wanting per-client isolation under stdio (subagents asking for separate DSM sessions, multi-user wrapper scripts, isolation-in-tests) will not get it from this PR. They are out of scope for the deferred stub.

## Revisit Triggers

This ADR should be re-opened — and Option 1 (or a variant) revisited — when **any** of the following happens:

1. **Streamable HTTP transport gains a concrete implementation plan.** Either an internal decision to build it or an upstream MCP SDK release that makes it the path of least resistance.
2. **A multi-tenant deployment use case appears.** Someone (Chris, a contributor, or a downstream user) wants to run mcp-synology as a shared service where multiple MCP clients connect with different identities.
3. **Subagent isolation is requested.** A concrete need to run multiple parallel DSM sessions from one process — e.g., a Claude Code session where the main agent and a long-running search subagent should not share session expiry.
4. **A second module is added that significantly increases the tool surface.** The argument "refactor cost grows linearly with the tool surface" is empirical; once the surface roughly doubles (e.g., adding a Storage + Health module per #49 or a Notifications module), the cost-of-deferring side of the trade re-balances. At that point even without an external trigger, doing the threading proactively may be cheaper than waiting.
5. **The current single-shared-session model exhibits a real-world pain point.** E.g., session expiry from one logical caller stalling unrelated calls under genuine concurrent load, or DSM rate limits being hit because one shared session can't be parallelized.

When a revisit happens, the Option 1 work is: define the client-identity contract (transport-supplied vs caller-supplied), spec the session-pool lifecycle (max sessions, idle GC, eviction on auth failure), decide credentials policy (one set of creds shared across keys, or keyed credential lookup), and update this ADR with the new decision.

## Related

- `src/mcp_synology/core/auth.py` — `_build_session_name`, `get_session`, `logout`, `_re_authenticate`
- `docs/specs/architecture.md` — "Auth Manager" and "Planned: Per-Client Sessions (Streamable HTTP)" sections
- Issue [#47](https://github.com/cmeans/mcp-synology/issues/47) — the open question this ADR answers
- Issues [#48](https://github.com/cmeans/mcp-synology/issues/48) / [#49](https://github.com/cmeans/mcp-synology/issues/49) / [#50](https://github.com/cmeans/mcp-synology/issues/50) — feature work whose design is unblocked by this deferral being recorded
21 changes: 10 additions & 11 deletions docs/specs/architecture.md
Original file line number Diff line number Diff line change
Expand Up @@ -61,16 +61,13 @@ Owns session lifecycle and credential retrieval. See [Auth Architecture](#auth-a

```python
class AuthManager:
async def get_session(self, session_key: str | None = None) -> str:
"""Get a valid DSM session ID.

session_key: Optional key to scope DSM sessions.
If None, uses the default (shared) session.
For Streamable HTTP, pass the MCP session_id for per-client sessions.
"""
async def get_session(self) -> str:
"""Get a valid DSM session ID, logging in if needed."""
...
```

The current signature is single-session, matching the stdio transport's one-process-per-client model. A `session_key: str | None = None` parameter is reserved for the future per-client session model under Streamable HTTP — see [Planned: Per-Client Sessions](#planned-per-client-sessions-streamable-http) below and [ADR 0001](../adr/0001-per-client-dsm-sessions.md) for the deferral decision and revisit triggers.

#### Config System (`core/config.py`)

**Responsibilities:**
Expand Down Expand Up @@ -309,21 +306,23 @@ When a module makes an API call and the DSM client receives a session error, re-

**Keepalive strategy:** Lazy re-auth (let sessions expire, re-authenticate on next request) rather than proactive keepalive pings. Simpler, no unnecessary traffic, and re-auth is fast (single HTTP call).

#### Future: Per-Client Sessions (Streamable HTTP)
#### Planned: Per-Client Sessions (Streamable HTTP)

Under stdio transport, one process = one MCP session = one DSM session shared across all chats. This is fine — DSM sessions handle concurrent API calls.

Under Streamable HTTP (roadmapped), multiple MCP clients connect to one server, each with its own MCP session and `session_id`. The auth manager's `session_key` parameter enables per-client DSM sessions:
Under Streamable HTTP (planned, not yet implemented), multiple MCP clients would connect to one server, each with its own MCP session and `session_id`. The auth manager would scope DSM sessions per client via a `session_key` parameter on `get_session()`:

```python
# Stdio (current): single shared session
sid = await auth_manager.get_session()

# Streamable HTTP (future): per-MCP-client session
# Streamable HTTP (planned): per-MCP-client session
sid = await auth_manager.get_session(session_key=ctx.session_id)
```

The DSM session name becomes `MCPSynology_{instance_id}_{mcp_session_id}`, giving each connecting client its own isolated DSM session.
The DSM session name would become `MCPSynology_{instance_id}_{mcp_session_id}`, giving each connecting client its own isolated DSM session. The session-name helper (`AuthManager._build_session_name`) already accepts an optional `session_key` argument so the future change is bounded; the public `get_session()` signature, the session pool, and the lifecycle (login/logout/GC) per key are deferred until a transport or multi-tenant trigger lands.

This is a deliberate deferral, not an oversight. See [ADR 0001 — Per-Client DSM Sessions](../adr/0001-per-client-dsm-sessions.md) for the question, options considered, decision, and the trigger conditions that would re-open it.

### Setup / Bootstrap Flow

Expand Down
20 changes: 16 additions & 4 deletions src/mcp_synology/core/auth.py
Original file line number Diff line number Diff line change
Expand Up @@ -69,11 +69,23 @@ def add_on_reauth_callback(self, cb: Callable[[], None]) -> None:
"""
self._on_reauth_callbacks.append(cb)

def _build_session_name(self) -> str:
"""Build a unique DSM session name: MCPSynology_{instance_id}_{uuid}."""
def _build_session_name(self, session_key: str | None = None) -> str:
"""Build a DSM session name.

``session_key=None`` (current default) produces
``MCPSynology_{instance_id}_{uuid8}`` — one shared session per
``AuthManager`` instance, which is the only mode exercised today
under the stdio transport.

``session_key=<str>`` produces ``MCPSynology_{instance_id}_{session_key}``
for the planned per-client session model under Streamable HTTP. The
helper accepts a key now so a future per-client session pool can derive
names without changing this method's contract; no production code path
invokes it with a non-None key today. See ADR 0001.
"""
instance_id = self._config.instance_id or "default"
unique_id = uuid.uuid4().hex[:8]
return f"MCPSynology_{instance_id}_{unique_id}"
suffix = session_key if session_key is not None else uuid.uuid4().hex[:8]
return f"MCPSynology_{instance_id}_{suffix}"

def _resolve_credentials(self) -> tuple[str, str, str | None]:
"""Resolve credentials from the storage hierarchy.
Expand Down
Loading