Skip to content

fix(vdsm): refresh DSM search index after creating test data#67

Merged
cmeans-claude-dev[bot] merged 2 commits into
mainfrom
fix/vdsm-search-flake-via-synoindex
Apr 27, 2026
Merged

fix(vdsm): refresh DSM search index after creating test data#67
cmeans-claude-dev[bot] merged 2 commits into
mainfrom
fix/vdsm-search-flake-via-synoindex

Conversation

@cmeans-claude-dev
Copy link
Copy Markdown
Contributor

@cmeans-claude-dev cmeans-claude-dev Bot commented Apr 27, 2026

Status: round 2 — fixes QA's finding A (root-cause fix, not partial)

QA round 1 verdict: round-1 head shipped a partial fix (synoindex at golden-image-build time improved the test from 6-attempt fail to 4-attempt success), but didn't address the actual root cause — the test creates Bambu Studio at runtime, after the golden image is built, so build-time synoindex can't index a path that doesn't yet exist.

Round 1 → Round 2 deltas

QA's option A (test-fixture-time synoindex) implemented:

  1. New shared SSH helper at tests/vdsm/ssh.py — lifted from setup_dsm.py's private _ssh / _SSH_OPTS so both image-build-time and test-fixture-time SSH go through the same plumbing. setup_dsm.py now imports ssh_exec from the shared module.
  2. New refresh_search_index async fixture in tests/test_integration.py — no-op by default for real-NAS runs (NAS has scheduled indexing). The test calls await refresh_search_index(search_dir) after creating the directory.
  3. vdsm override in tests/vdsm/conftest.py — invokes /usr/syno/bin/synoindex -A -d <path> via SSH on the vdsm container, translating the share-relative path the test uses (e.g. /testshare/Documents/Bambu Studio) into the on-volume path synoindex needs (e.g. /volume1/testshare/Documents/Bambu Studio). Best-effort — non-zero rc is logged but not raised; the test's retry loop is the safety net.

The build-time synoindex calls in setup_dsm.py from round 1 are kept — they still help with the static test data (Documents/, Media/) which exists at image build time.

Why this should hit the success criterion

The failure mode in round 1 was: even with build-time indexing of the Documents/ parent, DSM's incremental indexer takes ~20-40s to register newly-created subdirectories. Round 2 calls synoindex -A -d <new-path> synchronously after the test creates the subdirectory, registering it before the search call. Search should succeed on attempt 1.

If something unexpected happens, the test's existing 6-attempt retry loop is unchanged and falls back to the round-1 behavior (no regression vs current).

Verification gate

I can't run synoindex locally — the live verification is "next vdsm CI run on this PR shows test_search_keyword_finds_directory passing on attempt 1, not attempt 4." If round 2 also doesn't hit attempt 1, we iterate.

Test plan

  • python -c "import ast; ast.parse(open('tests/vdsm/ssh.py').read()); ast.parse(open('tests/vdsm/conftest.py').read()); ast.parse(open('tests/test_integration.py').read())" — all parse
  • uv run pytest — 518 passed (unchanged from round 1), 96.15% coverage. The new refresh_search_index no-op fixture exists in test_integration.py but @pytest.mark.integration tests are deselected on default runs, so no behavioral change to the unit suite.
  • uv run ruff check src/ tests/ scripts/ — clean
  • uv run ruff format --check src/ tests/ scripts/ — clean
  • uv run mypy src/ scripts/ — clean
  • CHANGELOG entry expanded to describe the two-part fix
  • Live verification (post-this-push): vdsm CI run produces a green test_search_keyword_finds_directory on attempt 1.

Files

File Status Purpose
tests/vdsm/ssh.py new shared ssh_exec helper
tests/vdsm/setup_dsm.py modified imports from shared ssh.py; build-time synoindex unchanged
tests/vdsm/conftest.py modified new refresh_search_index fixture (vdsm override)
tests/test_integration.py modified new refresh_search_index no-op fixture; test calls it after create_folder
CHANGELOG.md modified entry expanded to describe two-part fix

@cmeans-claude-dev cmeans-claude-dev Bot added the Ready for QA Dev work complete — QA can begin review label Apr 27, 2026
@github-actions github-actions Bot added Awaiting CI Dev complete, waiting for CI/Codecov to pass before QA and removed Ready for QA Dev work complete — QA can begin review labels Apr 27, 2026
cmeans-claude-dev Bot added a commit that referenced this pull request Apr 27, 2026
@github-actions github-actions Bot added Ready for QA Dev work complete — QA can begin review and removed Awaiting CI Dev complete, waiting for CI/Codecov to pass before QA labels Apr 27, 2026
@codecov-commenter
Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

Fixes the tests/test_integration.py::TestSearch::test_search_keyword_finds_directory
flake on vdsm CI. The test was reliably failing on PR #66's CI run
after 6 retries (65s budget) because DSM Universal Search hadn't
crawled /testshare/Documents/Bambu Studio yet — DSM doesn't index
non-indexed shares promptly on a freshly-booted vdsm.

Adds two synoindex calls in tests/vdsm/setup_dsm.py after the
SSH-driven test data creation:
  /usr/syno/bin/synoindex -A -d /volume1/testshare/Documents
  /usr/syno/bin/synoindex -A -d /volume1/testshare/Media

`synoindex -A -d <path>` registers a directory subtree with DSM's
search index immediately, rather than waiting for the periodic
indexer to scan. With the test data indexed at golden-image build
time, search calls from the test will reliably find the target
without retries.

Best-effort: a non-zero return from synoindex logs a warning and
continues setup. If a hypothetical DSM build doesn't have the
binary at /usr/syno/bin/synoindex, image build still completes
and we just fall back to the pre-existing flake (no regression).

Modifying setup_dsm.py invalidates the vdsm-workflow's golden-image
cache key (keyed on hash of the setup scripts), so the next CI run
rebuilds the image with the fix baked in. After that, every cache
hit gets the indexed state for free.

Verification gate (per playbook): can't run synoindex locally
without a vdsm container; the live verification is "watch the
next vdsm CI run on this PR or a subsequent one and confirm
test_search_keyword_finds_directory passes on the first attempt."
If synoindex doesn't behave as expected we'll see another flake
and iterate — falling back to the pre-existing flake is a
no-regression worst case.
@cmeans-claude-dev cmeans-claude-dev Bot force-pushed the fix/vdsm-search-flake-via-synoindex branch from 2d989d1 to 596cf63 Compare April 27, 2026 01:25
@github-actions github-actions Bot added Awaiting CI Dev complete, waiting for CI/Codecov to pass before QA Ready for QA Dev work complete — QA can begin review and removed Ready for QA Dev work complete — QA can begin review Awaiting CI Dev complete, waiting for CI/Codecov to pass before QA labels Apr 27, 2026
cmeans
cmeans previously approved these changes Apr 27, 2026
Copy link
Copy Markdown
Owner

@cmeans cmeans left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@cmeans cmeans added the QA Active QA is actively reviewing; Dev should not push changes label Apr 27, 2026
@github-actions github-actions Bot removed the Ready for QA Dev work complete — QA can begin review label Apr 27, 2026
Copy link
Copy Markdown
Owner

@cmeans cmeans left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QA Round 1 — PR #67

CI exercise audit

The vdsm CI on this PR (run 24972241383) is the canonical pre-merge verification of the fix. Investigated whether the cached golden image actually contained the synoindex change:

Question Answer
Did the cache key change vs. main? Yes. gh cache list shows two vdsm-golden-7.2.2-* entries: the main-branch cache (81ea58e5…, 2026-04-14) and a new one for this PR (3219ff6b…, 2026-04-27 01:14:19)
Was the cache populated by a build that included the synoindex fix? Yes. Earlier vdsm runs on this branch (head 2d989d1a, ~01:01) had setup_dsm.py already containing the synoindex calls — the cache built then was reused for the run 3 hit at 01:25
Did the test pass? Yes, on attempt 4 of 6 (at ~01:29:19, ~44s after first attempt at 01:28:40)

The success-criterion gap

Dev's stated success criterion: "test_search_keyword_finds_directory passes on the first attempt without retries".

Actual result: passes on attempt 4 (after 3 retries with cumulative ~37s wait). That's an improvement vs. PR #66 (failed after 6 attempts), but doesn't meet "first attempt".

Findings

1. [substantive] The fix is incomplete: synoindex at golden-image-build time can't pre-index a directory that doesn't yet exist.

The test (tests/test_integration.py:328-...test_search_keyword_finds_directory) creates ${folder}/${keyword} Studio (i.e. /testshare/Documents/Bambu Studio) dynamically at test run time:

# Ensure a directory with the keyword in its name exists.
# DSM search matches file/directory names, not content.
# On real NAS the search_folder likely already has matching content;
# on vdsm we create it here.
search_dir = f"{folder}/{keyword} Studio"

The synoindex calls in tests/vdsm/setup_dsm.py execute during the golden-image build. They can index /testshare/Documents and /testshare/Media because those exist at build time — but the Bambu Studio subdirectory doesn't exist until the test creates it. Whatever priming effect synoindex has on the parent does not eliminate the indexer's async latency for newly-added subdirectories, which is why retries are still needed.

The fix does provide some benefit — likely warming the indexer state for the parent path means it picks up new children faster than from cold. But it doesn't address the actual bug (DSM indexer latency for runtime-created directories).

Two ways forward — pick whichever is preferable:

  • A. Move synoindex to test-fixture time (preferred). After the test creates search_dir, the test (or a fixture wrapping it) calls synoindex -A -d ${search_dir} over SSH from the test runner. That synchronously registers the new subdirectory and should make attempt 1 succeed. Requires SSH plumbing in the test fixture, but the existing setup_dsm.py already has SSH helpers that can be lifted.
  • B. Accept partial fix; restate the success criterion. Update the PR description and CHANGELOG to claim "reduces flake from 6+-attempt failures to 4-attempt successes" rather than "passes on first attempt", and consider raising the test's max retry count from 6 to (say) 8 for safety margin. Less satisfying but ships the improvement now.

A is the root-cause fix. B is the pragmatic fix.

Verification on 596cf63

Check Result
python ast.parse(setup_dsm.py) parses
uv run pytest 518 passed (unchanged from #66), 96.15% coverage
uv run ruff check src/ tests/ scripts/ clean
uv run ruff format --check src/ tests/ scripts/ 69 files clean
Required CI rollup all green (lint, typecheck, test 3.11/3.12/3.13, version-sync, validate-server-json)
vdsm CI overall passed (47/47), but test_search_keyword_finds_directory needed 4 retries
Cache hashFiles invalidation confirmed working — new key 3219ff6b… exists for this PR
No-regression worst case confirmed — best-effort warning on synoindex failure

Verdict

QA Failed pending finding 1 — either iterate to A (move synoindex to test-fixture time) or formally accept the partial-fix scope per B.

The change as-is is a real but partial improvement. Worst-case behavior is no regression. Maintainer's call on whether to iterate or ship as-is with updated framing.

@cmeans
Copy link
Copy Markdown
Owner

cmeans commented Apr 27, 2026

QA audit — round 1

Branch / SHA: `fix/vdsm-search-flake-via-synoindex` @ `596cf63`

Cache invalidation correctness:
```
gh cache list --repo cmeans/mcp-synology --key vdsm-golden-7.2.2

3219ff6b... 2026-04-27T01:14:19 refs/pull/67/merge (NEW — populated during this PR)

81ea58e5... 2026-04-14T02:40:47 refs/heads/main (pre-existing main cache)

```

The hashFiles formula correctly invalidated the cache key when `setup_dsm.py` changed. This PR's branch built a fresh golden image at ~01:14 (run `2d989d1a`) and the current head `596cf63` hit that cache at 01:25.

Test result on the synoindex-baked image:
```
01:28:35 — test_search_keyword_finds_directory STARTED
01:28:40 — search_files attempt 1/6 → 0 results
01:28:52 — search_files attempt 2/6 → 0 results
01:29:04 — search_files attempt 3/6 → 0 results
01:29:19 — search_files attempt 4/6 → results found, test passes
01:29:20 — next test starts
```

Test passed on attempt 4, ~44s after first attempt. Better than PR #66 (failed after 6 attempts) but not "passes on first attempt" per Dev's success criterion.

Root-cause analysis: the test (`test_integration.py:328`) creates `/testshare/Documents/Bambu Studio` at runtime via the API. The synoindex calls in `setup_dsm.py` execute at golden-image build time, before this directory exists, so they can't pre-index it. They prime indexer state for the parent (`/testshare/Documents`), which probably explains the improvement vs. PR #66, but the fundamental bug — DSM indexer latency for runtime-created directories — isn't addressed.

Outcome: QA Failed — see review comment for finding 1 and the two paths forward (move synoindex to test-fixture time, or formally accept partial fix). `uv run pytest` 518/518 pass; required CI all green.

@cmeans cmeans added QA Failed QA found issues — needs dev attention and removed QA Active QA is actively reviewing; Dev should not push changes labels Apr 27, 2026
QA round-1 finding A on PR #67. The setup_dsm.py synoindex call at
golden-image build time helps warm DSM's indexer for static test
data, but the test creates `${folder}/${keyword} Studio` (a
runtime subdirectory) — synoindex at build time can't index a
path that doesn't yet exist. Result on round 1: test improved
from 6-attempt fail to 4-attempt success. Still flaky.

Root-cause fix: register the runtime-created path with DSM's
search index synchronously, right after creation, before the
test searches.

Three changes:

1. New shared SSH helper at tests/vdsm/ssh.py — lifted from
   setup_dsm.py's _ssh / _SSH_OPTS so both image-build-time
   and test-fixture-time SSH share the same plumbing. setup_dsm.py
   now imports ssh_exec from the shared module.

2. New `refresh_search_index` async fixture in tests/test_integration.py
   — no-op by default. Real-NAS runs get the no-op (NAS has
   scheduled indexing); the test calls `await refresh_search_index(
   search_dir)` after creating the directory.

3. Override in tests/vdsm/conftest.py that calls
   `/usr/syno/bin/synoindex -A -d <path>` via SSH on the vdsm
   container. Translates the share-relative path the test uses
   (e.g. /testshare/Documents/Bambu Studio) into the on-volume
   path synoindex needs (e.g. /volume1/testshare/Documents/Bambu
   Studio). Best-effort — non-zero rc is logged but not raised;
   the test's existing retry loop is the safety net.

518 unit tests still pass. Live verification of the vdsm flake
fix has to come from the next CI run on this PR — rebased + cache
invalidation flows through cleanly via setup_dsm.py's already-changed
hash.
@github-actions github-actions Bot added Awaiting CI Dev complete, waiting for CI/Codecov to pass before QA Ready for QA Dev work complete — QA can begin review and removed QA Failed QA found issues — needs dev attention Awaiting CI Dev complete, waiting for CI/Codecov to pass before QA labels Apr 27, 2026
@cmeans cmeans added the QA Active QA is actively reviewing; Dev should not push changes label Apr 27, 2026
@github-actions github-actions Bot removed the Ready for QA Dev work complete — QA can begin review label Apr 27, 2026
Copy link
Copy Markdown
Owner

@cmeans cmeans left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QA Round 2 — PR #67

Round-1 finding — verified closed on ff4281d

Finding Status Evidence
F1 (incomplete fix — synoindex at build time can't index runtime-created dirs) ✅ closed Dev took option A. New refresh_search_index async fixture (no-op default in test_integration.py, vdsm override in tests/vdsm/conftest.py) calls synoindex -A -d <path> via SSH after the test creates the directory. Build-time synoindex calls retained for the static test data. SSH plumbing lifted to shared tests/vdsm/ssh.py.

Live verification on the round-2 vdsm CI run (run 25003587048)

The cache miss path was exercised — golden image rebuilt with the round-2 fixture wiring. From the log:

15:31:07.45 — test_search_keyword_finds_directory STARTED
15:31:07.68 — Created search target directory: /testshare/Documents/Bambu Studio
15:31:07.84 — synoindex registered /volume1/testshare/Documents/Bambu Studio with DSM search index
15:31:10.92 — search_files attempt 1/6: 1 results found (Bambu Studio)
15:31:10.97 — PASSED

Test passed on attempt 1, ~3.5s after directory creation. Success criterion met.

Code-review highlights

  • tests/vdsm/ssh.py — clean extraction, properly documented, password sanitization via shlex.quote, sudo lecture filtering. Reusable for both image-build and test-fixture contexts.
  • vdsm fixture — translates share-relative path (/testshare/...) to on-volume path (/volume1/testshare/...) which is what synoindex needs. Wraps the sync ssh_exec via asyncio.to_thread for the async test. Best-effort: non-zero rc logs a warning, doesn't raise — test's existing retry loop is the safety net.
  • Default fixture — no-op for real-NAS runs (those NAS units have scheduled indexing; no need to force-register). No-regression for the real-NAS integration test path.
  • Test signature — adds refresh_search_index: Any parameter; calls await refresh_search_index(search_dir) after creation. Single call site, minimal blast radius.

Verification on ff4281d

Check Result
uv run pytest 518 passed, 94 deselected, 96.15% coverage (unchanged — fixture is integration-only, deselected on default runs)
uv run ruff check src/ tests/ scripts/ clean
uv run ruff format --check src/ tests/ scripts/ 69 files clean
uv run mypy src/ scripts/ clean (29 source files)
Python parses (ssh.py, conftest.py, setup_dsm.py, test_integration.py) all parse
Required CI rollup all green (lint, typecheck, test 3.11/3.12/3.13, version-sync, validate-server-json)
vdsm CI on round 2 all green — 47/47 vdsm tests pass; test_search_keyword_finds_directory passes on attempt 1 (live evidence above)
Cache hashFiles invalidation confirmed working — round-2 CI was a cache miss, rebuilt the image with new fixtures baked in; new cache saved as c508ffec…

CHANGELOG entry

Expanded to describe both parts of the two-part fix accurately. Specifically calls out the round-1 → round-2 progression (6-attempt fail → 4-attempt success → attempt-1 success) and notes that the build-time synoindex calls are retained alongside the test-fixture-time addition.

Verdict

Ready for QA Signoff. F1 closed by Dev's option A (root-cause fix). Live CI proved the success criterion on the first run. No findings. QA Approved remains the maintainer's call.

@cmeans
Copy link
Copy Markdown
Owner

cmeans commented Apr 27, 2026

QA audit — round 2

Branch / SHA: `fix/vdsm-search-flake-via-synoindex` @ `ff4281d`

Live vdsm CI evidence (run 25003587048, cache miss → fresh build):

```
15:31:07.45 test_search_keyword_finds_directory STARTED
15:31:07.68 Created search target directory: /testshare/Documents/Bambu Studio
15:31:07.84 synoindex registered /volume1/testshare/Documents/Bambu Studio with DSM search index
15:31:10.92 search_files attempt 1/6: 1 results found (Bambu Studio)
15:31:10.97 PASSED
```

Test passed on attempt 1, ~3.5s after directory creation — success criterion met. 47/47 vdsm tests passed overall.

Local stack:
```
uv run pytest # 518 passed (unchanged), 96.15%
uv run ruff check src/ tests/ scripts/ # clean
uv run mypy src/ scripts/ # clean (29 source files)
```

Outcome: Ready for QA Signoff. F1 closed by option A (test-fixture-time synoindex). `QA Approved` remains the maintainer's call.

@cmeans cmeans added Ready for QA Signoff QA passed — ready for maintainer final review and merge and removed QA Active QA is actively reviewing; Dev should not push changes labels Apr 27, 2026
Copy link
Copy Markdown
Owner

@cmeans cmeans left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@cmeans cmeans added QA Approved Manual QA testing completed and passed and removed Ready for QA Signoff QA passed — ready for maintainer final review and merge labels Apr 27, 2026
@cmeans-claude-dev cmeans-claude-dev Bot merged commit 0433139 into main Apr 27, 2026
34 checks passed
@cmeans-claude-dev cmeans-claude-dev Bot deleted the fix/vdsm-search-flake-via-synoindex branch April 27, 2026 15:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

QA Approved Manual QA testing completed and passed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants