feat: E2E tests in CI against real Paperless-ngx#104
Conversation
🦋 Changeset detectedLatest commit: 02d1278 The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughThis PR adds end-to-end testing for the MCP server: a docker-compose Paperless-ngx fixture, a Paperless API client (entity creation, upload, task polling, document-ready polling, minimal PDF), MCP HTTP client helpers, a deterministic Node test suite exercising many MCP tools, a GitHub Actions workflow that runs tests in CLI and Docker modes, a small fix to bulk-edit parameter defaults for tag methods, an npm script to run E2E tests, README docs, and a changeset for a minor release. Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)✅ Unit Test PR creation complete.
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 7
🧹 Nitpick comments (2)
e2e/paperless.ts (1)
95-100: ⚡ Quick winReplace
anyin task response parsing with a typed shape.
(res.data as any)drops type safety in a critical polling path. Define a response interface forresultsvs single task payload and parse against it directly.As per coding guidelines, "Never use
anytype unless absolutely necessary. Instead, use proper TypeScript types."🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@e2e/paperless.ts` around lines 95 - 100, The polling code in e2e/paperless.ts uses (res.data as any) which removes type safety; define a proper Task type (e.g., Task { status: string; related_document?: string; /* other fields */ }) and a wrapper response interface (e.g., TaskListResponse { results?: Task | Task[] }) then change the axios call to axios.get<TaskListResponse>(...) and replace (res.data as any).results with res.data.results so tasks and task are strongly typed (keep the existing variable names res, tasks, task).e2e/e2e.test.ts (1)
292-294: ⚡ Quick winAvoid
anyin tag normalization.Use a narrow union for
tagsitems (number | { id: number }) instead ofanyto keep type-checked assertions.As per coding guidelines, "Never use
anytype unless absolutely necessary. Instead, use proper TypeScript types."🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@e2e/e2e.test.ts` around lines 292 - 294, Change the loose any in the tag normalization to a narrow union and a type-guard: treat docData.tags items as number | { id: number } and update the map callback signature (the tagIds variable logic) to accept (t: number | { id: number }) and return typeof t === "number" ? t : t.id; if docData.tags is an untyped source, add a local typed alias (e.g., type TagItem = number | { id: number }) or cast docData.tags to TagItem[] before mapping so tagIds uses proper TypeScript types instead of any.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/e2e.yml:
- Around line 18-37: The workflow uses floating action tags and leaves checkout
credentials persisted; update the checkout step (actions/checkout) to set
persist-credentials: false and replace the floating tags for
actions/setup-node@v6, docker/setup-buildx-action@v3,
docker/build-push-action@v6, and actions/checkout@v4 with their respective full
commit SHA pins (keep the version tag as a trailing comment for readability) so
each action is pinned immutably while preventing GITHUB_TOKEN persistence across
steps.
In `@e2e/client.ts`:
- Around line 23-29: parseToolText currently ignores the ToolResult.isError flag
and may parse error payloads; update the parseToolText function to immediately
throw (with a clear message) if result.isError is true before attempting to find
or JSON.parse the text, referencing the ToolResult type and the result parameter
so callers get a fast-fail on MCP errors rather than producing false-positive
parsed output.
In `@e2e/e2e.test.ts`:
- Around line 300-309: After calling the bulk_edit_documents tool to remove the
tag, add a verification step that fetches the updated document by seedDocumentId
(using the existing document retrieval helper / client.callTool "get_document"
or the test's getDocument/getDocumentById helper) and assert that seedTag.id is
no longer present in the document's tags array; this completes the round-trip by
ensuring the remove_tags operation actually took effect after the
bulk_edit_documents call.
- Around line 224-233: The test calling client.callTool with name
"search_documents" currently only asserts Array.isArray(data.results) which
allows regressions; update the assertion to verify the seeded E2E fixture is
returned by checking data.count equals the expected count and that data.results
includes the expected document ID or metadata (e.g., assert data.results.some(r
=> r.id === EXPECTED_ID) or compare IDs list to an expected array), using the
parsed response from parseToolText(result) (variables: result, data) so the test
fails when the seeded document is missing or counts differ.
In `@e2e/paperless.ts`:
- Around line 38-110: Create a shared axios instance with a timeout (e.g.
10000ms) and use it in getToken, createTag, createCorrespondent,
createDocumentType, uploadDocument and waitForDocument instead of calling axios
directly; in uploadDocument merge form.getHeaders() with the instance headers
(including Authorization) so form boundary persists. Replace the unsafe
(res.data as any).results usage in waitForDocument by declaring a small typed
union/interface for the API response (e.g. PaginatedResponse<T> { results: T[] }
| T) and then check for "results" in res.data (const tasks = 'results' in
res.data ? res.data.results : res.data) to preserve type-safety without using
any. Ensure all axios calls pass through the new instance so the timeout
applies.
In `@README.md`:
- Around line 472-484: The README's prerequisites list is missing jq even though
the TOKEN extraction (TOKEN=$(curl ... | jq -r '.token')) depends on it; update
the Prerequisites section to include jq (e.g., list "jq" alongside Docker and
Docker Compose) and add a brief note that the TOKEN retrieval curl command uses
jq so users must install it beforehand.
- Around line 486-498: Update the cleanup instructions to stop relying on the
shell job specifier `kill %1`; when starting the MCP server in step 4 (the `node
build/index.js --http --port 3001 ... &` line) capture the background process
PID (shell variable $!) into a named variable and then use that captured PID in
step 6 to terminate the server, replacing `kill %1` with an explicit kill of the
stored PID and showing the variable name used so readers can follow the flow.
---
Nitpick comments:
In `@e2e/e2e.test.ts`:
- Around line 292-294: Change the loose any in the tag normalization to a narrow
union and a type-guard: treat docData.tags items as number | { id: number } and
update the map callback signature (the tagIds variable logic) to accept (t:
number | { id: number }) and return typeof t === "number" ? t : t.id; if
docData.tags is an untyped source, add a local typed alias (e.g., type TagItem =
number | { id: number }) or cast docData.tags to TagItem[] before mapping so
tagIds uses proper TypeScript types instead of any.
In `@e2e/paperless.ts`:
- Around line 95-100: The polling code in e2e/paperless.ts uses (res.data as
any) which removes type safety; define a proper Task type (e.g., Task { status:
string; related_document?: string; /* other fields */ }) and a wrapper response
interface (e.g., TaskListResponse { results?: Task | Task[] }) then change the
axios call to axios.get<TaskListResponse>(...) and replace (res.data as
any).results with res.data.results so tasks and task are strongly typed (keep
the existing variable names res, tasks, task).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ee2c821e-9934-42c4-a4a6-0c091d458ef1
📒 Files selected for processing (8)
.changeset/add-e2e-tests.md.github/workflows/e2e.ymlREADME.mddocker-compose.e2e.ymle2e/client.tse2e/e2e.test.tse2e/paperless.tspackage.json
Closes #101 - docker-compose.e2e.yml: Paperless-ngx 2.14 + Redis fixture stack - e2e/e2e.test.ts: deterministic tools/call tests (no LLM) covering list/create tags, correspondents, document types; list/get/search/ download/thumbnail documents; bulk_edit_documents; post_document - e2e/client.ts: SDK StreamableHTTPClientTransport factory - e2e/paperless.ts: direct Paperless API client for seeding test data - .github/workflows/e2e.yml: matrix job (cli + docker) on PR and main - README: Testing section with unit and E2E run instructions https://claude.ai/code/session_01LActHt6HMhygRppaphFRzQ
CI fixes:
- docker-compose.e2e.yml: change Paperless image tag 2.14 → latest
- e2e/e2e.test.ts: download_document asserted paperless:// URI but the
tool returns a plain filename; assert non-empty URI + blob instead
- .github/workflows/e2e.yml: simplify MCP_URL expression (remove
fragile format() with env.MCP_PORT fallback)
CodeRabbit feedback:
- e2e/client.ts: parseToolText now throws immediately on isError=true
- e2e/e2e.test.ts: search_documents asserts seeded doc is in results;
bulk_edit tag round-trip now verifies removal too; drop `any` in tag
normalization (use TagItem = {id,name})
- e2e/paperless.ts: shared axios instance with 15s timeout on all
calls; replace (res.data as any) with typed TaskResult interface
- README.md: add jq to prerequisites; use captured $MCP_PID instead
of kill %1
https://claude.ai/code/session_01LActHt6HMhygRppaphFRzQ
The Paperless /api/tasks/ endpoint returns TaskResult[] directly (not
{results:[...]}), so waitForDocument was always reading task.status as
undefined, causing a 90s polling timeout on every CI run.
Also update download_document and get_document_thumbnail assertions to
match the paperless:// URI scheme introduced in PR #86.
https://claude.ai/code/session_01LActHt6HMhygRppaphFRzQ
7430504 to
b07df38
Compare
Adding e2e/ TypeScript files caused tsc to use the project root as the common rootDir, shifting the build output from build/index.js to build/src/index.js. The Dockerfile ENTRYPOINT and CI workflows both reference build/index.js, so the Docker smoke test and e2e server startup were silently failing (module not found). Excluding e2e/ restores the original output structure. The e2e tests run via ts-node and do not need tsc compilation. https://claude.ai/code/session_01LActHt6HMhygRppaphFRzQ
The Paperless-ngx tasks API (/api/tasks/?task_id=UUID) uses
StandardPagination, returning {count, results, next, previous}.
The previous fix checked Array.isArray() which returned false for the
paginated object, causing the whole response to be treated as a single
TaskResult with undefined status — looping until the 90s timeout.
Now handles both formats: plain array (older Paperless versions) and
paginated object with a results array (current Paperless-ngx), making
waitForDocument work regardless of Paperless-ngx version.
https://claude.ai/code/session_01LActHt6HMhygRppaphFRzQ
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docker-compose.e2e.yml`:
- Line 7: Replace the floating image tag
"ghcr.io/paperless-ngx/paperless-ngx:latest" with a pinned version to ensure E2E
determinism—e.g., use "ghcr.io/paperless-ngx/paperless-ngx:2.14.7" or a later
2.14.x tag, or (preferred) pin to the image digest (sha256) for absolute
reproducibility; update the image value in the docker-compose service definition
where "ghcr.io/paperless-ngx/paperless-ngx:latest" appears so CI always pulls
the exact intended build.
In `@e2e/e2e.test.ts`:
- Around line 256-268: The test "returns a resource with image mime type" only
checks the thumbnail URI; add an assertion to verify the returned resource's
MIME type is an image. After locating the resource extraction (variable resource
and r = resource.resource) from the client.callTool("get_document_thumbnail")
result, add an assertion that r.mimeType exists and startsWith("image/") (e.g.,
assert.ok(r.mimeType?.startsWith("image/"), "thumbnail MIME type should be
image/*")) so content-type regressions fail the test.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: a6e33071-2ab1-4e36-b5b1-0fb1c8eecb71
📒 Files selected for processing (9)
.changeset/add-e2e-tests.md.github/workflows/e2e.ymlREADME.mddocker-compose.e2e.ymle2e/client.tse2e/e2e.test.tse2e/paperless.tspackage.jsontsconfig.json
✅ Files skipped from review due to trivial changes (2)
- README.md
- package.json
…in image - Pass original: true in download_document test so it fetches the original file instead of the archive version (which doesn't exist with OCR skipped) - Add mimeType image/* assertion to get_document_thumbnail test - Fix MINIMAL_PDF xref offsets (obj2: 56, obj3: 111, startxref: 180) - Pin paperless-ngx image to 2.14.7 for deterministic CI runs https://claude.ai/code/session_01LActHt6HMhygRppaphFRzQ
- Add 3s pause after document seeding to let Paperless finish all post-processing (search index update, thumbnail generation, etc.) - Capture node:test output to /tmp/e2e-output.txt so it appears in the "Print logs on failure" step, making failures diagnosable https://claude.ai/code/session_01LActHt6HMhygRppaphFRzQ
|
Good questions.
On the alternatives:
I'm happy to refactor toward the MCP-only approach if you prefer it — Generated by Claude Code |
Replace the blind 3-second post-seed pause with active polling via waitUntilSearchable(), which retries GET /api/documents/?query=... until the seeded document appears (up to 60s). This prevents the search_documents test from running before the Whoosh index is updated. Add console.log checkpoints throughout before() so the captured test output in /tmp/e2e-output.txt identifies exactly where failures occur.
Node.js test runner TAP output only shows 'failed running before hook' without the underlying exception. Wrap before() in try/catch and console.error the full stack trace so the real failure is visible in the CI 'Print logs on failure' step output.
…search test
waitUntilSearchable used Paperless full-text search (?query=...) to block
before() until the document appeared in the Whoosh index. If the Whoosh
index update is deferred or fails silently, this caused a 60-second timeout
and a hookFailed error for all 13 tests.
Replace with waitForDocumentReady that polls GET /api/documents/{id}/ (DB,
not Whoosh) and a short fixed delay for post-processing. The search_documents
test now retries up to 6 times with 3-second waits to handle index lag.
…ify_tags Paperless-ngx modify_tags requires both add_tags and remove_tags to be present in the request body. The arrayNotEmpty transform converts [] to undefined, causing fields to be omitted from JSON.stringify. Add includeTagDefaults parameter to buildBulkEditParameters so modify_tags always sends both fields even when the caller passes empty arrays. https://claude.ai/code/session_01LActHt6HMhygRppaphFRzQ
…REST client Per maintainer feedback on PR #104 (review_thread on e2e/paperless.ts): boot an empty Paperless instance and drive the whole flow as a serial MCP scenario — create_tag, create_correspondent, create_document_type, post_document, then list/get/search/download/thumbnail/bulk-edit on the same document. No separate seeding REST client. - Delete e2e/paperless.ts (PaperlessClient + MINIMAL_PDF). - Inline MINIMAL_PDF and consolidate state in e2e.test.ts. - post_document polls list_documents to resolve the async task UUID to a document id, replacing the previous task-status polling. - Tighten assertion messages to include tool error text on failure.
…ons services Aligns the workflow with the commitment in #101 to use GHA service containers natively. The redis + paperless services run with built-in health checks; the job blocks until both report healthy, removing the explicit "Start" + "Wait for Paperless" + "docker compose down" steps. docker-compose.e2e.yml stays in the repo as a convenience for local dev (running the same fixture on a developer machine without copying YAML out of the workflow).
…e job Per review on #104: the per-mode 'if' gates were wider than the shared commands. Collapse to a single job that defines services once, builds once, then runs the test suite twice — once against the CLI MCP, once against the Docker MCP. The Docker steps use 'if: !cancelled()' so a CLI-mode failure still surfaces the Docker-mode result for diagnosis. Each test invocation is a separate node process and uses Date.now() in fixture names, so the two runs do not collide on Paperless state.
The Docker test invocation always failed at post_document with a 60s ingestion poll timeout because Paperless's duplicate detection (by file checksum) silently rejected the second upload in the same job. Both runs shared the exact same MINIMAL_PDF buffer. Append the per-run document title as a trailing PDF comment so the file checksum is unique per run. Trailing data after %%EOF is ignored by PDF parsers but changes the checksum, sidestepping the duplicate filter.
|
Note Unit test generation is a public access feature. Expect some limitations and changes as we gather feedback and continue to improve it. Generating unit tests... This may take up to 20 minutes. |
Switch the Paperless service image from a pinned 2.14.7 tag to :latest so E2E coverage tracks upstream stable releases, and add a weekly schedule (Mondays 08:00 UTC) so drift between this project and the product surfaces without waiting for a PR.
|
✅ Created PR with unit tests: #105 |
When I merged main into the fork branch I took --ours for documents.ts to preserve the resource-reference tool output, but that also dropped main's modify_tags defensive default (introduced in #104 alongside the e2e tests). Without it, the modify_tags Zod schema's arrayNotEmpty transform turns remove_tags: [] into undefined and the payload Paperless receives is missing the field, returning 400 'remove_tags not specified'. Re-applies the exact lines from main: - buildBulkEditParameters gains an includeTagDefaults parameter that initialises add_tags/remove_tags to [] via ??= - The bulk_edit_documents caller passes method === 'modify_tags' for it Unblocks the two failing e2e tests on this PR.
…102) (#109) ## Summary Closes #90. Continuation of #102 (originally by @pipamann), updated for the current `main` and with the three review items addressed. I'm picking this up from the fork branch per @baruchiro's request on #102; the original commits and authorship are preserved. ## What's in this PR The MCP resources feature for documents: - `paperless://documents/{id}/download` — file content (supports `?original=true`) - `paperless://documents/{id}/thumb` — thumbnail - `resources/list` exposes both resources for documents on the first page (capped intentionally — large libraries would otherwise produce unbounded payloads; clients enumerate via `list_documents` and read each resource directly). - `resources/read` lazy-fetches the bytes from Paperless. **Breaking change**: `download_document` and `get_document_thumbnail` no longer embed the file/image as a base64 blob in the tool result. They now return only a resource reference (URI + mime type); clients fetch the bytes via `resources/read`. This is the whole point of #90 — keeps large binaries out of conversation context for clients (n8n LangChain etc.) that accumulate tool results. Hence the changeset is **major**. ## Review items from #102 addressed 1. **changeset bump to major** ([#102 review comment](#102 (comment))) — done; updated description to call out the breaking change. 2. **e2e assertion: tools no longer return inline content** ([#102 review comment](#102 (comment))) — `download_document` and `get_document_thumbnail` e2e tests now assert `r.blob === undefined` and `r.text` is empty/undefined. 3. **e2e coverage for `resources/read` and `resources/list`** ([#102 review comment](#102 (comment))) — added e2e tests for: - `resources/read` on a download URI (`?original=true`) — fetches non-empty file bytes - `resources/read` on a thumb URI — fetches non-empty image bytes with `image/*` mime - `resources/list` — uploaded document's download + thumb URIs appear in the listing Out of scope (per @baruchiro's [request to keep this focused](#102 (comment))): the SSE `/messages` auth gap and the Paperless v3+ 406 `Accept` header issue flagged in #102's description are **not** addressed here — both deserve dedicated issues. ## Conflict resolution against `main` `main` advanced since #102 was opened (#86 landed, then #104 added the e2e suite). Three textual + one semantic conflict resolved: - `src/tools/utils/resourceUri.ts`, `resourceUri.test.ts` — kept the canonical filename-free URI design. Main's #86 shape (filename in query string) is superseded by this PR's design per #90. - `src/tools/documents.ts` — kept the resource-reference tool output (no inline blob). - `e2e/e2e.test.ts` — the e2e `download_document` test from #104 asserted on the inline `blob`. Rewrote it to match the new design (assert no inline content, then call `resources/read` for the bytes). - `package-lock.json` — synced to 0.5.1 (`npm install` corrected it after the merge). Also dropped a stale `.changeset/fix-resource-uri-validation.md` that described #86's work — already consumed by Version Packages #103. ## Verification - [x] `npm run build` clean (after rebuild on the new tsconfig) - [x] `npm test` — 15/15 pass - [x] `node --test src/tools/utils/resourceUri.test.ts` — 9/9 pass directly (the `npm test` globstar issue is pre-existing and not in scope) - [x] e2e file standalone typecheck clean (e2e is excluded from `tsconfig`) - [ ] e2e suite against a real Paperless-ngx instance — not run locally; relies on the CI workflow added in #104 ## Closes / supersedes - Closes #90 - Supersedes #102 (please close in favor of this; PR #102 is on a fork branch I can't push to from the sandbox) --- _Generated by [Claude Code](https://claude.ai/code/session_019qRbPR4y4Xfxwm9co6tAyd)_ <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **New Features** * Added MCP resources for Paperless documents: download and thumbnail URIs (paperless://...) exposed via resources/list and retrievable via resources/read; resource metadata includes a human-readable filename. * **Breaking Changes** * Download and thumbnail tools now return resource references (URIs and mime type) instead of embedded/base64 content. Clients must call resources/read to fetch bytes. Download URIs support ?original=true. * **Documentation** * Updated MCP docs to describe new URIs and client behavior. * **Tests** * Added/updated E2E and integration tests enforcing URI-only responses and lazy reads. <!-- review_stack_entry_start --> [](https://app.coderabbit.ai/change-stack/baruchiro/paperless-mcp/pull/109?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack) <!-- review_stack_entry_end --> <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Leon Heuser <leon@dror.de> Co-authored-by: Claude <noreply@anthropic.com>
Closes #101
Summary
docker-compose.e2e.yml— Paperless-ngx 2.14 + Redis fixture stack with auto-created admin usere2e/e2e.test.ts— Deterministictools/calltests (no LLM) using Node's nativenode:testcovering all tools from the issue:list_tags,create_tag,list_correspondents,create_correspondent,list_document_types,create_document_type,list_documents,get_document,search_documents,download_document(regression for download_document and get_document_thumbnail return 'malformed response' (resource URI missing scheme) #87),get_document_thumbnail,bulk_edit_documents(regression for fix: strip MCP-only params from bulk_edit delete request to Paperless #100, Fix bulk custom field parameters #89),post_documente2e/client.ts— SDKStreamableHTTPClientTransportfactorye2e/paperless.ts— Direct Paperless REST client for seeding test data inbefore()hooks.github/workflows/e2e.yml— Matrix job (CLI + Docker) triggered on every PR and push tomainREADME.md— Testing section with unit and E2E run instructionspackage.json—npm run test:e2escriptTest plan
list_tagsfinds the seeded tagdownload_documentreturns apaperless://URI (regression download_document and get_document_thumbnail return 'malformed response' (resource URI missing scheme) #87)bulk_edit_documentsmodify_tags round-trip works (regression fix: strip MCP-only params from bulk_edit delete request to Paperless #100)https://claude.ai/code/session_01LActHt6HMhygRppaphFRzQ
Generated by Claude Code
Summary by CodeRabbit
Tests
Documentation
Chores
Bug Fixes