feat: add get_tag tool and storage_paths CRUD#108

Open
neocult-de wants to merge 4 commits into
baruchiro:mainfrom
neocult-de:feat/get-tag-and-storage-paths

Conversation

@neocult-de
@neocult-de neocult-de commented May 25, 2026

Closes #107.

Adds the two missing tool families surfaced in #107.

Changes

get_tag (commit 1)

Brings tags in line with correspondents, document_types and custom_fields, all of which already expose both list_* and get_* variants. Without get_tag, agents had to fall back to list_tags(name__iexact="…"), and the asymmetry was a foot-gun: agents try get_tag first and fail.
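
To make the asymmetry concrete, here is a minimal sketch contrasting the name-filter fallback with a detail getter. The in-memory `tags` array and the functions `listTagsByName` and `getTagById` are hypothetical stand-ins invented for this illustration; they are not the MCP tools or the Paperless-NGX client.

```typescript
interface Tag {
  id: number;
  name: string;
  slug: string;
}

// Hypothetical in-memory stand-in for the tag endpoints (assumption).
const tags: Tag[] = [
  { id: 1, name: "Invoice", slug: "invoice" },
  { id: 2, name: "Receipt", slug: "receipt" },
];

// Fallback: filter the list by case-insensitive name and unwrap a result page.
function listTagsByName(nameIexact: string): { count: number; results: Tag[] } {
  const wanted = nameIexact.toLowerCase();
  const results = tags.filter((t) => t.name.toLowerCase() === wanted);
  return { count: results.length, results };
}

// Detail getter: one ID in, one tag out, or an error when the ID is unknown.
function getTagById(id: number): Tag {
  const tag = tags.find((t) => t.id === id);
  if (!tag) {
    throw new Error(`tag id=${id} not found`);
  }
  return tag;
}
```

The fallback needs the caller to already know the name and to unwrap `results`, while the getter works directly from an ID that other tools return.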

storage_paths CRUD tool family (commit 2)

Storage paths were not exposed at all — they appeared only as a foreign-key parameter (storage_path) on list_documents, post_document, update_document and bulk_edit_documents, so agents had no way to discover or manage them.

The new tools mirror the correspondents/document_types pattern:

  • list_storage_paths
  • get_storage_path
  • create_storage_path (with required Django-template path field)
  • update_storage_path
  • delete_storage_path (confirm: true discipline, same as delete_correspondent)
  • bulk_edit_storage_paths

This also fixes a dangling reference: the existing list_documents tool description already pointed agents at a list_storage_paths tool that did not exist.
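
As a rough sketch of what a create call has to carry, the snippet below builds a request body with the required `path` template. The `CreateStoragePathArgs` type and `buildCreateStoragePathBody` function are hypothetical names for this example, and the optional `match` and `matching_algorithm` field names are assumptions about the API shape; the real types live in src/api/types.ts.

```typescript
// Hypothetical argument shape for illustration only (assumption).
interface CreateStoragePathArgs {
  name: string;
  // Template string, e.g. "{{ correspondent }}/{{ created_year }}/{{ title }}"
  path: string;
  match?: string;
  matching_algorithm?: number;
}

// Builds a request body, rejecting blank required fields and
// leaving out optional fields the caller did not set.
function buildCreateStoragePathBody(
  args: CreateStoragePathArgs
): Record<string, unknown> {
  if (args.name.trim() === "") {
    throw new Error("name is required");
  }
  if (args.path.trim() === "") {
    throw new Error("path template is required");
  }
  const body: Record<string, unknown> = { name: args.name, path: args.path };
  if (args.match !== undefined) {
    body.match = args.match;
  }
  if (args.matching_algorithm !== undefined) {
    body.matching_algorithm = args.matching_algorithm;
  }
  return body;
}
```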

Tests

All new functionality is covered by E2E tests in e2e/e2e.test.ts, following the existing pattern (real Paperless-NGX via docker-compose.e2e.yml, MCP server spawned via HTTP transport, state threaded through the describe block).

New it() blocks:

  • get_tag returns the tag by ID with full detail fields
  • create_storage_path creates a storage path and returns it with an id
  • get_storage_path returns the storage path by ID with full detail fields
  • list_storage_paths returns the storage path created earlier in this run
  • update_storage_path renames the storage path and the change is visible via get
  • bulk_edit_documents set_storage_path assigns the storage path and get_document reflects it (integration with documents)
  • delete_storage_path requires confirm=true and then removes the storage path (asserts `confirm: false` returns `isError`, confirm-true succeeds, and a subsequent `get_storage_path` round-trips a 404)
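
The confirm-flag behaviour asserted by the last test can be sketched as follows. This is an illustrative model only: the `ToolResult` shape, the in-memory `storagePaths` map, and both functions are assumptions made for the example, not the code in src/tools/storagePaths.ts.

```typescript
interface ToolResult {
  isError: boolean;
  text: string;
}

// Hypothetical in-memory store standing in for Paperless-NGX (assumption).
const storagePaths = new Map<number, string>([[7, "Archive"]]);

// Refuses to delete unless the caller explicitly passes confirm: true.
function deleteStoragePath(args: { id: number; confirm: boolean }): ToolResult {
  if (!args.confirm) {
    return { isError: true, text: "Confirmation required: pass confirm: true to delete." };
  }
  if (!storagePaths.delete(args.id)) {
    return { isError: true, text: `storage path id=${args.id} not found (404)` };
  }
  return { isError: false, text: `deleted storage path id=${args.id}` };
}

// Returns an error result for unknown IDs, mirroring a 404 round-trip.
function getStoragePath(id: number): ToolResult {
  const name = storagePaths.get(id);
  return name === undefined
    ? { isError: true, text: `storage path id=${id} not found (404)` }
    : { isError: false, text: JSON.stringify({ id, name }) };
}
```

Calling `deleteStoragePath({ id: 7, confirm: false })` leaves the entry in place; only the confirmed call removes it, after which `getStoragePath(7)` reports an error.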

Local run against `docker compose -f docker-compose.e2e.yml up` (paperless-ngx 2.14.7): 21/21 tests pass (13 existing + 8 new).

```
tests 21
pass 21
fail 0
```

Diff stat

```
e2e/e2e.test.ts | 196 ++++++++++++++++++
src/api/PaperlessAPI.ts | 43 ++++
src/api/types.ts | 17 ++
src/server.ts | 2 +
src/tools/storagePaths.ts | 203 ++++++++++++++++++++
src/tools/tags.ts | 14 ++
6 files changed, 475 insertions(+)
```

Split into two commits so each feature can be reviewed (or reverted) independently.

Summary by CodeRabbit

  • New Features

    • Manage storage paths: create, view (detailed), list, update, and delete (destructive actions require confirmation)
    • Bulk edit: assign storage paths to documents
  • Improvements

    • Enhanced tag details when viewing a tag (includes slug and expanded matching algorithm)
  • Tests

    • End-to-end tests covering storage-path workflows, tag retrieval, document integration, and deletion behavior verification
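
For readers unfamiliar with the "expanded matching algorithm" shape, the sketch below turns a numeric `matching_algorithm` into an `{id, name}` pair. The function `expandMatchingAlgorithm`, the two interfaces, and in particular the id-to-name table are assumptions made for this example; the project's own `enhanceMatchingAlgorithm` helper is not shown in this PR text and may differ.

```typescript
// Assumed id-to-name table; verify against the Paperless-NGX docs before relying on it.
const MATCHING_ALGORITHM_NAMES: Record<number, string> = {
  0: "None",
  1: "Any",
  2: "All",
  3: "Exact",
  4: "Regular expression",
  5: "Fuzzy",
  6: "Auto",
};

interface RawTag {
  id: number;
  name: string;
  slug: string;
  matching_algorithm: number;
}

interface ExpandedTag extends Omit<RawTag, "matching_algorithm"> {
  matching_algorithm: { id: number; name: string };
}

// Replaces the numeric id with an object carrying both id and a readable name.
function expandMatchingAlgorithm(tag: RawTag): ExpandedTag {
  const id = tag.matching_algorithm;
  const name = MATCHING_ALGORITHM_NAMES[id] ?? "Unknown";
  return { ...tag, matching_algorithm: { id, name } };
}
```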


Florian Henze added 2 commits May 26, 2026 01:43
Adds a get_tag detail-getter to bring tags in line with correspondents,
document_types and custom_fields, all of which already expose both
list_* and get_* variants. Without get_tag, agents have to fall back to
list_tags(name__iexact=...) and the asymmetry shows up as a foot-gun
(agents try get_tag first and fail).

The tool returns the same Tag payload as the other detail-getters, with
matching_algorithm enhanced from a numeric id to {id, name} via the
existing enhanceMatchingAlgorithm helper.

E2E coverage in e2e.test.ts asserts that a freshly-created tag is
returned by get_tag with id, name, non-empty slug, and the expanded
matching_algorithm shape.

Storage paths were not exposed at all — no list/get/create/update/delete
and no bulk_edit. They appeared only as a foreign-key parameter
(storage_path) on list_documents, post_document, update_document and
bulk_edit_documents, so agents had no way to discover or manage them
via the MCP and had to fall back to raw API calls.

The new tool family mirrors the correspondents/document_types pattern:
list_storage_paths, get_storage_path, create_storage_path,
update_storage_path, delete_storage_path (with the same confirm-flag
discipline as delete_correspondent) and bulk_edit_storage_paths.

Storage paths carry an additional required `path` field, a Django
template string such as "{{ correspondent }}/{{ created_year }}/{{ title }}".
The tool descriptions document this and reference the Paperless-NGX docs
for available placeholders.

E2E coverage in e2e.test.ts walks the full lifecycle:
- create_storage_path returns the new path with id, name and template
- get_storage_path returns the same payload with expanded matching_algorithm
- list_storage_paths includes the new id
- update_storage_path renames and preserves the path template
- bulk_edit_documents method=set_storage_path assigns the path to the
  uploaded test document, and get_document reflects the assignment
- delete_storage_path requires confirm=true; the deleted id then
  surfaces an isError from get_storage_path (404 round-trip)

Side note: this also closes the self-evidence gap from the existing
list_documents description, which already points agents at a
list_storage_paths tool that previously did not exist.
@changeset-bot

changeset-bot Bot commented May 25, 2026

🦋 Changeset detected

Latest commit: 50c6e33

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
| Name | Type |
| --- | --- |
| @baruchiro/paperless-mcp | Patch |


@coderabbitai

coderabbitai Bot commented May 25, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: fc688915-3e09-4f4d-8734-40e59bbcbb4a

📥 Commits

Reviewing files that changed from the base of the PR and between 943aa6a and 50c6e33.

📒 Files selected for processing (2)
  • e2e/e2e.test.ts
  • src/server.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/server.ts
  • e2e/e2e.test.ts

📝 Walkthrough

Walkthrough

Adds StoragePath types and PaperlessAPI methods for tag detail and full storage-path CRUD/list. Implements MCP tools exposing list/get/create/update/delete/bulk_edit storage-path operations plus a get_tag tool, wires them into server initialization, and adds e2e tests covering tag detail expansion, storage-path lifecycle, document integration via bulk edit, and deletion confirmation/error flows.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • baruchiro
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately and concisely summarizes the two main changes: adding a get_tag tool and storage_paths CRUD operations. |
| Linked Issues check | ✅ Passed | All linked issue objectives are met: get_tag tool added with proper detail field expansion, storage_paths CRUD fully implemented with list, get, create, update, delete, and bulk_edit operations with proper validation and confirmation handling. |
| Out of Scope Changes check | ✅ Passed | All changes are directly aligned with the linked issue objectives; no unrelated modifications detected in the codebase. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.




@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@e2e/e2e.test.ts`:
- Around line 284-301: The test "list_storage_paths returns the storage path
created earlier in this run" is flaky due to pagination; modify the
client.callTool invocation for "list_storage_paths" (the call in the it block
using client.callTool and parseToolText) to request a deterministic result by
either passing a filter for the storage path name (use RUN_STORAGE_PATH) or
increasing page_size to a sufficiently large value, then assert against the
filtered result (e.g., check data.results[0] or find by id as before) rather
than relying on the default page. Ensure you update the arguments object passed
to client.callTool so the API returns the created storage path reliably when
checking state.storagePathId and RUN_STORAGE_PATH.

In `@src/tools/storagePaths.ts`:
- Around line 97-103: The update_storage_path tool schema currently requires
fields like name and path which prevents PATCH-style partial updates; modify the
Zod input schema for update_storage_path to make updatable fields optional
(e.g., change name and path and any matchingPattern/matchingAlgorithm fields in
the same schema to z.string().optional() or the appropriate optional Zod types)
and ensure id remains required, and provide sensible defaults/validation where
needed so callers can send partial updates without server-side reassignment
logic.
- Around line 189-199: Normalize args.permissions before passing to
api.bulkEditObjects to avoid sending empty arrays/objects: import and use the
arrayNotEmpty and objectNotEmpty helpers from tools/utils/empty.ts in
storagePaths.ts, compute a normalizedPermissions value (e.g., call
arrayNotEmpty(args.permissions) and/or objectNotEmpty(args.permissions) so empty
collections become undefined) and replace direct use of args.permissions in the
payload for the "set_permissions" branch of api.bulkEditObjects; keep owner and
merge as-is but only include permissions when normalizedPermissions is not
undefined.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b52f77ef-1625-487b-9ed6-962301a51e87

📥 Commits

Reviewing files that changed from the base of the PR and between 9d677b2 and 78b29b9.

📒 Files selected for processing (6)
  • e2e/e2e.test.ts
  • src/api/PaperlessAPI.ts
  • src/api/types.ts
  • src/server.ts
  • src/tools/storagePaths.ts
  • src/tools/tags.ts

Comment thread e2e/e2e.test.ts
Comment on lines +284 to +301
```ts
it("list_storage_paths returns the storage path created earlier in this run", async () => {
  assert.ok(state.storagePathId, "storage path must be created first");
  const result = (await client.callTool({
    name: "list_storage_paths",
    arguments: {},
  })) as ToolResult;
  assertOk(result, "list_storage_paths");
  const data = parseToolText(result) as {
    results: { id: number; name: string }[];
  };
  assert.ok(Array.isArray(data.results), "results should be an array");
  const found = data.results.find((sp) => sp.id === state.storagePathId);
  assert.ok(
    found,
    `storage_path id=${state.storagePathId} not found in list_storage_paths`
  );
  assert.strictEqual(found.name, RUN_STORAGE_PATH);
});
```

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Make list_storage_paths lookup deterministic to avoid pagination flakiness.

This test can fail when the created item is outside the default page. Filter by name (or set an explicit large page_size) before asserting presence.

💡 Suggested change
```diff
   const result = (await client.callTool({
     name: "list_storage_paths",
-    arguments: {},
+    arguments: { name__iexact: RUN_STORAGE_PATH, page_size: 200 },
   })) as ToolResult;
```
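
To illustrate why the unfiltered lookup can miss the item, here is a small self-contained model of a paginated list endpoint. Everything in it (`Page`, `StoragePathRow`, `makeFakeApi`, `findById`, the page size of 25) is hypothetical and chosen for the example; it does not reflect the actual default page size or client code.

```typescript
interface Page<T> {
  results: T[];
  next: number | null; // next page number, or null on the last page
}

interface StoragePathRow {
  id: number;
  name: string;
}

type FetchPage = (
  page: number,
  filter?: { name__iexact?: string }
) => Page<StoragePathRow>;

// Hypothetical paginated endpoint backed by an array (assumption).
function makeFakeApi(rows: StoragePathRow[], pageSize: number): FetchPage {
  return (page, filter) => {
    const wanted = filter?.name__iexact?.toLowerCase();
    const matching =
      wanted === undefined
        ? rows
        : rows.filter((r) => r.name.toLowerCase() === wanted);
    const start = (page - 1) * pageSize;
    const results = matching.slice(start, start + pageSize);
    const next = start + pageSize < matching.length ? page + 1 : null;
    return { results, next };
  };
}

// Alternative to filtering: walk every page until the id turns up.
function findById(
  fetchPage: FetchPage,
  id: number,
  filter?: { name__iexact?: string }
): StoragePathRow | undefined {
  let page: number | null = 1;
  while (page !== null) {
    const data = fetchPage(page, filter);
    const hit = data.results.find((r) => r.id === id);
    if (hit) {
      return hit;
    }
    page = data.next;
  }
  return undefined;
}
```

With 60 rows and 25 per page, the newest row is absent from page 1, but a name filter brings it back as the only result on the first page.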

Comment thread src/tools/storagePaths.ts
Comment on lines +97 to +103
```ts
    "update_storage_path",
    "Update an existing storage path's name, path template, matching pattern, or matching algorithm.",
    {
      id: z.number(),
      name: z.string(),
      path: z
        .string()
```

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Allow partial updates in update_storage_path.

update_storage_path currently requires name, which blocks valid PATCH-style updates (e.g., updating only path or only matching fields).

💡 Suggested change
```diff
-      name: z.string(),
+      name: z.string().optional(),
```

As per coding guidelines, "Prefer similar Zod schemas to the API when possible—keep tool inputs similar to the API rather than reassignment logic in the code" and "Include optional parameters with proper defaults in tool parameter schemas".
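
A minimal sketch of what PATCH-style input handling could look like once every field except `id` is optional. The `UpdateStoragePathArgs` type and `buildPatchBody` function are hypothetical, the `match` and `matching_algorithm` field names are assumptions, and the sketch uses plain TypeScript rather than the Zod schema the tool actually declares.

```typescript
// Hypothetical argument shape: only id is required (assumption).
interface UpdateStoragePathArgs {
  id: number;
  name?: string;
  path?: string;
  match?: string;
  matching_algorithm?: number;
}

// Copies only the fields the caller actually set into the PATCH body.
function buildPatchBody(args: UpdateStoragePathArgs): Record<string, unknown> {
  const { id, ...fields } = args;
  const body: Record<string, unknown> = {};
  for (const key of Object.keys(fields) as (keyof typeof fields)[]) {
    const value = fields[key];
    if (value !== undefined) {
      body[key] = value;
    }
  }
  if (Object.keys(body).length === 0) {
    throw new Error(`update for id=${id} has no fields to change`);
  }
  return body;
}
```

Under this model, a call that changes only `path` sends a body with just that key, so the stored name is left alone.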


Comment thread src/tools/storagePaths.ts
Comment on lines +189 to +199
```ts
      return api.bulkEditObjects(
        args.storage_path_ids,
        "storage_paths",
        args.operation,
        args.operation === "set_permissions"
          ? {
              owner: args.owner,
              permissions: args.permissions,
              merge: args.merge,
            }
          : {}
```

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Normalize empty permission arrays/objects before bulk set-permissions.

Forwarding args.permissions directly can send empty arrays/objects, which may unintentionally apply empty permission sets instead of omitting unset values.

🔧 Suggested change
```diff
 import { withErrorHandling } from "./utils/middlewares";
 import { buildQueryString } from "./utils/queryString";
+import { arrayNotEmpty, objectNotEmpty } from "./utils/empty";
@@
-      return api.bulkEditObjects(
-        args.storage_path_ids,
-        "storage_paths",
-        args.operation,
-        args.operation === "set_permissions"
-          ? {
-              owner: args.owner,
-              permissions: args.permissions,
-              merge: args.merge,
-            }
-          : {}
-      );
+      const permissions =
+        args.permissions &&
+        objectNotEmpty({
+          view: objectNotEmpty({
+            users: arrayNotEmpty(args.permissions.view.users),
+            groups: arrayNotEmpty(args.permissions.view.groups),
+          }),
+          change: objectNotEmpty({
+            users: arrayNotEmpty(args.permissions.change.users),
+            groups: arrayNotEmpty(args.permissions.change.groups),
+          }),
+        });
+
+      const params =
+        args.operation === "set_permissions"
+          ? objectNotEmpty({
+              owner: args.owner,
+              permissions,
+              merge: args.merge,
+            }) ?? {}
+          : {};
+
+      return api.bulkEditObjects(
+        args.storage_path_ids,
+        "storage_paths",
+        args.operation,
+        params
+      );
```

As per coding guidelines, "Use arrayNotEmpty and objectNotEmpty transformation utilities from tools/utils/empty.ts to convert empty arrays and objects to undefined".
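
The normalization idea can be sketched in isolation as below. The two helpers here are hypothetical re-implementations written for this example under the assumption that "empty" means a zero-length array or an object whose values are all undefined; the real helpers in tools/utils/empty.ts are not shown in this thread and may behave differently. The `Permissions` shape and `normalizePermissions` are likewise assumptions.

```typescript
// Hypothetical re-implementation: empty arrays become undefined (assumption).
function arrayNotEmpty<T>(value: T[] | undefined): T[] | undefined {
  return value && value.length > 0 ? value : undefined;
}

// Hypothetical re-implementation: objects with no defined values become undefined.
function objectNotEmpty<T extends object>(value: T | undefined): T | undefined {
  if (!value) {
    return undefined;
  }
  const record = value as Record<string, unknown>;
  const hasDefined = Object.keys(record).some((k) => record[k] !== undefined);
  return hasDefined ? value : undefined;
}

interface PermissionSet {
  users?: number[];
  groups?: number[];
}

interface Permissions {
  view?: PermissionSet;
  change?: PermissionSet;
}

// Collapses empty users/groups lists bottom-up so nothing empty is sent.
function normalizePermissions(p: Permissions | undefined) {
  if (!p) {
    return undefined;
  }
  return objectNotEmpty({
    view: objectNotEmpty({
      users: arrayNotEmpty(p.view?.users),
      groups: arrayNotEmpty(p.view?.groups),
    }),
    change: objectNotEmpty({
      users: arrayNotEmpty(p.change?.users),
      groups: arrayNotEmpty(p.change?.groups),
    }),
  });
}
```

Unlike the suggested diff, this sketch reads nested fields with optional chaining so a missing `view` or `change` object does not throw; whether that is needed depends on the tool's actual input schema.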


Development

Successfully merging this pull request may close these issues.

Missing tools: get_tag and full storage_paths CRUD

2 participants