Skip to content

fix(skills): remove broken document-writer skill, enhance document-editor#32151

Merged
alex-nork merged 3 commits into
mainfrom
alex/jarvis-961-skill-loads-but-calls-wrong-tool
May 26, 2026
Merged

fix(skills): remove broken document-writer skill, enhance document-editor#32151
alex-nork merged 3 commits into
mainfrom
alex/jarvis-961-skill-loads-but-calls-wrong-tool

Conversation

@alex-nork
Copy link
Copy Markdown
Contributor

Summary

  • Deletes the managed document-writer skill which had no TOOLS.json and a broken include ("document" instead of "document-editor"), causing skill_execute to fail with "Unknown tool: document_create" on staging
  • Folds useful anti-pattern guidance from document-writer into the bundled document-editor SKILL.md

Root Cause

The document-writer managed skill described document_create textually in its body but had no TOOLS.json manifest and referenced a nonexistent include ID. When the model loaded it and tried skill_execute(tool="document_create"), the tool was never registered in the registry.

Test plan

  • Deploy to staging and ask "write a blog post about AI" — should load document-editor and successfully call document_create
  • Verify locally that document creation still works end-to-end

Closes JARVIS-961

🤖 Generated with Claude Code

@alex-nork alex-nork requested a review from awlevin as a code owner May 26, 2026 19:00
@linear
Copy link
Copy Markdown

linear Bot commented May 26, 2026

JARVIS-961

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3bf641e78a

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@@ -1,199 +0,0 @@
---
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Remove stale catalog entry when deleting this skill

Deleting skills/document-writer/SKILL.md without also removing document-writer from skills/catalog.json leaves the first-party catalog advertising a skill that no longer exists locally. In dev/offline paths, resolveCatalog() can still return that local catalog entry, then autoInstallFromCatalog() tries to install it and installSkillLocally() cannot copy from repo (missing SKILL.md), forcing a remote fetch that fails when the platform entry is removed or unreachable. This means users can still hit a skill_load failure for document-writer even after this cleanup.

Useful? React with 👍 / 👎.

alex-nork and others added 3 commits May 26, 2026 15:13
…itor

The managed document-writer skill had no TOOLS.json and a broken include
("document" instead of "document-editor"), so skill_execute could never
find document_create in the registry — causing "Unknown tool" errors on
staging.

Delete document-writer and fold its useful anti-pattern guidance into the
bundled document-editor skill which already owns the TOOLS.json manifest.

Closes JARVIS-961

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Stale catalog entry would cause autoInstallFromCatalog to try
reinstalling the deleted skill.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@alex-nork alex-nork force-pushed the alex/jarvis-961-skill-loads-but-calls-wrong-tool branch from 044efef to 2334922 Compare May 26, 2026 19:14
@alex-nork alex-nork merged commit 9f48704 into main May 26, 2026
19 checks passed
@alex-nork alex-nork deleted the alex/jarvis-961-skill-loads-but-calls-wrong-tool branch May 26, 2026 19:23
noanflaherty added a commit that referenced this pull request May 26, 2026
* fix(macos/billing): tolerate tiered Pro plan in catalog decode (#32114)

PlanCatalogEntry.price_cents was a required Int, but the platform's
/v1/organizations/billing/plans/ Pro entry no longer emits price_cents
(it moved to base_price_cents + machine_tiers/storage_tiers). The whole
catalog decode hard-failed on the Pro entry, was swallowed by try? in
SettingsBillingTab.loadSummary, and surfaced as a perpetual 'Unable to
load plan information.' on the Plan card.

Make price_cents optional (PlanCard only reads id/name/included_features,
so no display impact) and update the wire-protocol test to the real
tiered Pro payload so this drift can't recur silently.

* fix(skills): remove broken document-writer skill, enhance document-editor (#32151)

* fix(skills): remove broken document-writer skill, enhance document-editor

The managed document-writer skill had no TOOLS.json and a broken include
("document" instead of "document-editor"), so skill_execute could never
find document_create in the registry — causing "Unknown tool" errors on
staging.

Delete document-writer and fold its useful anti-pattern guidance into the
bundled document-editor skill which already owns the TOOLS.json manifest.

Closes JARVIS-961

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(skills): remove document-writer from catalog.json

Stale catalog entry would cause autoInstallFromCatalog to try
reinstalling the deleted skill.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: regenerate catalog.json with correct meet-join timestamp

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(assistant): include messageId on canned-response message_complete events (LUM-1902) (#32143)

* fix(assistant): include messageId on canned-greeting message_complete (LUM-1902)

The canned first-greeting path persisted the assistant row via addMessage()
but discarded the returned id, so the message_complete broadcast lacked
messageId. The macOS client filter at ChatActionHandler.swift:507 treats
message_complete events lacking messageId as aux-style notifications while
a turn is in flight and early-returns, so isSending never cleared and the
3-dot loading indicator stayed visible until the 60s watchdog kicked in.

Capture the persisted assistant row id and pass it through. Architectural
follow-up (other emission sites with the same bug shape, macOS filter
cleanup) tracked in LUM-1904.

* fix(assistant): apply messageId fix to all canned-response paths (LUM-1902)

Same bug shape as the canned greeting: the assistant row is persisted via
addMessage() but the returned id is discarded, so the message_complete
broadcast goes out without messageId and the macOS guard at
ChatActionHandler.swift:507 drops the event whenever the streaming-buffer
50ms flush has fired between the delta and the complete — leaving the user
stuck for the full 60s watchdog.

Patches the same five paths #31994 will eventually subsume:

  - inline approval reply (conversation-routes.ts:422)
  - canned first greeting (conversation-routes.ts:1451)
  - slash command output (:1774)
  - /compact (:1855)
  - /clean (:1935)

Centralized into a single `emitCannedMessageComplete` helper so the
temporary fix is one helper + five one-line callers, easy to grep and
inline-then-delete when #31994 lands. Helper carries the full LUM-1902
context comment so individual call sites stay tidy.

The wake-target adapter (wake-target-adapter.ts:130) has the same bug
shape but isn't a quick fix — AgentEvent.message_complete carries no
messageId at the point of relay, so it needs the pre-allocated anchor
treatment matching #31994's approach. Tracked in LUM-1904.

* chore(assistant): scrub internal ticket reference from helper comment

Linear ticket ids are internal references and don't help open-source
contributors reading this file. The PR reference (#31994) stays since
it's discoverable from the repo.

---------

Co-authored-by: Claude <noreply@anthropic.com>

* perf(macos): replace O(n²) conversation merge with O(1) dictionary lookup (#32095)

* perf(macos): replace O(n²) conversation merge with O(1) dictionary lookup

Replace linear-scan firstIndex(where:) with a pre-built [String: Int]
dictionary in handleConversationListResponse and appendConversations.

With ~1800 conversations (post-pagination PR #31924), the old O(n²)
pattern performed ~3.24M string comparisons on @mainactor, blocking the
main thread for ~1.6s and triggering AppHang reports (LUM-1901).

The dictionary reduces this to ~3600 hash lookups — effectively O(n).

Also removes dead code: a redundant snapshot.first(where:) that
searched for a conversation already proven absent by the preceding
firstIndex check.

Closes LUM-1901

Co-Authored-By: ashlee@vellum.ai <ashlee@vellum.ai>

* fix: keep dictionary in sync when appending new conversations

The old firstIndex(where:) searched the mutated snapshot (including
just-appended items), so duplicate IDs in a single response would match
and update in-place. The dictionary must be kept in sync after each
append to preserve this behavior.

Co-Authored-By: ashlee@vellum.ai <ashlee@vellum.ai>

---------

Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>

* fix(assistant): clear processing flag if /clean user persist fails (#32115)

Cherry-pick of a535818 — wraps the user-message persist inside the
outer try/finally so a throw from addMessage still clears processing
and drains the queue.

Co-authored-by: siddseethepalli <siddseethepalli@gmail.com>
Co-authored-by: Vellum Assistant <assistant@vellum.ai>

* fix(assistant): clear processing flag if /compact initial persist fails (#32128)

The /compact slash branch set conversation.processing = true and then
awaited the initial user-message addMessage OUTSIDE any guard. The
fire-and-forget compaction IIFE owns the try/finally that resets the
flag, but a throw from that initial persist (transient SQLite/disk
error) never reaches it, leaving the conversation stuck in queued mode
indefinitely. This is the same class of bug fixed for /clean in #32115.

An outer try/finally (as used by /clean) is wrong here because compact
returns 202 immediately and runs async, so it would clear the flag
before compaction finished. Instead, guard just the synchronous pre-202
persist: on failure reset processing, drain the queue, and rethrow so
the caller still surfaces the error.

Co-authored-by: Vellum Assistant <assistant@vellum.ai>

---------

Co-authored-by: Carson Shaar <carson.s.shaar@gmail.com>
Co-authored-by: Alex Nork <48630278+alex-nork@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Ashlee Radka <ashlee@vellum.ai>
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: siddseethepalli <siddseethepalli@gmail.com>
Co-authored-by: Vellum Assistant <assistant@vellum.ai>
vellum-automation Bot added a commit that referenced this pull request May 26, 2026
* Release v0.8.5

* Cherry-pick fixes onto release/v0.8.5 (#32159)

* fix(macos/billing): tolerate tiered Pro plan in catalog decode (#32114)

PlanCatalogEntry.price_cents was a required Int, but the platform's
/v1/organizations/billing/plans/ Pro entry no longer emits price_cents
(it moved to base_price_cents + machine_tiers/storage_tiers). The whole
catalog decode hard-failed on the Pro entry, was swallowed by try? in
SettingsBillingTab.loadSummary, and surfaced as a perpetual 'Unable to
load plan information.' on the Plan card.

Make price_cents optional (PlanCard only reads id/name/included_features,
so no display impact) and update the wire-protocol test to the real
tiered Pro payload so this drift can't recur silently.

* fix(skills): remove broken document-writer skill, enhance document-editor (#32151)

* fix(skills): remove broken document-writer skill, enhance document-editor

The managed document-writer skill had no TOOLS.json and a broken include
("document" instead of "document-editor"), so skill_execute could never
find document_create in the registry — causing "Unknown tool" errors on
staging.

Delete document-writer and fold its useful anti-pattern guidance into the
bundled document-editor skill which already owns the TOOLS.json manifest.

Closes JARVIS-961

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(skills): remove document-writer from catalog.json

Stale catalog entry would cause autoInstallFromCatalog to try
reinstalling the deleted skill.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: regenerate catalog.json with correct meet-join timestamp

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(assistant): include messageId on canned-response message_complete events (LUM-1902) (#32143)

* fix(assistant): include messageId on canned-greeting message_complete (LUM-1902)

The canned first-greeting path persisted the assistant row via addMessage()
but discarded the returned id, so the message_complete broadcast lacked
messageId. The macOS client filter at ChatActionHandler.swift:507 treats
message_complete events lacking messageId as aux-style notifications while
a turn is in flight and early-returns, so isSending never cleared and the
3-dot loading indicator stayed visible until the 60s watchdog kicked in.

Capture the persisted assistant row id and pass it through. Architectural
follow-up (other emission sites with the same bug shape, macOS filter
cleanup) tracked in LUM-1904.

* fix(assistant): apply messageId fix to all canned-response paths (LUM-1902)

Same bug shape as the canned greeting: the assistant row is persisted via
addMessage() but the returned id is discarded, so the message_complete
broadcast goes out without messageId and the macOS guard at
ChatActionHandler.swift:507 drops the event whenever the streaming-buffer
50ms flush has fired between the delta and the complete — leaving the user
stuck for the full 60s watchdog.

Patches the same five paths #31994 will eventually subsume:

  - inline approval reply (conversation-routes.ts:422)
  - canned first greeting (conversation-routes.ts:1451)
  - slash command output (:1774)
  - /compact (:1855)
  - /clean (:1935)

Centralized into a single `emitCannedMessageComplete` helper so the
temporary fix is one helper + five one-line callers, easy to grep and
inline-then-delete when #31994 lands. Helper carries the full LUM-1902
context comment so individual call sites stay tidy.

The wake-target adapter (wake-target-adapter.ts:130) has the same bug
shape but isn't a quick fix — AgentEvent.message_complete carries no
messageId at the point of relay, so it needs the pre-allocated anchor
treatment matching #31994's approach. Tracked in LUM-1904.

* chore(assistant): scrub internal ticket reference from helper comment

Linear ticket ids are internal references and don't help open-source
contributors reading this file. The PR reference (#31994) stays since
it's discoverable from the repo.

---------

Co-authored-by: Claude <noreply@anthropic.com>

* perf(macos): replace O(n²) conversation merge with O(1) dictionary lookup (#32095)

* perf(macos): replace O(n²) conversation merge with O(1) dictionary lookup

Replace linear-scan firstIndex(where:) with a pre-built [String: Int]
dictionary in handleConversationListResponse and appendConversations.

With ~1800 conversations (post-pagination PR #31924), the old O(n²)
pattern performed ~3.24M string comparisons on @mainactor, blocking the
main thread for ~1.6s and triggering AppHang reports (LUM-1901).

The dictionary reduces this to ~3600 hash lookups — effectively O(n).

Also removes dead code: a redundant snapshot.first(where:) that
searched for a conversation already proven absent by the preceding
firstIndex check.

Closes LUM-1901

Co-Authored-By: ashlee@vellum.ai <ashlee@vellum.ai>

* fix: keep dictionary in sync when appending new conversations

The old firstIndex(where:) searched the mutated snapshot (including
just-appended items), so duplicate IDs in a single response would match
and update in-place. The dictionary must be kept in sync after each
append to preserve this behavior.

Co-Authored-By: ashlee@vellum.ai <ashlee@vellum.ai>

---------

Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>

* fix(assistant): clear processing flag if /clean user persist fails (#32115)

Cherry-pick of a535818 — wraps the user-message persist inside the
outer try/finally so a throw from addMessage still clears processing
and drains the queue.

Co-authored-by: siddseethepalli <siddseethepalli@gmail.com>
Co-authored-by: Vellum Assistant <assistant@vellum.ai>

* fix(assistant): clear processing flag if /compact initial persist fails (#32128)

The /compact slash branch set conversation.processing = true and then
awaited the initial user-message addMessage OUTSIDE any guard. The
fire-and-forget compaction IIFE owns the try/finally that resets the
flag, but a throw from that initial persist (transient SQLite/disk
error) never reaches it, leaving the conversation stuck in queued mode
indefinitely. This is the same class of bug fixed for /clean in #32115.

An outer try/finally (as used by /clean) is wrong here because compact
returns 202 immediately and runs async, so it would clear the flag
before compaction finished. Instead, guard just the synchronous pre-202
persist: on failure reset processing, drain the queue, and rethrow so
the caller still surfaces the error.

Co-authored-by: Vellum Assistant <assistant@vellum.ai>

---------

Co-authored-by: Carson Shaar <carson.s.shaar@gmail.com>
Co-authored-by: Alex Nork <48630278+alex-nork@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Ashlee Radka <ashlee@vellum.ai>
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: siddseethepalli <siddseethepalli@gmail.com>
Co-authored-by: Vellum Assistant <assistant@vellum.ai>

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Noa Flaherty <noa@vellum.ai>
Co-authored-by: Carson Shaar <carson.s.shaar@gmail.com>
Co-authored-by: Alex Nork <48630278+alex-nork@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Ashlee Radka <ashlee@vellum.ai>
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: siddseethepalli <siddseethepalli@gmail.com>
Co-authored-by: Vellum Assistant <assistant@vellum.ai>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant