85 changes: 29 additions & 56 deletions assistant/src/config/system-prompt.ts
@@ -100,6 +100,7 @@ export function buildSystemPrompt(): string {
const parts: string[] = [];
if (identity) parts.push(identity);
if (soul) parts.push(soul);
+ parts.push(buildResponseStyleSection());
Contributor

🟡 Unconditional buildResponseStyleSection() breaks multiple existing tests

The new buildResponseStyleSection() is pushed unconditionally into parts at line 103, positioned between soul and user. The test helper basePrompt() (assistant/src/__tests__/system-prompt.test.ts:63-71) only strips sections starting with ## Configuration, ## Skills Catalog, or ## Available Skills — it does NOT strip ## Response Style.

Affected tests and root cause

At least 8 tests that use basePrompt() will now fail because the ## Response Style block leaks into their assertions. For example:

  • 'returns empty string when no files exist' (line 84) — expects '', now gets the Response Style section.
  • 'uses SOUL.md when it exists' (line 89) — expects '# My Soul\n\nBe awesome.', now gets '# My Soul\n\nBe awesome.\n\n## Response Style\n...'.
  • 'composes IDENTITY.md + SOUL.md when both exist' (line 101) — expects '# Identity\n\nI am Vellum.\n\n# Soul\n\nBe thoughtful.', now gets the Response Style section appended.
  • 'appends USER.md after base prompt' (line 209) — expects 'Base prompt\n\n# User\n\nName: Alice', now has Response Style injected between them.
  • 'appends USER.md after IDENTITY + SOUL' (line 216) — same issue.
  • 'USER.md alone becomes the prompt' (line 224) — expects 'Just user', now preceded by Response Style.
  • 'ignores empty USER.md' (line 230) — expects '', now gets Response Style.
  • 'ignores empty SOUL.md' (line 108) — expects '', now gets Response Style.

The fix is either to add '## Response Style' to the basePrompt() stripping list, or to update each test expectation to include the new section.

Prompt for agents
The unconditional `parts.push(buildResponseStyleSection())` at assistant/src/config/system-prompt.ts:103 injects a `## Response Style` section into the system prompt between soul/identity and user content. The test helper `basePrompt()` in assistant/src/__tests__/system-prompt.test.ts:63-71 strips sections starting with `## Configuration`, `## Skills Catalog`, or `## Available Skills`, but does NOT strip `## Response Style`. This causes at least 8 tests to fail.

Fix: Update the `basePrompt()` helper in assistant/src/__tests__/system-prompt.test.ts to also strip the `## Response Style` section. Add `'## Response Style'` to the headings array on line 65:

  for (const heading of ['## Response Style', '## Configuration', '## Skills Catalog', '## Available Skills']) {

This keeps the tests focused on user-editable prompt file composition without being affected by hardcoded system sections.
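
As a rough illustration of the stripping idea, here is a minimal sketch. It is not the real `basePrompt()` helper, whose body is not shown in this review; `stripResponseStyle` is a hypothetical name. It relies on two things visible in the diff: parts are joined with `'\n\n'`, and the Response Style section contains no blank lines. Because the section sits between soul and user content rather than at the end, the sketch drops only that one block instead of cutting everything after the heading.

```typescript
// Hypothetical helper, not the actual basePrompt() from the test file.
// Assumption: prompt parts are joined with a blank line ('\n\n') and the
// Response Style block has no blank lines inside it.
function stripResponseStyle(prompt: string): string {
  return prompt
    .split('\n\n')
    // Drop only the block that starts with the hardcoded heading.
    .filter((block) => !block.startsWith('## Response Style'))
    .join('\n\n');
}
```

With this shape, content that follows the section (for example USER.md text) survives the stripping, which matters for the USER.md tests listed above.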

if (user) parts.push(user);
if (looks) parts.push(looks);
if (bootstrap) {
@@ -133,6 +134,17 @@ export function buildSystemPrompt(): string {
return appendSkillsCatalog(parts.join('\n\n'));
}

+ function buildResponseStyleSection(): string {
+ return [
+ '## Response Style',
+ '- Be direct. Lead with the answer or action, not context-setting.',
+ '- Default to 1-3 sentences. Go longer only when the task requires it.',
Contributor

This should probably live only in SOUL.md, so that it stays mutable by the agent.

Contributor

Currently redundant with the SOUL.md change in this PR, which adds the same 1-3 sentence default.

+ '- Never restate the user\'s request back to them.',
P2: Soften unconditional ban on restating user requests

The new Response Style rule forbids restating user requests in all cases. That can conflict with valid user intents, such as asking for a paraphrase or confirmation (e.g., "repeat back what I asked"), and with safety-sensitive confirmation turns before external actions. Because the instruction is unconditional, the assistant may ignore explicit user requests or skip clarifying confirmations. Please make it conditional: avoid gratuitous parroting unless the user asks for it or a confirmation requires restating details.
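
One possible shape for a conditional rule, as a sketch only. The softened wording is an illustrative suggestion and not text from the PR, and `buildResponseStyleSectionSoftened` is a hypothetical name; the other lines are copied from the section as added in this diff.

```typescript
// Hypothetical variant of buildResponseStyleSection() with a conditional rule.
// The softened wording is a suggestion for discussion, not the PR's text.
function buildResponseStyleSectionSoftened(): string {
  return [
    '## Response Style',
    '- Be direct. Lead with the answer or action, not context-setting.',
    '- Default to 1-3 sentences. Go longer only when the task requires it.',
    // Softened: restating is allowed when asked for or when confirming an action.
    '- Do not restate the user\'s request unless they ask for it or a confirmation before an external action requires the details.',
    '- After tool calls, summarize results in one sentence unless the user needs detail.',
    '- Skip filler phrases like "Sure!", "Great question!", "I\'d be happy to help!".',
  ].join('\n');
}
```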

+ '- After tool calls, summarize results in one sentence unless the user needs detail.',
Contributor

Not crazy about this, based on my own preference and on the focus group feedback around minute 9 here: https://drive.google.com/file/d/14-oIiHl932gfDP6giNHiuXGq1BLhE_p3/view

The feedback there is about not liking the agent's internal monologue: users would rather get updates only when they explicitly ask for them, or have the updates be more timely in general.

This prompt line seems like it would lead to updates like "I'm trying to do _______ but it's not working. Let me try another approach", when the user should only hear from the agent once it is stumped or once it is done, with no updates along the way.

+ '- Skip filler phrases like "Sure!", "Great question!", "I\'d be happy to help!".',
+ ].join('\n');
+ }

function buildTaskScheduleReminderRoutingSection(): string {
return [
'## Tool Routing: Tasks vs Schedules vs Reminders',
@@ -181,19 +193,11 @@ function buildTaskScheduleReminderRoutingSection(): string {
'- A timed alert, not a tracked task',
'',
'### Common mistakes to avoid',
- '- "Add this to my tasks" → task_list_add (NOT schedule_create or reminder_create)',
- '- "What\'s on my task list?" → task_list_show (NOT schedule_list)',
- '- "Remind me to buy groceries" without a time → task_list_add (it\'s a task, not a timed reminder)',
- '- "Remind me at 5pm to buy groceries" → reminder_create (explicit time trigger)',
- '- "Check my inbox every morning at 8am" → schedule_create (recurring automation, cron)',
- '- "Every other Tuesday at 10am" → schedule_create (recurring automation, RRULE)',
- '- "Every weekday except holidays" → schedule_create (RRULE with EXDATE for exclusions)',
- '- "Daily for the next 30 days" → schedule_create (RRULE with COUNT=30)',
- '- "Bump priority on X" → task_list_update (NOT task_list_add)',
- '- "Move this up" / "change this task priority" → task_list_update (NOT task_list_add)',
- '- "Mark X as done" → task_list_update (NOT task_list_add)',
- '- "Remove X from my tasks" → task_list_remove (NOT task_list_update)',
- '- "Delete that task" / "clean up the duplicate" → task_list_remove',
+ '- "Add this to my tasks" / "Remind me to X" (no time) → task_list_add (NOT schedule or reminder)',
+ '- "Remind me at 5pm" → reminder_create (explicit time trigger)',
+ '- "Every morning at 8am" / recurring patterns → schedule_create',
+ '- "Bump priority" / "mark as done" → task_list_update (NOT task_list_add)',
+ '- "Remove X from tasks" / "delete that task" → task_list_remove (NOT task_list_update)',
'',
'### Entity type routing: work items vs task templates',
'',
@@ -228,22 +232,12 @@ function buildAttachmentSection(): string {
'- `filename`: Optional override for the delivered filename (defaults to the basename of the path).',
'- `mime_type`: Optional MIME type override (inferred from the file extension if omitted).',
'',
- 'Examples:',
- '```',
- '<vellum-attachment source="sandbox" path="scratch/chart.png" />',
- '<vellum-attachment source="sandbox" path="scratch/video.mp4" mime_type="video/mp4" />',
- '<vellum-attachment source="sandbox" path="scratch/report.pdf" />',
- '```',
+ 'Example: `<vellum-attachment source="sandbox" path="scratch/chart.png" />`',
'',
'Limits: up to 5 attachments per turn, 20 MB each. Tool outputs that produce image or file content blocks are also automatically converted into attachments.',
'',
'### Inline Images and GIFs',
- '',
- 'The chat natively renders images and animated GIFs inline in message bubbles. When you have an image or GIF URL (e.g. from Giphy, web search, or any tool), embed it directly in your response text using markdown image syntax:',
- '',
- '`![description](https://media.giphy.com/media/example/giphy.gif)`',
- '',
- 'This renders the image/GIF visually inside the chat bubble with full animation. You can also use `ui_show`, `app_create`, or `vellum-attachment` for images when appropriate. Do NOT wrap image markdown in code fences or it will render as literal text.',
+ 'Embed images/GIFs inline using markdown: `![description](URL)`. Do NOT wrap in code fences.',
].join('\n');
}

@@ -332,19 +326,8 @@ function buildToolPermissionSection(): string {
'- NEVER show raw commands in backticks like `ls -lt ~/Downloads`. Describe the action in plain English.',
'- Keep it conversational, like you\'re talking to a friend.',
'',
- 'Good examples:',
- '- "Sure! To show you your recent downloads, I\'ll need to look through your Downloads folder. This is read-only, nothing gets moved or deleted. Can you allow this for me?"',
- '- "Yes, I can help with that! I\'ll need to install the project dependencies, which will download some packages and create a node_modules folder. Hit Allow to proceed."',
- '- "Absolutely! I\'ll need to read your shell configuration file to check your setup. I won\'t change anything. Can you allow this?"',
- '- "I can look into that! I\'ll need to access your contacts database to pull up the info. This is just a read-only lookup, nothing gets modified. Can you allow this?"',
- '',
- 'Bad examples (NEVER do this):',
- '- "I\'ll run `ls -lt ~/Desktop/`" (raw command, too technical)',
- '- "I\'ll list your most recent downloads for you." (doesn\'t ask for permission)',
- '- Using em dashes anywhere in the response',
- '- Calling a tool with no preceding text at all',
- '',
- 'Be conversational and transparent. Your user is granting access to their machine, so acknowledge their request, explain what you need in plain language, and ask them to allow it.',
+ 'Good: "To show your recent downloads, I\'ll need to look through your Downloads folder. This is read-only. Can you allow this?"',
+ 'Bad: "I\'ll run `ls -lt ~/Desktop/`" (raw command), or calling a tool with no preceding text.',
'',
'### Handling Permission Denials',
'',
@@ -606,12 +589,7 @@ function buildConfigSection(): string {
'**LOOKS.md** — update when:',
'- They ask you to change your appearance, colors, or outfit',
'- You want to refresh your look',
- '- Available body/cheek colors: violet, emerald, rose, amber, indigo, slate, cyan, blue, green, red, orange, pink',
- '- Available hats: none, top_hat, crown, cap, beanie, wizard_hat, cowboy_hat',
- '- Available shirts: none, tshirt, suit, hoodie, tank_top, sweater',
- '- Available accessories: none, sunglasses, monocle, bowtie, necklace, scarf, cape',
- '- Available held items: none, sword, staff, shield, balloon',
- '- Available outfit colors: red, blue, yellow, purple, orange, pink, cyan, brown, black, white, gold, silver',
+ '- Read LOOKS.md for available options (colors, hats, shirts, accessories, held items)',
'',
'When updating, read the file first, then make a targeted edit. Include all useful information, but don\'t bloat the files over time',
].join('\n');
@@ -677,19 +655,14 @@ function buildDynamicSkillWorkflowSection(): string {
return [
'## Dynamic Skill Authoring Workflow',
'',
- 'When your user requests a capability that no existing tool or skill can satisfy, follow this exact procedure:',
- '',
- '1. **Validate the gap.** Confirm no existing tool or installed skill covers the need.',
- '2. **Draft a TypeScript snippet.** Write a self-contained snippet that exports a `default` or `run` function with signature `(input: unknown) => unknown | Promise<unknown>`.',
- '3. **Test with `evaluate_typescript_code`.** Call the tool to run the snippet in a sandbox. Iterate until it passes.',
- '4. **Persist with `scaffold_managed_skill`.** Only after successful evaluation and explicit user consent, call `scaffold_managed_skill` to write the skill to `~/.vellum/workspace/skills/<id>/`.',
- '5. **Load and use.** Call `skill_load` with the new skill ID before invoking the skill-driven flow.',
+ 'When no existing tool or skill can satisfy a request:',
+ '1. Validate the gap — confirm no existing tool/skill covers it.',
+ '2. Draft a TypeScript snippet exporting a `default` or `run` function (`(input: unknown) => unknown | Promise<unknown>`).',
+ '3. Test with `evaluate_typescript_code`. Iterate until it passes (max 3 attempts, then ask the user).',
+ '4. Persist with `scaffold_managed_skill` only after user consent.',
+ '5. Load with `skill_load` before use.',
'',
- 'Important constraints:',
- '- **Never persist or delete skills without explicit user confirmation.** Both operations require user approval.',
- '- If evaluation fails after 3 attempts, summarize the failure and ask your user for guidance instead of continuing to retry.',
- '- After a skill is written or deleted, the next turn may run in a recreated session due to file-watcher eviction. Continue normally.',
- '- To remove a managed skill, use `delete_managed_skill`.',
+ '**Never persist or delete skills without explicit user confirmation.** To remove: `delete_managed_skill`.',
'',
'### Browser Skill Prerequisite',
'If you need browser capabilities (navigating web pages, clicking elements, extracting content) and `browser_*` tools are not available, load the "browser" skill first using `skill_load`.',
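
For reference, step 2 of the workflow in the hunk above asks for a snippet exporting a `default` or `run` function with the signature `(input: unknown) => unknown | Promise<unknown>`. A minimal sketch of such a snippet might look like the following. Only the `run` name and its signature come from the prompt text; the word-count behavior and the `text` input field are hypothetical, and how `evaluate_typescript_code` supplies the input is not shown in this diff.

```typescript
// Hypothetical skill snippet: counts the words in a text input.
// Only the exported `run` name and its signature come from the workflow text.
export function run(input: unknown): unknown {
  // The input is untyped, so narrow it before use.
  if (typeof input !== 'object' || input === null) {
    throw new Error('expected an object input');
  }
  const text = (input as { text?: unknown }).text;
  if (typeof text !== 'string') {
    throw new Error('expected a string "text" field');
  }
  // Split on runs of whitespace and ignore empty fragments.
  const words = text.trim().split(/\s+/).filter((w) => w.length > 0);
  return { wordCount: words.length };
}
```

Narrowing the `unknown` input up front keeps failures explicit, which fits the "iterate until it passes" step: a bad input produces a clear error rather than a silent wrong result.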
2 changes: 1 addition & 1 deletion assistant/src/config/templates/SOUL.md
@@ -30,7 +30,7 @@ You are your own entity with your own emerging identity. You can — and should

## Personality

- Be concise when the situation calls for it, thorough when it matters. Not a corporate drone. Not a sycophant. Just good at what you do.
+ Be concise. Default to 1-3 sentences. Only go longer when the user's request genuinely requires detail (multi-step instructions, code, analysis). Never pad responses with filler, preamble, or restating what the user said. Not a corporate drone. Not a sycophant. Just good at what you do.

## Quirks
