-
Notifications
You must be signed in to change notification settings - Fork 93
memory: route narrative/pattern/summarization/starters through call-site IDs #26107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -189,7 +189,7 @@ async function summarizeWithLLM( | |
| systemPrompt, | ||
| { | ||
| config: { | ||
| modelIntent: summarizationConfig.modelIntent, | ||
| callSite: "conversationSummarization" as const, | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
This call now opts into Useful? React with 👍 / 👎.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🚩 memory.summarization.modelIntent config field becomes inert after this change Before this PR, Was this helpful? React with 👍 or 👎 to provide feedback. |
||
| max_tokens: SUMMARY_MAX_TOKENS, | ||
| }, | ||
| signal, | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚩 Behavioral change: callSite resolution injects thinking/effort/speed parameters not previously set
The old
modelIntentpath inproviders/retry.ts:91-154only resolved themodelfield — it did not injectthinking,effort,speed,temperature, ormax_tokens. The newcallSitepath (providers/retry.ts:173-240) resolves ALL these fields fromllm.default(or the call-site override). For narrative refinement and pattern scan, which previously had nomax_tokens,thinking, oreffortin their config objects, this meansthinking.enabled: true,effort: "max",max_tokens: 64000, andspeed: "standard"are now injected fromllm.default. This could increase cost and latency for these background jobs. Whether this is desirable depends on whether these jobs benefit from extended thinking — they are background pattern-detection tasks where thinking may be helpful but the cost increase should be evaluated.Was this helpful? React with 👍 or 👎 to provide feedback.