Ollama: Roo overrides Modelfile num_ctx by forcing options.num_ctx from modelInfo.contextWindow
App Version
v3.27.0 (main 0ce4e89)
API Provider
Ollama
Model Used
Any Ollama model with a Modelfile num_ctx smaller than Roo’s inferred context (e.g., llama3.1 with num_ctx 4096)
Steps to Reproduce
- Create an Ollama model with a reduced context. Modelfile:

  ```
  FROM llama3.1
  PARAMETER num_ctx 4096
  ```

  Build and start:

  ```
  ollama create my-small-ctx -f Modelfile
  ```

  then ensure Ollama is running.
- In Roo Code, set the provider to “Ollama” and select the model my-small-ctx.
- Start a chat and send a prompt large enough to exceed the Modelfile’s num_ctx (e.g., >4k tokens).
- Observe GPU/CPU memory usage and/or server behavior.
Outcome Summary
Roo sends options.num_ctx equal to its inferred modelInfo.contextWindow, overriding the Modelfile’s num_ctx. This can allocate a much larger context window than intended (e.g., 128k from the model’s reported context, or the 200k default fallback), causing allocation failures or out-of-memory errors on budget GPUs.
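For illustration, the body Roo ends up posting to Ollama’s /api/chat looks roughly like this (a paraphrased TypeScript sketch, not copied from the repo; the 131072 value assumes the my-small-ctx example above, where llama3.1 reports a 128k context):

```typescript
// Approximate request body sent to Ollama's /api/chat. The Modelfile set
// num_ctx 4096, but Roo's inferred contextWindow is sent instead:
const requestBody = {
	model: "my-small-ctx",
	messages: [{ role: "user", content: "..." }],
	stream: true,
	options: {
		num_ctx: 131072, // modelInfo.contextWindow, not the Modelfile's 4096
	},
}
```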
Technical Evidence
- Roo sets num_ctx on chat requests in the native Ollama handler (a paraphrased sketch follows this list):
  - Request creation at src/api/providers/native-ollama.ts:186
  - num_ctx assignment at src/api/providers/native-ollama.ts:193
- modelInfo.contextWindow is inferred from Ollama’s /api/show via parseOllamaModel():
  - contextWindow assignment at src/api/providers/fetchers/ollama.ts:47
  - maxTokens mirror at src/api/providers/fetchers/ollama.ts:51
- The default fallback is ollamaDefaultModelInfo with contextWindow 200,000
- The non‑streaming path completePrompt() does not include num_ctx
- Provider selection routes “ollama” → NativeOllamaHandler
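A minimal sketch of that code path, paraphrased from the file references above (variable names and exact shapes are approximations, using the ollama npm client):

```typescript
import { Ollama, type Message } from "ollama"

const client = new Ollama({ host: "http://localhost:11434" })
const messages: Message[] = [{ role: "user", content: "Hello" }]

// contextWindow comes from parseOllamaModel()
// (src/api/providers/fetchers/ollama.ts:47) or the
// ollamaDefaultModelInfo fallback (200,000).
const contextWindow = 131072

const stream = await client.chat({
	model: "my-small-ctx",
	messages,
	stream: true,
	options: {
		// The bug: num_ctx is always included, so the Modelfile's
		// "PARAMETER num_ctx 4096" can never take effect.
		num_ctx: contextWindow,
	},
})
for await (const chunk of stream) {
	process.stdout.write(chunk.message.content)
}
```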
Expected Behavior
- Roo should not force num_ctx unless the user explicitly configures it.
- By default, Ollama should honor the Modelfile’s num_ctx or server-side defaults.
Actual Behavior
- Roo overrides the Modelfile by sending options.num_ctx = modelInfo.contextWindow on each chat request, ignoring user-defined num_ctx in the Modelfile.
Proposed Fix (non‑breaking)
- Remove num_ctx from default chat options in NativeOllamaHandler.createMessage() so Ollama respects the Modelfile.
- Add an explicit opt‑in override (e.g., ollamaNumCtx?: number on ApiHandlerOptions) and include num_ctx in the request only when it is set (see the sketch after this list).
- Optionally improve display-only inference in parseOllamaModel() to prefer a configured num_ctx (e.g., from parameters or model_info.num_ctx), but do not enforce it in requests.
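A minimal sketch of the opt‑in behavior, assuming the hypothetical ollamaNumCtx setting proposed above (client as in the previous sketch; none of these names are confirmed repo APIs):

```typescript
// Hypothetical user setting; in the repo this would live on ApiHandlerOptions.
const ollamaNumCtx: number | undefined = undefined // unset by default

// Only include num_ctx when explicitly configured. An empty options object
// lets Ollama honor the Modelfile's num_ctx or its server-side default.
const options: { num_ctx?: number } = {}
if (ollamaNumCtx !== undefined) {
	options.num_ctx = ollamaNumCtx
}

const stream = await client.chat({
	model: "my-small-ctx",
	messages: [{ role: "user", content: "Hello" }],
	stream: true,
	options,
})
```

With options left empty, the my-small-ctx model from the reproduction keeps its 4096‑token window; a power user can still set ollamaNumCtx to pin a larger one.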
Notes
- Matches reports of unexpected large context allocation on consumer GPUs.
- Opt‑in num_ctx keeps power users’ control while respecting defaults for everyone else.