Ollama: Roo overrides Modelfile num_ctx by forcing options.num_ctx from modelInfo.contextWindow #7797

@hannesrudolph

App Version
v3.27.0 (main 0ce4e89)

API Provider
Ollama

Model Used
Any Ollama model with a Modelfile num_ctx smaller than Roo’s inferred context (e.g., llama3.1 with num_ctx 4096)

Steps to Reproduce

  1. Create an Ollama model with a reduced context:
    Modelfile:
    FROM llama3.1
    PARAMETER num_ctx 4096
    
    Build the model with ollama create my-small-ctx -f Modelfile, then make sure the Ollama server is running.
  2. In Roo Code, set provider to “Ollama” and select model my-small-ctx.
  3. Start a chat and send a prompt large enough to exceed the Modelfile’s num_ctx (e.g., >4k tokens).
  4. Observe GPU/CPU memory usage and/or server behavior.

Outcome Summary
Roo sends options.num_ctx equal to its inferred modelInfo.contextWindow, which overrides the Modelfile’s num_ctx. Ollama can then allocate a much larger context window than intended (e.g., 128k/200k instead of 4096), causing request failures or out-of-memory (OOM) errors on budget GPUs.

Expected Behavior

  • Roo should not force num_ctx unless the user explicitly configures it.
  • By default, Ollama should honor the Modelfile’s num_ctx or server-side defaults.

Actual Behavior

  • Roo overrides the Modelfile by sending options.num_ctx = modelInfo.contextWindow on each chat request, ignoring user-defined num_ctx in the Modelfile.
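
The behavior described above can be illustrated with a minimal sketch of the request body. The type and function names here (ModelInfoSketch, OllamaChatRequest, buildForcedRequest) are hypothetical and are not copied from Roo's source; only the options.num_ctx field of Ollama's chat API and the modelInfo.contextWindow value come from the report.

```typescript
// Hypothetical shapes for illustration only; not Roo's actual types.
interface ModelInfoSketch {
  contextWindow: number
}

interface OllamaChatRequest {
  model: string
  messages: { role: "system" | "user" | "assistant"; content: string }[]
  stream: boolean
  options?: { num_ctx?: number }
}

// Mirrors the reported behavior: num_ctx is always populated from the
// inferred context window, so the Modelfile value never takes effect.
function buildForcedRequest(model: string, prompt: string, info: ModelInfoSketch): OllamaChatRequest {
  return {
    model,
    messages: [{ role: "user", content: prompt }],
    stream: true,
    options: { num_ctx: info.contextWindow },
  }
}

// Assumed inferred window of 128k tokens for a model built with num_ctx 4096.
const forced = buildForcedRequest("my-small-ctx", "hello", { contextWindow: 131072 })
console.log(forced.options?.num_ctx) // 131072, not the Modelfile's 4096
```

Because a per-request option takes precedence over the PARAMETER line in the Modelfile, the server sizes its context from the value in the request.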

Proposed Fix (non‑breaking)

  • Remove num_ctx from default chat options in NativeOllamaHandler.createMessage() so Ollama respects the Modelfile.
  • Add an explicit optional override to ApiHandlerOptions (e.g., ollamaNumCtx?: number) and include num_ctx in the request only when it is set.
  • Optionally improve display-only inference in parseOllamaModel() to prefer a configured num_ctx (e.g., from parameters or model_info.num_ctx), but do not enforce it in requests.
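
The proposed fix could look roughly like the sketch below. It is one possible shape under stated assumptions, not the actual patch: ApiHandlerOptionsSketch, buildChatOptions, parseConfiguredNumCtx, and the default temperature of 0 are hypothetical, and the parser assumes the model's parameters arrive as newline-separated "name value" text.

```typescript
// Hypothetical stand-in for the relevant part of ApiHandlerOptions.
interface ApiHandlerOptionsSketch {
  ollamaNumCtx?: number
}

interface OllamaOptionsSketch {
  temperature: number
  num_ctx?: number
}

// Include num_ctx only when the user opted in; otherwise omit the key so
// Ollama falls back to the Modelfile value or its server-side default.
function buildChatOptions(opts: ApiHandlerOptionsSketch, temperature = 0): OllamaOptionsSketch {
  const chatOptions: OllamaOptionsSketch = { temperature }
  if (opts.ollamaNumCtx !== undefined) {
    chatOptions.num_ctx = opts.ollamaNumCtx
  }
  return chatOptions
}

// Display-only helper: read a configured num_ctx out of parameter text
// (assumed format: one "name value" pair per line). Never used in requests.
function parseConfiguredNumCtx(parameters: string | undefined): number | undefined {
  const match = parameters?.match(/^\s*num_ctx\s+(\d+)\s*$/m)
  const value = match?.[1]
  return value !== undefined ? Number.parseInt(value, 10) : undefined
}

console.log("num_ctx" in buildChatOptions({})) // false
console.log(buildChatOptions({ ollamaNumCtx: 8192 }).num_ctx) // 8192
console.log(parseConfiguredNumCtx("temperature 0.7\nnum_ctx    4096")) // 4096
```

Omitting the key entirely, rather than sending a default, is what lets the Modelfile setting apply. The parsed value is meant only for showing the effective context size in the UI.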

Notes

  • This matches reports of unexpectedly large context allocation on consumer GPUs.
  • An opt‑in num_ctx preserves control for power users while respecting defaults for everyone else.

Labels

Issue - In Progress, bug

Status

Done
