
Make split mode graph work with vision enabled#1392

Merged
ikawrakow merged 1 commit into main from ik/fix_sm_graph_with_vision
Mar 10, 2026

@ikawrakow
Owner

Closes #1353

@ubergarm
Contributor

I pulled the tip of main this morning, but now I'm getting gibberish inference output with both the opencode client and the built-in web UI (from tip of main@cda15bf1):

[Screenshot: ik-gibberish-main]

from main@14492bfd (this PR):

[Screenshot: ik-gibberish]

I tested with and without --mmproj, but the result was the same.

I rolled back and tested each recent commit until it worked again at 666ea0e:

$ git log --oneline
cda15bf1 (HEAD -> main, upstream/main, upstream/HEAD) Discard very first compute graph for recurrent models (#1393)
f90b4c2f Full graph parallel for Qwen3.5 (dense and MoE) (#1388)
14492bfd Make split mode graph work with vision enabled (#1392) <--- first broken here
666ea0e9 Revise build instructions for ik_llama.cpp <--- working
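The manual rollback above can also be driven by `git bisect`. The following is a minimal sketch, assuming it is run from inside a checkout that contains both commits. The function name is hypothetical; the two hashes are the good and bad commits from the log above.

```shell
# Hypothetical helper (not part of ik_llama.cpp): start a bisect session
# between the last working commit and the broken tip of main.
bisect_sm_graph_regression() {
    good=666ea0e9   # reported working in the log above
    bad=cda15bf1    # tip of main, reported producing gibberish

    # "git bisect start <bad> <good>" marks both endpoints in one step
    # and checks out a commit roughly halfway between them.
    git bisect start "$bad" "$good" || return 1

    echo "Rebuild, rerun the server, then mark the result with:"
    echo "  git bisect good   (output is sane)"
    echo "  git bisect bad    (output is gibberish)"
    echo "Finish with: git bisect reset"
}
```

With only three commits between the endpoints, bisecting saves little over checking each commit by hand, but the same steps scale to longer ranges.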

Here is my command:

# full offload on 2x RTX A6000 48GB VRAM each
./build/bin/llama-server \
  --alias Qwen3.5-122B-A10B \
  --model "$model" \
  -fa on \
  -c 262144 \
  -sm graph \
  -ngl 99 \
  -ub 4096 -b 4096 \
  --parallel 1 \
  --threads 1 \
  --host 127.0.0.1 \
  --port 8080 \
  --jinja \
  --no-mmap

I'm not sure whether anyone else is seeing this. I'll keep testing to see if I can narrow it down further.

@ubergarm
Contributor

ubergarm commented Mar 10, 2026

Removing --no-mmap fixes the issue as discussed in the linked PR ☝️

I tested and --mmproj is working and reading images correctly!
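The workaround amounts to running the command posted earlier with `--no-mmap` dropped. A minimal sketch follows, assuming the same build path and flags as that command. The `LLAMA_SERVER` variable and the function name are hypothetical conveniences, not project conventions.

```shell
# Hypothetical wrapper: LLAMA_SERVER and run_server_with_mmap are
# illustrative names, not part of ik_llama.cpp.
LLAMA_SERVER="${LLAMA_SERVER:-./build/bin/llama-server}"

run_server_with_mmap() {
    # $1: path to the model file
    # Same flags as the original command, except that --no-mmap is
    # deliberately omitted so the model is loaded via mmap.
    "$LLAMA_SERVER" \
        --alias Qwen3.5-122B-A10B \
        --model "$1" \
        -fa on \
        -c 262144 \
        -sm graph \
        -ngl 99 \
        -ub 4096 -b 4096 \
        --parallel 1 \
        --threads 1 \
        --host 127.0.0.1 \
        --port 8080 \
        --jinja
}
```

Usage would be `run_server_with_mmap "$model"`, with `--mmproj` added to the flag list when vision input is needed.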

@ikawrakow ikawrakow mentioned this pull request Mar 10, 2026
@ikawrakow
Owner Author

@ubergarm

#1397 should fix the --no-mmap issue.

Development

Successfully merging this pull request may close these issues.

Bug: Qwen3.5 multimodal not working with --split-mode graph
