ui: Add max image size option #22849

Merged: allozaur merged 7 commits into ggml-org:master from stduhpf:cap-img-sz on May 20, 2026
Conversation

@stduhpf
Contributor

stduhpf commented May 8, 2026

Overview

Adds a way to scale down images whose size exceeds a configurable threshold before sending multimodal prompts to the server, as very high resolution images can take a very long time to encode and use a lot of memory.

Additional information

  • I refactored ChatService.convertDbMessageToApiChatMessageData() to be async, which could be a breaking change.

  • The max resolution is set as a maximum total pixel count (width*height), expressed in megapixels. If the count is 0 (or rather less than a single pixel), the feature is disabled.
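
The megapixel budget described above can be sketched as a small pure function. This is only an illustration under stated assumptions, not the code from this PR: the names `capDimensions`, `Size` and `PIXELS_PER_MEGAPIXEL` are hypothetical, and rounding down is a choice made here so the result never exceeds the budget.

```typescript
// Hypothetical helper, not the PR's actual code: names and rounding are assumptions.
const PIXELS_PER_MEGAPIXEL = 1e6;

interface Size {
  width: number;
  height: number;
}

// Return the dimensions an image should be scaled to so that
// width * height stays within the megapixel budget.
function capDimensions(size: Size, maxMegapixels: number): Size {
  const maxPixels = maxMegapixels * PIXELS_PER_MEGAPIXEL;

  // A budget of less than a single pixel means the feature is disabled.
  if (maxPixels < 1) {
    return size;
  }

  const pixels = size.width * size.height;

  // Already within budget: leave the size untouched.
  if (pixels <= maxPixels) {
    return size;
  }

  // Scaling both sides by sqrt(maxPixels / pixels) scales the area by
  // maxPixels / pixels, which preserves the aspect ratio.
  const scale = Math.sqrt(maxPixels / pixels);

  // Round down so the product cannot exceed the budget; keep at least 1 px per side.
  return {
    width: Math.max(1, Math.floor(size.width * scale)),
    height: Math.max(1, Math.floor(size.height * scale)),
  };
}
```

For example, `capDimensions({ width: 4000, height: 3000 }, 3)` scales a 12-megapixel image by 0.5 per side, giving 2000 by 1500.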

Requirements

  • I have read and agree with the contributing guidelines
  • AI usage disclosure: YES (navigating the project structure + inline completions were enabled)

Comment thread tools/ui/src/lib/utils/cap-png-size.ts Outdated
Comment thread tools/ui/src/lib/utils/cap-png-size.ts Outdated
@allozaur
Contributor

Please rebase this on the latest commit on master and resolve the conflicts.

stduhpf force-pushed the cap-img-sz branch 2 times, most recently from 808d6ed to e756328, May 16, 2026 19:36
allozaur self-assigned this May 17, 2026
allozaur changed the title from "webui: Add max image size option" to "ui: Add max image size option" May 17, 2026
Comment thread tools/ui/src/lib/services/chat.service.ts Outdated
Comment thread tools/ui/src/lib/services/chat.service.ts Outdated
Comment thread tools/ui/src/lib/utils/cap-png-size.ts Outdated
Contributor

allozaur left a comment

Last small cosmetic fixes, and then we're good to go.

Comment thread tools/ui/src/lib/utils/cap-img-size.ts Outdated
Comment thread tools/ui/src/lib/services/chat.service.ts Outdated
allozaur requested a review from ServeurpersoCom May 20, 2026 08:08
@ServeurpersoCom
Contributor

ServeurpersoCom commented May 20, 2026

Good feature. Pushing a 50 megapixel image to the server just to have it encoded there is a waste of memory and time, so capping on the client before upload is the right place to do it, and defaulting to 0 means nobody who ignores the setting sees any change. Solid.
One thing I'd tweak: The resize path runs every image through the canvas even when it's already under the threshold. An already small JPEG gets redrawn and re-exported, which is a second lossy pass on top of the compression it already had, so it loses quality with nothing gained and its EXIF is dropped. Worse, toDataURL can only output PNG, JPEG and WEBP, so a GIF falls back to PNG: same pixels, but the file ends up heavier than what came in, the opposite of what this feature is for (and the server decodes GIF natively anyway through stb_image, alongside jpeg, png, bmp, tga and others, so there's no compatibility reason to convert it). An early return of the original data URL when the pixel count is already within budget keeps it touching only the images that actually need shrinking.
Small change, the rest is good!
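
The early return suggested in this comment could look roughly like the sketch below. It is a hedged illustration, not the change that was merged: `planImageCap`, `CapPlan`, `dataUrlMimeType` and `CANVAS_EXPORT_TYPES` are hypothetical names, the export format list simply follows the formats named in the comment, and the actual canvas redraw is left out so the decision logic stays self-contained.

```typescript
// Hypothetical sketch: all names here are illustrative, not taken from the PR.
// Export formats as listed in the review comment above.
const CANVAS_EXPORT_TYPES: readonly string[] = ['image/png', 'image/jpeg', 'image/webp'];

type CapPlan =
  | { action: 'keep' }
  | { action: 'resize'; mimeType: string };

// Read the MIME type from a base64 data URL; null if the string is not one.
function dataUrlMimeType(dataUrl: string): string | null {
  const match = /^data:([^;,]+);base64,/.exec(dataUrl);
  return match ? match[1].toLowerCase() : null;
}

// Decide what to do with an image before upload.
function planImageCap(
  dataUrl: string,
  width: number,
  height: number,
  maxPixels: number
): CapPlan {
  // Feature disabled, or image already within budget: return early so the
  // original data URL is sent as-is (no second lossy pass, no format change).
  if (maxPixels < 1 || width * height <= maxPixels) {
    return { action: 'keep' };
  }

  // Keep the source format when the canvas can export it.
  const mime = dataUrlMimeType(dataUrl);
  if (mime !== null && CANVAS_EXPORT_TYPES.indexOf(mime) !== -1) {
    return { action: 'resize', mimeType: mime };
  }

  // Anything else (for example GIF) falls back to PNG.
  return { action: 'resize', mimeType: 'image/png' };
}
```

With this shape, a small GIF yields `{ action: 'keep' }` and is never converted, while an oversized GIF is resized and exported as PNG.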

Contributor

ServeurpersoCom left a comment

LGTM

@CISC
Member

CISC commented May 20, 2026

> An already small JPEG gets redrawn and re-exported, which is a second lossy pass on top of the compression it already had, so it loses quality with nothing gained and its EXIF is dropped.

BTW, in case this was forgotten: #20870

allozaur merged commit 3a479c9 into ggml-org:master May 20, 2026
5 of 6 checks passed
ProTekk pushed a commit to ProTekk/buun-llama-cpp that referenced this pull request May 21, 2026
* webui: Add max image size option

* remove magic numbers

* support all image formats

* use const

* Move regex to match b64 images to constants

* use SETTINGS_KEYS to get max image resolution setting

* Do not touch the image if already under the size threshold
dbrain pushed a commit to dbrain/hbd-llama-cpp-turboquant that referenced this pull request May 21, 2026
gabe-l-hart added a commit to gabe-l-hart/llama.cpp that referenced this pull request May 21, 2026
* origin/master: (138 commits)
fix(flash-attn): replace f32 with kv_type and q_type (ggml-org#23372)
tests : move save-load-state from examples to tests (ggml-org#23336)
server: expose prompt token counts in /slots endpoint (ggml-org#23454)
metal : optimize concat kernel and fix set kernel threads (ggml-org#23411)
server : free draft/MTP resources on sleep to fix VRAM leak (ggml-org#23461)
server: re-inject subcommand when router spawns children under unified binary (ggml-org#23442)
app : add batched-bench, fit-params, quantize & perplexity (ggml-org#23459)
mtp: use inp_out_ids for skipping logit computation (ggml-org#23433)
vocab : add Carbon-3B (HybridDNATokenizer) support (ggml-org#23410)
doc: fix spec mtp typo (ggml-org#23435)
ui: Improve Git Hooks for UI development (ggml-org#23403)
ggml : Check the right iface method before using the fallback 2d get (ggml-org#23306)
llama-graph: fix null-buffer crash in llm_graph_input_attn_kv_iswa for SWA-only models (ggml-org#23131)
hexagon: ssm-conv fix for large prompts (ggml-org#23307)
app : show version (ggml-org#23426)
mtmd, model : merge HunyuanOCR into HunyuanVL and fix OCR vision precision (ggml-org#23329)
ui: Add max image size option (ggml-org#22849)
Move to backend sampling for MTP draft path (ggml-org#23287)
opencl: refactor backend initilization (ggml-org#23318)
common/speculative : fix nullptr crash in get_devices_str (ggml-org#23386)
...
baramofme pushed a commit to baramofme/llama-cpp-turboquant that referenced this pull request May 23, 2026
Jcfunk added a commit to Jcfunk/llama.cpp that referenced this pull request May 23, 2026
* upstream/HEAD: (38 commits)
  vocab : add Carbon-3B (HybridDNATokenizer) support (ggml-org#23410)
  doc: fix spec mtp typo (ggml-org#23435)
  ui: Improve Git Hooks for UI development (ggml-org#23403)
  ggml : Check the right iface method before using the fallback 2d get (ggml-org#23306)
  llama-graph: fix null-buffer crash in llm_graph_input_attn_kv_iswa for SWA-only models (ggml-org#23131)
  hexagon: ssm-conv fix for large prompts (ggml-org#23307)
  app : show version (ggml-org#23426)
  mtmd, model : merge HunyuanOCR into HunyuanVL and fix OCR vision precision (ggml-org#23329)
  ui: Add max image size option (ggml-org#22849)
  Move to backend sampling for MTP draft path (ggml-org#23287)
  opencl: refactor backend initilization (ggml-org#23318)
  common/speculative : fix nullptr crash in get_devices_str (ggml-org#23386)
  mtmd : DeepSeek-OCR image processing fixes, img_tool::resize padding refactor (ggml-org#23345)
  vulkan: optimize operations in the IM2COL shader (ggml-org#22685)
  feat: Add WAV MIME type variants and improve audio format detection (ggml-org#23396)
  hexagon: HMX quantized matmul rework (ggml-org#23368)
  Programmatic Dependent Launch (PDL) for more performance on newer NVIDIA GPUs (Hopper+) (ggml-org#22522)
  app : introduce the llama unified executable (ggml-org#23296)
  refactor: Move text attachments up before the message content in chat completions payload (ggml-org#23406)
  mtmd: fit_params now take into account mmproj (ggml-org#21489)
  ...
srossitto79 pushed a commit to srossitto79/llama.cpp that referenced this pull request May 23, 2026
fewtarius pushed a commit to fewtarius/llama.cpp that referenced this pull request May 30, 2026
turbo-tan pushed a commit to turbo-tan/llama.cpp-tq3 that referenced this pull request Jun 2, 2026
4 participants