fix: Autoscroll detection by allozaur · Pull Request #23026 · ggml-org/llama.cpp

allozaur · 2026-05-13T21:47:28Z

Overview

Fixes a regression after #22977 that made it impossible to disable autoscroll by scrolling up while a response was streaming.
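For readers unfamiliar with the problem, here is a rough sketch of how this kind of autoscroll detection is commonly structured. It is not the code from this PR: `ScrollMetrics`, `isNearBottom`, `nextAutoscrollState` and the 10 px threshold are all hypothetical names and values chosen for illustration. The idea is that an upward change in `scrollTop` should turn autoscroll off, while content growth during streaming (which only changes `scrollHeight`) should not.

```typescript
// Illustrative sketch only; every name and constant here is an assumption,
// not taken from the llama.cpp webui source.
interface ScrollMetrics {
  scrollTop: number;    // current vertical scroll offset
  scrollHeight: number; // total height of the scrollable content
  clientHeight: number; // visible height of the container
}

// Assumed tolerance for treating the viewport as "at the bottom".
const BOTTOM_THRESHOLD_PX = 10;

function isNearBottom(m: ScrollMetrics, threshold: number = BOTTOM_THRESHOLD_PX): boolean {
  return m.scrollHeight - m.scrollTop - m.clientHeight <= threshold;
}

// Computes the next autoscroll flag from the previous flag, the previous
// scrollTop, and the metrics observed in the current scroll event.
function nextAutoscrollState(
  enabled: boolean,
  prevScrollTop: number,
  m: ScrollMetrics
): boolean {
  // An upward movement is treated as user intent and always disables autoscroll.
  if (m.scrollTop < prevScrollTop) {
    return false;
  }
  // Reaching the bottom again re-enables it.
  if (isNearBottom(m)) {
    return true;
  }
  // Otherwise (for example, content grew while streaming) keep the current state.
  return enabled;
}
```

In a real component the previous `scrollTop` would be stored between scroll events and the returned flag would gate the scroll-to-bottom call made as new tokens arrive.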

Requirements

I have read and agree with the contributing guidelines
AI usage disclosure: YES, code changes analysis

@arthw

commit dbe7901
Author: Ruben Ortlam <rortlam@redhat.com>
Date: Thu May 14 10:36:54 2026 +0200

vulkan: fix matmul integer pipeline selection (ggml-org#23005)

* vulkan: fix matmul integer pipeline selection
* gate pipeline creation with the right bools

commit 320a6a4
Author: Aleksander Grygier <aleksander.grygier@gmail.com>
Date: Thu May 14 08:09:29 2026 +0200

fix: Autoscroll detection (ggml-org#23026)

commit 9ed6e19
Author: Katostrofik <georgiopapairo@gmail.com>
Date: Thu May 14 01:39:14 2026 -0400

SYCL: fix multi-GPU system RAM exhaustion by using Level Zero allocations (ggml-org#21597)

* SYCL: fix multi-GPU system RAM exhaustion by using Level Zero allocations

Replace sycl::malloc_device with zeMemAllocDevice for GPU memory allocation in the SYCL backend. sycl::malloc_device triggers the xe kernel driver's DMA-buf/TTM path which mirrors every VRAM allocation 1:1 in system RAM. zeMemAllocDevice uses the SVM/P2P path with no host staging.

On a dual Intel Arc Pro B70 system (64GB VRAM, 64GB RAM), a 15.6 GiB model consumed 60 GiB of system RAM via sycl::malloc_device, causing OOM crashes. With zeMemAllocDevice, the same workload uses ~6.7 GiB of system RAM with no performance regression.

All Level Zero calls include automatic fallback to the original SYCL allocation path if Level Zero interop is unavailable.

* SYCL: address review feedback - remove try/catch, check device types, deduplicate

- Remove try/catch from malloc/free/memcpy helpers, check backend and device type upfront instead (ggml_sycl_is_level_zero, ggml_sycl_is_dgpu)
- Move shared helpers (is_level_zero, is_dgpu, free_device) to common.cpp and declare in common.hpp to eliminate code duplication
- Use SYCL_CHECK(CHECK_TRY_ERROR()) for fallback sycl::free calls
- Guard dev2dev_memcpy L0 path to dGPU-to-dGPU only, preserving the host-staged path for iGPU-to-dGPU transfers
- Add Windows Level Zero SDK path detection (LEVEL_ZERO_V1_SDK_PATH) in CMakeLists.txt (co-authored with @arthw)

* SYCL: add build/runtime flags for Level Zero, address review feedback

Implements the architecture suggested by @arthw: compile-time and runtime flags to cleanly separate Level Zero and SYCL memory API paths.

- Add GGML_SYCL_SUPPORT_LEVEL_ZERO cmake option (default ON). All Level Zero code is wrapped in #ifdef so the build works on systems without the Level Zero SDK installed (e.g. CPU-only CI servers). Both the loader library and headers are checked before enabling.
- Add GGML_SYCL_ENABLE_LEVEL_ZERO runtime env var (default 1). Controls whether Level Zero or SYCL memory APIs are used. Only one API style is used per session, no mixing. If Level Zero is enabled but the devices don't support the Level Zero backend, it auto-disables with a warning.
- Remove Level Zero code from dpct_malloc. It was unused (dpct::device_memory is not called anywhere in the backend) and used try/catch for flow control.
- Update SYCL.md with documentation for both new parameters.

Tested on Intel Arc Pro B70 (32GB), single-GPU and dual-GPU, with both GGML_SYCL_SUPPORT_LEVEL_ZERO=ON and OFF builds.

AI-assisted development (Claude). Code reviewed and tested on my hardware.

* SYCL: unify Level Zero malloc/free call sites, address review feedback

Move ggml_sycl_malloc_device to common.cpp alongside ggml_sycl_free_device. Both functions are now unconditionally available; Level Zero code is uniform SYCL_CHECK(CHECK_TRY_ERROR()) wrapping with no #ifdef blocks.

Addresses arthw's review: wrap all malloc/free in SYCL_CHECK for stack traces on failure, eliminate duplicated #ifdef/else patterns at 6 call sites (-29 lines net).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* SYCL: add Level Zero SDK to CI, fix device check and missed alloc paths

Add Level Zero SDK installation to Ubuntu and Windows SYCL CI jobs so the Level Zero code path is compiled and tested in CI.

Fix two bugs found during extended dual-GPU testing (no ONEAPI_DEVICE_SELECTOR set):

- The Level Zero backend check was iterating all SYCL devices including CPU. The OpenCL CPU device caused Level Zero to be disabled for the GPUs, defeating the fix on multi-GPU systems. Added is_gpu() filter so only GPU devices are checked.
- sycl_ext_malloc_device/sycl_ext_free (tensor reorder temp buffers) were still calling sycl::malloc/sycl::free directly, bypassing the Level Zero path. Routed through ggml_sycl_malloc_device/free_device for consistency with the other device memory call sites.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* SYCL: address arthw review feedback on Level Zero memory API structure

- Move ggml_sycl_malloc_device to static function in ggml-sycl.cpp; only ggml_sycl_free_device (used by common.cpp) stays in common.cpp
- Switch both helpers to use g_ggml_sycl_enable_level_zero global instead of per-call queue backend checks
- Remove #ifdef wrapper from global definition; always declare at 0, add #else branch in init block so it stays 0 when L0 not compiled in
- Update init loop comment to explain GPU-only device check
- CMakeLists: message(STATUS) before the if block; align option wording

AI-assisted implementation. Reviewed and tested on dual Intel Arc Pro B70 (32 GB each): test-backend-ops OK on both GPUs, single/dual-GPU Q4_K_M and Q8_0 bench correct, zeMemAllocDevice GTT delta confirmed <5 MiB per 4 GiB allocation (vs ~4 GiB shadow with sycl::malloc_device).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* SYCL: remove unused cstdio/cstdlib includes from common.cpp

Leftover from the deleted ggml_sycl_queue_supports_level_zero helper.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* Apply suggestions from code review

Co-authored-by: Neo Zhang <zhang.jianyu@outlook.com>

* SYCL: preserve Level Zero allocation path during early malloc
* ci: fix Level Zero package conflict in Intel Docker build
* ci: find Level Zero loader in oneAPI package step
* ci: allow Windows SYCL package without Level Zero DLL

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Neo Zhang <zhang.jianyu@outlook.com>

fix: Autoscroll detection

063ca46

allozaur requested review from ServeurpersoCom and ggerganov May 13, 2026 21:47

allozaur requested a review from a team as a code owner May 13, 2026 21:47

Copilot AI review requested due to automatic review settings May 13, 2026 21:47

ServeurpersoCom approved these changes May 13, 2026

github-actions Bot added server/webui examples server labels May 13, 2026

Copilot started reviewing on behalf of allozaur May 14, 2026 01:34

ggerganov approved these changes May 14, 2026

ServeurpersoCom merged commit 320a6a4 into ggml-org:master May 14, 2026
11 checks passed

xxmustafacooTR pushed a commit to xxPlayground/llama-cpp-turboquant that referenced this pull request May 14, 2026

fix: Autoscroll detection (ggml-org#23026)

6d63b35

allozaur deleted the allozaur/fix/auto-scroll branch May 14, 2026 17:06

dandm1 pushed a commit to dandm1/llama.cpp that referenced this pull request May 16, 2026

fix: Autoscroll detection (ggml-org#23026)

13ba037

rsenthilkumar6 pushed a commit to rsenthilkumar6/llama.cpp that referenced this pull request May 19, 2026

fix: Autoscroll detection (ggml-org#23026)

5c9a40d

ArberSephirotheca pushed a commit to ArberSephirotheca/llama.cpp that referenced this pull request May 19, 2026

fix: Autoscroll detection (ggml-org#23026)

13120fb

baramofme pushed a commit to baramofme/llama-cpp-turboquant that referenced this pull request May 23, 2026

fix: Autoscroll detection (ggml-org#23026)

980ba96

winstonma pushed a commit to winstonma/llama.cpp that referenced this pull request May 27, 2026

fix: Autoscroll detection (ggml-org#23026)

66b0bce

fewtarius pushed a commit to fewtarius/llama.cpp that referenced this pull request May 30, 2026

fix: Autoscroll detection (ggml-org#23026)

a45c6d3

ServeurpersoCom merged 1 commit into ggml-org:master from allozaur:allozaur/fix/auto-scroll

3 participants
