UPSTREAM PR #19227: Fix Issue !19219#1100

Open
loci-dev wants to merge 1 commit into main from loci/pr-19227-revert_scale_queue_env

Conversation

@loci-dev
Note

Source pull request: ggml-org/llama.cpp#19227

Hangs were reported on Jetson Orin AGX when CUDA_SCALE_LAUNCH_QUEUES=4x is set (Issue #19219). This reverts the previous PR (#19042) and updates the documentation to suggest setting CUDA_SCALE_LAUNCH_QUEUES=4x manually for faster throughput on multi-GPU systems.

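Since the variable is no longer set by default, multi-GPU users who want the old behavior can opt in per the updated documentation. A minimal sketch (the build path and llama-bench invocation are illustrative, not from this PR):

```shell
# Opt in manually on multi-GPU systems; this was reverted as a
# default because of hangs reported on Jetson Orin AGX (Issue #19219).
export CUDA_SCALE_LAUNCH_QUEUES=4x

# Then launch as usual, e.g. (path illustrative):
# ./build/bin/llama-bench -m model.gguf
```

Leaving the variable unset keeps the pre-#19042 behavior, which is the safe default on single-GPU and Jetson-class devices.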
@loci-review
loci-review bot commented Jan 31, 2026

No meaningful performance changes were detected across 115327 analyzed functions in the following binaries: build.bin.libllama.so, build.bin.llama-tts, build.bin.llama-cvector-generator, build.bin.libmtmd.so, build.bin.libggml.so, build.bin.libggml-cpu.so, build.bin.libggml-base.so, build.bin.llama-bench, build.bin.llama-gemma3-cli, build.bin.llama-gguf-split, build.bin.llama-llava-cli, build.bin.llama-minicpmv-cli, build.bin.llama-quantize, build.bin.llama-tokenize, build.bin.llama-qwen2vl-cli.

🔎 Full breakdown: Loci Inspector.
💬 Questions? Tag @loci-dev.

@loci-dev loci-dev force-pushed the main branch 26 times, most recently from 237828b to b128b33 Compare February 1, 2026 11:09
@loci-dev loci-dev force-pushed the main branch 30 times, most recently from cd152fa to ab12294 Compare February 3, 2026 11:18
