vulkan: optimize UMA buffer operations and fix driver hangs by giuseppe · Pull Request #16059 · ggml-org/llama.cpp

giuseppe · 2025-09-17T21:54:13Z

The previous implementation was blocking the GPU for extended periods, causing the i915 driver to reset the context due to the hangcheck protection.

[32628.443070] i915 0000:00:02.0: [drm] GPU HANG: ecode 12:1:85dffffb, in llama-server [194114]
[32628.443091] i915 0000:00:02.0: [drm] llama-server[194114] context reset due to GPU hang

0cc4m

I forgot to implement this for the memset function, thank you. If it would help to be able to do this asynchronously as well, we could instead implement a "deferred_memset" (see deferred_memcpy for reference), but this would require more changes.
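
For readers following along, here is a minimal sketch of the synchronous idea discussed above: when the device shares memory with the CPU (UMA) and the buffer is mapped, fill it from the host instead of submitting GPU work and waiting. `SketchBuffer`, `sketch_buffer_memset`, and the `gpu_fills_submitted` counter are hypothetical names for illustration; they are not the types or functions in ggml-vulkan.cpp, and the GPU path is only simulated.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a backend buffer (not the real ggml-vulkan type).
struct SketchBuffer {
    void * mapped = nullptr;      // non-null when the memory is host-visible and mapped
    size_t size = 0;              // buffer size in bytes
    bool uma = false;             // true when the device shares memory with the CPU
    int gpu_fills_submitted = 0;  // counts how often the (simulated) GPU path was taken
};

// Fill `size` bytes at `offset` with `value`.
// On UMA with a mapped buffer the CPU writes directly, so no GPU work is queued.
void sketch_buffer_memset(SketchBuffer & buf, size_t offset, uint8_t value, size_t size) {
    assert(offset + size <= buf.size);
    if (buf.uma && buf.mapped != nullptr) {
        std::memset(static_cast<uint8_t *>(buf.mapped) + offset, value, size);
        return;
    }
    // Assumption: a real backend would record a fill command here, submit it,
    // and wait for completion. This sketch only counts the call.
    ++buf.gpu_fills_submitted;
}
```

The host path never occupies the GPU queue, which is the property that matters for the hang described in this PR.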

giuseppe · 2025-09-18T13:53:44Z

> If it would help to be able to do this asynchronously as well, we could instead implement a "deferred_memset" (see deferred_memcpy for reference), but this would require more changes.

I've added a new commit that implements deferred_memset, similar to what is done for deferred_memcpy.
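
A rough sketch of the deferred pattern, assuming deferred_memcpy works by recording the operation in a per-context list and replaying it later on the host. `PendingMemset`, `sketch_deferred_memset`, and `sketch_flush_memsets` are hypothetical names and signatures chosen for illustration; the actual code in ggml-vulkan.cpp may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical record of a host-side fill that has been requested but not yet run.
struct PendingMemset {
    void * dst;     // destination in mapped (host-visible) memory
    uint8_t value;  // byte value to write
    size_t size;    // number of bytes to fill
};

// If a queue is supplied, record the fill for later; otherwise perform it now.
void sketch_deferred_memset(void * dst, uint8_t value, size_t size,
                            std::vector<PendingMemset> * queue = nullptr) {
    assert(dst != nullptr || size == 0);
    if (queue == nullptr) {
        std::memset(dst, value, size);
    } else {
        queue->push_back({dst, value, size});
    }
}

// Replay all recorded fills in submission order, then clear the queue.
void sketch_flush_memsets(std::vector<PendingMemset> & queue) {
    for (const PendingMemset & m : queue) {
        std::memset(m.dst, m.value, m.size);
    }
    queue.clear();
}
```

Passing no queue gives the immediate behaviour, so one helper can serve both the synchronous and asynchronous call sites. When the flush should happen relative to command submission is a backend decision that this sketch does not model.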

0cc4m

Thank you!

giuseppe requested a review from 0cc4m as a code owner September 17, 2025 21:54

github-actions bot added the Vulkan and ggml labels Sep 17, 2025

0cc4m reviewed Sep 18, 2025

Comment thread ggml/src/ggml-vulkan/ggml-vulkan.cpp

giuseppe added 2 commits September 18, 2025 16:11

vulkan: implement deferred_memset on UMA

87d3cd0

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>

giuseppe force-pushed the access-directly-mem-with-uma branch from 34ba782 to 87d3cd0 September 18, 2025 14:11

0cc4m approved these changes Sep 21, 2025

0cc4m merged commit 1eeb523 into ggml-org:master Sep 21, 2025
54 of 55 checks passed

neilopet mentioned this pull request Mar 1, 2026

vulkan: add UMA zero-copy async transfers and fix event_record deferred memcpy handling #20018

Open
