Skip to content

Bug: gpu hang after bde7cd3cd949c1a85d3a199498ac98e78039d46f #7730

@rhjdvsgsgks

Description

@rhjdvsgsgks

What happened?

after bde7cd3 . inferring any llama3 q6 model will cause a gpu hang. previous version (a5735e4) is not affected

Name and Version

bde7cd3
using vulkan backend

What operating system are you seeing the problem on?

Linux

Relevant log output

radv/amdgpu: The CS has been cancelled because the context is lost. This context is innocent.
terminate called after throwing an instance of 'vk::DeviceLostError'                   what():  vk::Queue::submit: ErrorDeviceLost

dmesg

[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring comp_1.1.0 timeout, signaled seq=898, emitted seq=899

Metadata

Metadata

Assignees

No one assigned

    Labels

    bug-unconfirmedmedium severityUsed to report medium severity bugs in llama.cpp (e.g. Malfunctioning Features but still useable)

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions