
sudo required for GPU under headless debian for Vulkan/OpenVINO builds #2581

Open
dusanpol opened this issue Nov 22, 2024 · 0 comments
Hi,

I have a headless machine running Debian 12 with an Intel i5-6500T and its integrated GPU (HD Graphics 530).
I compiled whisper and tried to run it under a regular user account, but it could not find the GPU. Running with elevated privileges (sudo) allowed both the Vulkan and OpenVINO builds to run on the GPU.

So what is needed to run under an unprivileged account?
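Not part of the original report, but a likely explanation worth checking: on Debian, the DRM device nodes under /dev/dri (which both the Vulkan and OpenVINO GPU stacks open) are normally group-owned by `render` (renderD*) and `video` (card*). Root bypasses those permissions, which would explain why sudo works. The group names below are the usual Debian defaults, not something confirmed by this log; a minimal diagnostic sketch (the actual `usermod` fix is shown as a comment since it needs root and a re-login):

```shell
# Show the DRM device nodes and their owning groups; on Debian the
# render nodes are typically group "render" and card nodes group "video".
ls -l /dev/dri 2>/dev/null || echo "no /dev/dri nodes visible"

# List the current user's groups to see whether "render"/"video" are missing.
id -nG

# If they are missing, add the user to both groups (adjust the group
# names to whatever `ls -l /dev/dri` actually showed), then log out
# and back in so the new membership takes effect:
#   sudo usermod -aG render,video "$USER"
echo "after re-login, re-check membership with: id -nG"
```

After re-logging in, `id -nG` should include the render group, and the server should then see the GPU without sudo, assuming the Intel compute/graphics drivers themselves are installed.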

Example output using OpenVINO under user account:

/home/dipi/whisper/whisper-openvino/server --port 424242 --ov-e-device GPU --model /home/dipi/whisper/whisper-models/ggml-medium.en.bin
whisper_init_from_file_with_params_no_state: loading model from '/home/dipi/whisper/whisper-models/ggml-medium.en.bin'
whisper_init_with_params_no_state: use gpu    = 1
whisper_init_with_params_no_state: flash attn = 0
whisper_init_with_params_no_state: gpu_device = 0
whisper_init_with_params_no_state: dtw        = 0
whisper_init_with_params_no_state: devices    = 1
whisper_init_with_params_no_state: backends   = 1
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51864
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 1024
whisper_model_load: n_audio_head  = 16
whisper_model_load: n_audio_layer = 24
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 1024
whisper_model_load: n_text_head   = 16
whisper_model_load: n_text_layer  = 24
whisper_model_load: n_mels        = 80
whisper_model_load: ftype         = 1
whisper_model_load: qntvr         = 0
whisper_model_load: type          = 4 (medium)
whisper_model_load: adding 1607 extra tokens
whisper_model_load: n_langs       = 99
whisper_model_load:      CPU total size =  1533.14 MB
whisper_model_load: model size    = 1533.14 MB
whisper_init_state: kv self size  =   50.33 MB
whisper_init_state: kv cross size =  150.99 MB
whisper_init_state: kv pad  size  =    6.29 MB
whisper_init_state: compute buffer (conv)   =   28.55 MB
whisper_init_state: compute buffer (encode) =  170.15 MB
whisper_init_state: compute buffer (cross)  =    7.72 MB
whisper_init_state: compute buffer (decode) =   98.18 MB
whisper_ctx_init_openvino_encoder_with_state: loading OpenVINO model from '/home/dipi/whisper/whisper-models/ggml-medium.en-encoder-openvino.xml'
whisper_ctx_init_openvino_encoder_with_state: first run on a device may take a while ...
whisper_openvino_init: path_model = /home/dipi/whisper/whisper-models/ggml-medium.en-encoder-openvino.xml, device = GPU, cache_dir = /home/dipi/whisper/whisper-models/ggml-medium.en-encoder-openvino-cache
in openvino encoder compile routine: exception: Check 'false' failed at src/inference/src/core.cpp:114:
Check '!device_map.empty()' failed at src/plugins/intel_gpu/src/plugin/plugin.cpp:539:
[GPU] Can't get DEVICE_ID property as no supported devices found or an error happened during devices query.
[GPU] Please check OpenVINO documentation for GPU drivers setup guide.

whisper_ctx_init_openvino_encoder_with_state: failed to init OpenVINO encoder from '/home/dipi/whisper/whisper-models/ggml-medium.en-encoder-openvino.xml'

whisper server listening at http://127.0.0.1:424242

Example output using OpenVINO with sudo:

sudo /home/dipi/whisper/whisper-openvino/server --port 424242 --ov-e-device GPU --model /home/dipi/whisper/whisper-models/ggml-medium.en.bin
[sudo] password for dipi: 
whisper_init_from_file_with_params_no_state: loading model from '/home/dipi/whisper/whisper-models/ggml-medium.en.bin'
whisper_init_with_params_no_state: use gpu    = 1
whisper_init_with_params_no_state: flash attn = 0
whisper_init_with_params_no_state: gpu_device = 0
whisper_init_with_params_no_state: dtw        = 0
whisper_init_with_params_no_state: devices    = 1
whisper_init_with_params_no_state: backends   = 1
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51864
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 1024
whisper_model_load: n_audio_head  = 16
whisper_model_load: n_audio_layer = 24
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 1024
whisper_model_load: n_text_head   = 16
whisper_model_load: n_text_layer  = 24
whisper_model_load: n_mels        = 80
whisper_model_load: ftype         = 1
whisper_model_load: qntvr         = 0
whisper_model_load: type          = 4 (medium)
whisper_model_load: adding 1607 extra tokens
whisper_model_load: n_langs       = 99
whisper_model_load:      CPU total size =  1533.14 MB
whisper_model_load: model size    = 1533.14 MB
whisper_init_state: kv self size  =   50.33 MB
whisper_init_state: kv cross size =  150.99 MB
whisper_init_state: kv pad  size  =    6.29 MB
whisper_init_state: compute buffer (conv)   =   28.55 MB
whisper_init_state: compute buffer (encode) =  170.15 MB
whisper_init_state: compute buffer (cross)  =    7.72 MB
whisper_init_state: compute buffer (decode) =   98.18 MB
whisper_ctx_init_openvino_encoder_with_state: loading OpenVINO model from '/home/dipi/whisper/whisper-models/ggml-medium.en-encoder-openvino.xml'
whisper_ctx_init_openvino_encoder_with_state: first run on a device may take a while ...
whisper_openvino_init: path_model = /home/dipi/whisper/whisper-models/ggml-medium.en-encoder-openvino.xml, device = GPU, cache_dir = /home/dipi/whisper/whisper-models/ggml-medium.en-encoder-openvino-cache
whisper_ctx_init_openvino_encoder_with_state: OpenVINO model loaded

whisper server listening at http://127.0.0.1:424242