[WebGPU] allow async shader compilation #25941
I actually launch ORT-WEB in a worker, so these GPU blocks appear regardless of whether it runs in a worker or on the main thread.
Do you mean that the UI responsiveness problem mentioned in #25882 is caused by the GPU being exhausted rather than by the UI thread running JavaScript?
Yes. The main problem is that model initialization triggers large GPU operations (not CPU operations on the main thread) that lock up the GPU and prevent the user interface, which is also rendered on the GPU, from being drawn. The image shows that no frames were rendered during these large GPU operations.
I think async compilation resolves the CPU-side issue where the GPU process is occupied for a long time by shader compilation. The UI thread's render commands have to wait on the GPU process until each CreateComputePipeline call finishes. With this change, CreateComputePipeline is moved onto a separate GPU thread and no longer blocks the GPU main thread, so UI commands can reach the GPU in time.
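As a rough illustration of the queueing argument in the comment above, here is a toy model rather than anything taken from ONNX Runtime or Dawn. The `Command` struct, the `RenderStartTimes` function, and all costs are hypothetical. It assumes a single in-order queue in which an inline compilation delays every command behind it, while an offloaded compilation adds no delay to that queue.

```cpp
#include <vector>

// Hypothetical command model; costs are made-up numbers, not measurements.
struct Command {
  bool is_compile;  // true: pipeline compilation, false: UI render command
  int cost_ms;      // time the command occupies the queue
};

// Returns the start time of every render command in submission order.
// offload_compile == false: compilations run inline and delay later commands.
// offload_compile == true: compilations are assumed to run on another thread
// and therefore add no delay to this queue.
std::vector<int> RenderStartTimes(const std::vector<Command>& commands,
                                  bool offload_compile) {
  std::vector<int> starts;
  int now_ms = 0;
  for (const Command& c : commands) {
    if (c.is_compile) {
      if (!offload_compile) now_ms += c.cost_ms;
      continue;
    }
    starts.push_back(now_ms);
    now_ms += c.cost_ms;
  }
  return starts;
}
```

In this model, three 4 ms render commands interleaved with a 300 ms and a 200 ms compilation start at 0, 304, and 508 ms when the compilations run inline, and at 0, 4, and 8 ms when they are offloaded.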
Yeah, that would be great |
Pull Request Overview
This PR refactors the WebGPU shader compilation to use asynchronous pipeline creation, improving application responsiveness when running in the main thread. The change replaces synchronous CreateComputePipeline with CreateComputePipelineAsync to avoid blocking while waiting for shader compilation to complete.
Key Changes
- `ProgramManager` constructor now accepts a `WebGpuContext` reference instead of separate device and limits parameters
- Shader compilation changed from synchronous to asynchronous using `CreateComputePipelineAsync` with callback-based completion handling
- Error handling added for async pipeline creation failures
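The callback-based completion and error handling listed above can be sketched in isolation. This is a minimal, single-threaded sketch and not the code from this PR: `FakeDevice`, `CreatePipelineAsync`, `ProcessEvents`, `BuildResult`, and `RequestPipeline` are hypothetical stand-ins, and the "compilation" is a placeholder string operation. It only shows the shape of the pattern: the request returns immediately, and the result or error arrives later through a callback.

```cpp
#include <functional>
#include <queue>
#include <string>
#include <utility>

// Hypothetical stand-ins for illustration; not ONNX Runtime or Dawn types.
struct Pipeline {
  std::string label;
};
enum class CompileStatus { Success, Error };
using CompileCallback =
    std::function<void(CompileStatus, Pipeline, std::string)>;

// A toy device that defers compilation until ProcessEvents() is called.
class FakeDevice {
 public:
  // Returns immediately; the callback fires later from ProcessEvents().
  void CreatePipelineAsync(std::string shader_source, CompileCallback on_done) {
    pending_.push([src = std::move(shader_source), cb = std::move(on_done)]() {
      if (src.empty()) {
        cb(CompileStatus::Error, Pipeline{}, "empty shader source");
      } else {
        cb(CompileStatus::Success, Pipeline{"compiled:" + src}, "");
      }
    });
  }

  // Runs every queued compilation and invokes its callback.
  void ProcessEvents() {
    while (!pending_.empty()) {
      std::function<void()> task = std::move(pending_.front());
      pending_.pop();
      task();
    }
  }

 private:
  std::queue<std::function<void()>> pending_;
};

// Result slot filled in by the completion callback.
struct BuildResult {
  bool done = false;
  bool ok = false;
  Pipeline pipeline;
  std::string error;
};

// Requests a pipeline and records success or failure in `result`.
// `result` must outlive the pending request.
void RequestPipeline(FakeDevice& device, const std::string& source,
                     BuildResult& result) {
  device.CreatePipelineAsync(
      source, [&result](CompileStatus status, Pipeline p, std::string message) {
        result.done = true;
        result.ok = (status == CompileStatus::Success);
        if (result.ok) {
          result.pipeline = std::move(p);
        } else {
          result.error = std::move(message);
        }
      });
}
```

A caller would issue `RequestPipeline` for each program, continue with other work, and read `BuildResult` only after `ProcessEvents()` has run. A failed compilation shows up as `ok == false` with a message in `error` instead of aborting the caller.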
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| onnxruntime/core/providers/webgpu/webgpu_context.cc | Updated ProgramManager instantiation to pass WebGpuContext reference |
| onnxruntime/core/providers/webgpu/program_manager.h | Modified constructor to accept WebGpuContext reference and updated member variables |
| onnxruntime/core/providers/webgpu/program_manager.cc | Implemented async shader compilation with CreateComputePipelineAsync and callback handling |
@microsoft-github-policy-service rerun

Description
Reduce the time blocked waiting for the shader to be compiled.
Motivation and Context
Try to optimize the responsiveness of the application when running ort-web in main thread. See #25882