@qjia7
Contributor

@qjia7 qjia7 commented Jan 7, 2026

This pull request refactors how input and output tensor shape information is stored and accessed in the WebGPU context. Instead of keeping references to the full input and output tensors, only their shapes are now stored, which helps avoid issues with accessing released tensors during profiling.
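A minimal sketch of the idea described above, in Python rather than the PR's actual code: the profiling record copies each shape when the kernel is recorded, so reading it later does not touch a tensor that may already have been released. `FakeTensor`, `PendingKernelInfo`, `record_kernel`, and the kernel name `"ExampleKernel"` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Shape = Tuple[int, ...]


class FakeTensor:
    """Hypothetical stand-in for a tensor whose storage can be released."""

    def __init__(self, shape: Shape) -> None:
        self._shape: Optional[Shape] = shape

    def release(self) -> None:
        # After release, the shape can no longer be read through the tensor.
        self._shape = None

    @property
    def shape(self) -> Shape:
        if self._shape is None:
            raise RuntimeError("tensor was released")
        return self._shape


@dataclass(frozen=True)
class PendingKernelInfo:
    """Hypothetical profiling record that owns shape copies, not tensors."""

    name: str
    input_shapes: Tuple[Shape, ...]
    output_shapes: Tuple[Shape, ...]


def record_kernel(
    name: str, inputs: Sequence[FakeTensor], outputs: Sequence[FakeTensor]
) -> PendingKernelInfo:
    # Copy the shapes now, while the tensors are still alive.
    return PendingKernelInfo(
        name=name,
        input_shapes=tuple(tuple(t.shape) for t in inputs),
        output_shapes=tuple(tuple(t.shape) for t in outputs),
    )


a = FakeTensor((1, 1, 768))
out = FakeTensor((1, 1, 200064))
info = record_kernel("ExampleKernel", [a], [out])

# Releasing the tensors does not affect the recorded shapes.
a.release()
out.release()
print(info.input_shapes, info.output_shapes)
```

Because the record holds plain tuples, it stays valid for as long as the profiler needs it, independent of tensor lifetime.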

Before
"inputs[0] = {1,1,768} inputs[1] = {200064,96,1} inputs[2] = {} outputs[0] = {} "
After
"inputs[0] = {1,1,768} inputs[1] = {200064,96,1} inputs[2] = {19206144} outputs[0] = {1,1,200064} "
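The layout of these strings can be reproduced with a short formatter. This is a sketch under assumptions, not the formatting code from the PR: `format_shapes` is a hypothetical helper, and the shapes passed in are the ones shown in the "After" line.

```python
from typing import Sequence, Tuple

Shape = Tuple[int, ...]


def format_shapes(inputs: Sequence[Shape], outputs: Sequence[Shape]) -> str:
    """Render shapes as 'inputs[i] = {d0,d1,...} ' entries (hypothetical helper)."""
    parts = []
    for label, shapes in (("inputs", inputs), ("outputs", outputs)):
        for i, shape in enumerate(shapes):
            dims = ",".join(str(d) for d in shape)
            # Each entry ends with a space, matching the strings quoted in the PR.
            parts.append(f"{label}[{i}] = {{{dims}}} ")
    return "".join(parts)


line = format_shapes(
    inputs=[(1, 1, 768), (200064, 96, 1), (19206144,)],
    outputs=[(1, 1, 200064)],
)
print(repr(line))
```

With this formatter an empty shape tuple renders as `{}`, the same text that appears for `inputs[2]` and `outputs[0]` in the "Before" line.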

@qjia7 qjia7 requested review from fs-eire and guschmue January 7, 2026 10:05
@qjia7 qjia7 marked this pull request as ready for review January 7, 2026 10:06
@guschmue guschmue added the ep:WebGPU ort-web webgpu provider label Jan 7, 2026
guschmue previously approved these changes Jan 7, 2026
Contributor

@fs-eire fs-eire left a comment

It looks like the overridden shape is not preserved. Is this OK for the profiling?

@qjia7
Contributor Author

qjia7 commented Jan 8, 2026

> It looks like the overridden shape is not preserved. Is this OK for the profiling?

I did this intentionally. The overridden shape isn’t very straightforward. Or should we keep both versions?

@xiaofeihan1
Contributor

> > It looks like the overridden shape is not preserved. Is this OK for the profiling?

> I did this intentionally. The overridden shape isn’t very straightforward. Or should we keep both versions?

I think in this PR you can simply fix the empty-shape issue and continue using the overridden shape.

Note: The input shape of each operator can be viewed in the CPU event. However, the program of each operator is not always displayed. For example, GQA is shown, but operators like CopyKVCache, FlashAttentionDecodeQKT, etc., are not shown.

@qjia7
Contributor Author

qjia7 commented Jan 12, 2026

> > It looks like the overridden shape is not preserved. Is this OK for the profiling?

> I did this intentionally. The overridden shape isn’t very straightforward. Or should we keep both versions?

Keeping both versions would produce many duplicate shapes when use_override_shape is false, which is not a good approach either. I have restored the override shape in place of the original shape, since it is the actual shape used in the shader.
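The selection rule described here can be sketched as follows. This is an illustrative assumption, not the PR's code: only the `use_override_shape` flag is named in the discussion, while `ProgramInput`, `tensor_shape`, `override_shape`, and `profiled_shape` are hypothetical names, and the shapes are made-up values.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Shape = Tuple[int, ...]


@dataclass(frozen=True)
class ProgramInput:
    """Hypothetical description of one shader input."""

    tensor_shape: Shape                    # shape of the underlying tensor
    use_override_shape: bool = False       # flag named in the discussion
    override_shape: Optional[Shape] = None  # shape the shader actually sees


def profiled_shape(inp: ProgramInput) -> Shape:
    # Report the override shape when one is in effect, since that is the
    # shape the shader uses; otherwise fall back to the tensor's own shape.
    if inp.use_override_shape and inp.override_shape is not None:
        return inp.override_shape
    return inp.tensor_shape


plain = ProgramInput(tensor_shape=(4, 8))
flattened = ProgramInput(
    tensor_shape=(4, 8), use_override_shape=True, override_shape=(32,)
)
print(profiled_shape(plain), profiled_shape(flattened))
```

Reporting a single shape per input avoids the duplicate entries that logging both versions would produce whenever no override is set.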

@qjia7 qjia7 requested review from fs-eire and guschmue January 13, 2026 00:19
@qjia7 qjia7 merged commit aeac757 into main Jan 14, 2026
91 checks passed
@qjia7 qjia7 deleted the fix_profiling branch January 14, 2026 06:30
Labels

ep:WebGPU ort-web webgpu provider

5 participants