[WIP][webgpu] Apply dp4a for generation shader #23585

qjia7 · 2025-02-05T09:03:01Z

Description

This PR applies DP4A to the generation shader. It follows the same approach as the previous generation shader but uses different data types.
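For readers unfamiliar with DP4A, the operation can be sketched as below. This is a minimal CPU emulation in Python, not the shader code from this PR: it assumes DP4A means a dot product of four signed 8-bit lanes packed into one 32-bit word, accumulated into a 32-bit integer (roughly what the WGSL built-in `dot4I8Packed` computes). The helper names `pack4_i8`, `dp4a`, and `dot_via_dp4a` are hypothetical and exist only for illustration.

```python
import struct


def pack4_i8(values):
    """Pack four signed 8-bit integers into one unsigned 32-bit word.

    Lane 0 goes into the lowest byte (little-endian lane order is an
    assumption made for this sketch).
    """
    assert len(values) == 4
    return struct.unpack("<I", struct.pack("<4b", *values))[0]


def dp4a(a_packed, b_packed, acc=0):
    """Emulate DP4A: multiply four int8 lanes pairwise, sum, add to acc."""
    a = struct.unpack("<4b", struct.pack("<I", a_packed))
    b = struct.unpack("<4b", struct.pack("<I", b_packed))
    return acc + sum(x * y for x, y in zip(a, b))


def dot_via_dp4a(a, b):
    """Dot product of two int8 vectors whose length is a multiple of 4."""
    assert len(a) == len(b) and len(a) % 4 == 0
    acc = 0
    for i in range(0, len(a), 4):
        acc = dp4a(pack4_i8(a[i:i + 4]), pack4_i8(b[i:i + 4]), acc)
    return acc
```

The point of the packed form is that one instruction handles four multiply-adds on 8-bit data, which is why switching the data type matters for the shader.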

In easy mode: NV 42 tokens/s -> 45 tokens/s, Meteor Lake 19 tokens/s -> 20 tokens/s

Without this PR
NV

| Kernel | Time (ms) | Percentage (%) |
| --- | --- | --- |
| MatMulNBits | 14.28 | 71.49 |

Meteor Lake

| Kernel | Time (ms) | Percentage (%) |
| --- | --- | --- |
| MatMulNBits | 34.74 | 79.93 |

With this PR
NV

| Kernel | Time (ms) | Percentage (%) |
| --- | --- | --- |
| MatMulNBits\|DP4AMatMulNBitsSmallMProgram | 11.24 | 71.38 |

Meteor Lake

| Kernel | Time (ms) | Percentage (%) |
| --- | --- | --- |
| MatMulNBits\|DP4AMatMulNBitsSmallMProgram | 31.71 | 76.94 |
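To put the numbers above on a common scale, the relative changes can be worked out as follows. This is only arithmetic on the figures quoted in this description; the function name `pct_change` and the dictionary layout are hypothetical, and no new measurements are implied.

```python
def pct_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# (before, after) pairs copied from the description above.
throughput_tokens_per_s = {"NV": (42, 45), "Meteor Lake": (19, 20)}
kernel_time_ms = {"NV": (14.28, 11.24), "Meteor Lake": (34.74, 31.71)}

for device, (before, after) in throughput_tokens_per_s.items():
    print(f"{device}: throughput {pct_change(before, after):+.1f}%")

for device, (before, after) in kernel_time_ms.items():
    print(f"{device}: MatMulNBits kernel time {pct_change(before, after):+.1f}%")
```

On these figures the kernel time drops by roughly 21% on NV and 9% on Meteor Lake, while end-to-end throughput improves by roughly 7% and 5% respectively.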

But the result is still not correct

qjia7 · 2025-03-17T09:13:27Z

Replaced by #24064

qjia7 added 7 commits February 7, 2025 10:06

apply dp4a for generation shader

b99c836

increase tile size to 32

ae07a8f

write 8x4 data

fa9d58c

load multiple tile size A

5deaac4

Try subgroupShuffle A

8afbe25

Add subgroupShuffle B version which is similar to prefill shader

eebae4c

But the result is still not correct

Test

bcb37fe

qjia7 force-pushed the matmulnbits-generation branch from 6b5f15e to bcb37fe Compare February 8, 2025 02:21

qjia7 added 6 commits February 8, 2025 11:10

reduce workgroup size to 64

82b3cfd

use components = 4 for B

29176a4

nits

19d28ec

check K boundary

577e116

remove unused code

502c3ca

Add back accuracy_level

0bb529a

guschmue added the ep:WebGPU ort-web webgpu provider label Feb 10, 2025

qjia7 closed this Mar 17, 2025
