optimize gdn decode bf16 state kernel for mtp with caching. #3127
Draft
ameynaik-hub wants to merge 4 commits into flashinfer-ai:main from