
Conversation

@hyukn (Collaborator) commented Apr 23, 2025

  • Replace the pre-defined bucket sizes with a generating function based on tune_max_num_tokens in fused MoE op tuning (see the sketch below).
  • Add logic to free the workspace memory in the min_latency_mode fused MoE path.
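
For context on the first item, here is a minimal, hypothetical sketch of what a bucket-generating function driven by tune_max_num_tokens could look like. The function name `generate_bucket_sizes` and the power-of-two growth rule are assumptions for illustration only, not the exact code in this PR.

```python
def generate_bucket_sizes(tune_max_num_tokens: int) -> list[int]:
    """Hypothetical sketch: derive tuning buckets from tune_max_num_tokens
    instead of a fixed pre-defined list, so no bucket exceeds the configured
    maximum and the memory reserved during tuning stays bounded."""
    buckets = []
    num_tokens = 1
    while num_tokens < tune_max_num_tokens:
        buckets.append(num_tokens)
        num_tokens *= 2  # power-of-two growth is an assumption, not necessarily the PR's rule
    buckets.append(tune_max_num_tokens)
    return buckets


# e.g. tune_max_num_tokens=8192 -> [1, 2, 4, ..., 4096, 8192]
print(generate_bucket_sizes(8192))
```

The second item would plausibly amount to dropping the reference to the min_latency_mode workspace tensor once profiling finishes (for example, `del workspace`), so the caching allocator can reclaim the memory; the actual call site in the fused MoE path may differ.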

Signed-off-by: Yukun He <[email protected]>
@hyukn hyukn requested review from HuiGao-NV and litaotju April 23, 2025 03:01
@hyukn hyukn self-assigned this Apr 23, 2025
@hyukn (Collaborator, Author) commented Apr 23, 2025

/bot run

@hyukn hyukn changed the title from "Reduce memory usage in fused moe op associated with AutoTuning." to "fix: Reduce memory usage in fused moe op associated with AutoTuning." on Apr 23, 2025
@tensorrt-cicd (Collaborator)

PR_Github #3119 [ run ] triggered by Bot

@hyukn (Collaborator, Author) commented Apr 23, 2025

/bot kill

@hyukn (Collaborator, Author) commented Apr 23, 2025

Retargeted to release/0.19 via PR #3793, so this PR is being closed.

@hyukn hyukn closed this Apr 23, 2025
@tensorrt-cicd (Collaborator)

PR_Github #3119 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #2172 completed with status: 'SUCCESS'

@hyukn (Collaborator, Author) commented Apr 23, 2025

/bot kill

@hyukn hyukn deleted the fix/reduce_autotune_mem_usage branch May 20, 2025 10:46