Commit 34a0e96

[Kernel] changing fused moe kernel chunk size default to 32k (vllm-project#7995)
1 parent 80c7b08 commit 34a0e96

1 file changed, +1 -1 lines changed

vllm/envs.py

@@ -352,7 +352,7 @@ def get_default_config_root():
             os.path.join(get_default_cache_root(), "vllm", "xla_cache"),
         )),
     "VLLM_FUSED_MOE_CHUNK_SIZE":
-    lambda: int(os.getenv("VLLM_FUSED_MOE_CHUNK_SIZE", "65536")),
+    lambda: int(os.getenv("VLLM_FUSED_MOE_CHUNK_SIZE", "32768")),
 
     # If set, vllm will skip the deprecation warnings.
     "VLLM_NO_DEPRECATION_WARNING":
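
Note: the changed line only affects the fallback used when VLLM_FUSED_MOE_CHUNK_SIZE is unset; an explicitly exported value still takes precedence. The following is a minimal standalone sketch of that lookup, not code from this commit. The helper name `fused_moe_chunk_size` is hypothetical and simply mirrors the lambda in the diff.

```python
import os


def fused_moe_chunk_size() -> int:
    # Hypothetical helper mirroring the lambda in the diff:
    # read the variable, fall back to the new default of 32768 (32 * 1024).
    return int(os.getenv("VLLM_FUSED_MOE_CHUNK_SIZE", "32768"))


# With the variable unset, the new default applies.
os.environ.pop("VLLM_FUSED_MOE_CHUNK_SIZE", None)
default_size = fused_moe_chunk_size()

# Exporting the variable restores the previous value of 65536.
os.environ["VLLM_FUSED_MOE_CHUNK_SIZE"] = "65536"
overridden_size = fused_moe_chunk_size()

print(default_size, overridden_size)
```

Because the value is passed through int(), a non-numeric setting would raise a ValueError at lookup time rather than silently falling back to the default.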
