
Conversation

@bnellnm (Collaborator) commented Sep 20, 2025

Purpose

Add documentation describing the features supported by each MoE kernel (modular or non-modular) and each PrepareAndFinalize class.

Test Plan

N/A

Test Result

N/A

cc @tlrmchlsmth, @robertgshaw2-redhat, @mgoin, @varun-sundar-rabindranath, @simon-mo, @WoosukKwon

@mergify mergify bot added the documentation (Improvements or additions to documentation) label Sep 20, 2025
@bnellnm bnellnm marked this pull request as ready for review September 23, 2025 21:13
@varun-sundar-rabindranath (Contributor)

Nice. Thanks, Bill. The vllm/docs/design/fused_moe_modular_kernel.md doc tries to list all the available prepare-finalize and fused-experts implementations. Do you think we could remove that part and link this features doc there?

@varun-sundar-rabindranath (Contributor) left a comment

Reviewed the content. Thanks Bill!

@hmellor (Member) commented Sep 24, 2025

Preview available here https://vllm--25297.org.readthedocs.build/en/25297/design/moe_kernel_features.html

2 things:

  • Not all of the API cross references appear to be correct
  • The tables are quite wide which hurts readability a little. There are CSS tricks we can do to make them more compact but I'm not sure that'll be enough

@bnellnm (Collaborator, Author) commented Sep 25, 2025

> Preview available here https://vllm--25297.org.readthedocs.build/en/25297/design/moe_kernel_features.html
>
> 2 things:
>
>   • Not all of the API cross references appear to be correct
>   • The tables are quite wide which hurts readability a little. There are CSS tricks we can do to make them more compact but I'm not sure that'll be enough

I think I've fixed all the links. One is for a PR that hasn't landed yet, so it will not work.
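
(For reference, the cross references in these tables use the form shown in the patch further down the thread, with the fully qualified Python path as the link target, e.g.

[`TritonExperts`][vllm.model_executor.layers.fused_moe.fused_moe.TritonExperts]

so a link shows up as broken whenever that dotted path no longer resolves to an actual object in the vLLM source tree.)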

@bnellnm bnellnm requested a review from hmellor September 25, 2025 20:55
@hmellor (Member) commented Sep 29, 2025

> One is for a PR that hasn't landed yet, so it will not work.

That entry should be added in that PR. Including a broken link will cause the docs build to fail on main.


The tables are still very wide. Could you try something like

<style>
td:not(:first-child) {
  text-align: center !important;
}
td {
  padding: 0.5rem !important;
  white-space: nowrap;
}
th {
  padding: 0.5rem !important;
  min-width: 0 !important;
}
th:not(:first-child) {
  writing-mode: vertical-lr;
  transform: rotate(180deg);
}
</style>
that we use for other wide tables?
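
For what it's worth, the trick in that block is that `writing-mode: vertical-lr` plus `rotate(180deg)` turns the header text sideways (reading bottom to top), so each non-first column only needs to be about one line-height wide, while `white-space: nowrap` and the reduced padding keep the data cells compact.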

@bnellnm bnellnm requested a review from hmellor September 29, 2025 15:03
@bnellnm (Collaborator, Author) commented Sep 29, 2025

The <style> section is triggering the linter. I'm not sure how to fix it (or whether it can be fixed, since I copied it directly from README.md).
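
(If it's markdownlint complaining about inline HTML (rule MD033), one possible workaround, sketched here without having checked vLLM's actual lint setup, would be to wrap the block in disable/enable comments:

<!-- markdownlint-disable MD033 -->
<style>
...
</style>
<!-- markdownlint-enable MD033 -->

assuming the project's config doesn't handle that rule some other way.)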

@bnellnm (Collaborator, Author) commented Sep 29, 2025

@hmellor, I think I've addressed all the comments. Can you take another look when you get a chance?

@hmellor (Member) commented Sep 29, 2025

Thanks for making the changes so far. I'm going to have a look locally to see if I can make the tables render nicely.

@hmellor (Member) commented Sep 29, 2025

I can't push changes directly to this PR because it comes from an organisation's fork. Please apply the following patch:

fix.patch
diff --git a/docs/design/fused_moe_modular_kernel.md b/docs/design/fused_moe_modular_kernel.md
index f865b764e..ee5701989 100644
--- a/docs/design/fused_moe_modular_kernel.md
+++ b/docs/design/fused_moe_modular_kernel.md
@@ -242,8 +242,8 @@ Example: `python3 -m tests.kernels.moe.modular_kernel_tools.profile_modular_kern
 
 ## FusedMoEPrepareAndFinalize Implementations
 
-See [Fused MoE Kernel features](./moe_kernel_features.md#Fused-MoE-Modular-All2All-backends) for a list of all the available modular prepare and finalize subclasses.
+See [Fused MoE Kernel features](./moe_kernel_features.md#fused-moe-modular-all2all-backends) for a list of all the available modular prepare and finalize subclasses.
 
 ## FusedMoEPermuteExpertsUnpermute
 
-See [Fused MoE Kernel features](./moe_kernel_features.md#Fused-MoE-Experts-Kernels) for a list of all the available modular experts.
+See [Fused MoE Kernel features](./moe_kernel_features.md#fused-moe-experts-kernels) for a list of all the available modular experts.
diff --git a/docs/design/moe_kernel_features.md b/docs/design/moe_kernel_features.md
index 6e2727a57..6f3210fa7 100644
--- a/docs/design/moe_kernel_features.md
+++ b/docs/design/moe_kernel_features.md
@@ -19,9 +19,6 @@ Certain models require the topk weights to be applied to the input activations r
 unless otherwise specified, backends are controlled via `VLLM_ALL2ALL_BACKEND`.  All backends except `flashinfer` only work with EP+DP or EP+TP. `Flashinfer` can work with EP or DP w/o EP.
 
 <style>
-td:not(:first-child) {
-  text-align: center !important;
-}
 td {
   padding: 0.5rem !important;
   white-space: nowrap;
@@ -31,11 +28,6 @@ th {
   padding: 0.5rem !important;
   min-width: 0 !important;
 }
-
-th:not(:first-child) {
-  writing-mode: vertical-lr;
-  transform: rotate(180deg)
-}
 </style>
 
 | Backend                               | Output act. format | Quant. types    | Quant. format          | Async | Apply Weight On Input | Sub-class                                                                                                                                                     |
@@ -44,27 +36,25 @@ th:not(:first-child) {
 | pplx                                  | batched            | fp8,int8        | G,A,T                  | Y     | Y                     | [`PplxPrepareAndFinalize`][vllm.model_executor.layers.fused_moe.pplx_prepare_finalize.PplxPrepareAndFinalize]                                                 |
 | deepep_high_throughput                | standard           | fp8             | G(128),A,T<sup>2</sup> | Y     | Y                     | [`DeepEPLLPrepareAndFinalize`][vllm.model_executor.layers.fused_moe.deepep_ll_prepare_finalize.DeepEPLLPrepareAndFinalize]                                    |
 | deepep_low_latency                    | batched            | fp8             | G(128),A,T<sup>3</sup> | Y     | Y                     | [`DeepEPHTPrepareAndFinalize`][vllm.model_executor.layers.fused_moe.deepep_ht_prepare_finalize.DeepEPHTPrepareAndFinalize]                                    |
-| flashinfer_all2allv                   | standard           | nvfp4,fp8       | G,A,T                  | N     | N                     | [`FlashInferAllToAllMoEPrepareAndFinalize`][vllm.model_executor.layers.fused_moe.flashinfer_cutlass_prepare_finalize.FlashInferAllToAllMoEPrepareAndFinalize] |
 | flashinfer<sup>4</sup>                | standard           | nvfp4,fp8       | G,A,T                  | N     | N                     | [`FlashInferCutlassMoEPrepareAndFinalize`][vllm.model_executor.layers.fused_moe.flashinfer_cutlass_prepare_finalize.FlashInferCutlassMoEPrepareAndFinalize]   |
 | flashinfer<sup>4</sup>                | standard           | nvfp4,fp8       | G,A,T                  | N     | N                     | [`FlashInferCutlassMoEPrepareAndFinalize`][vllm.model_executor.layers.fused_moe.flashinfer_cutlass_prepare_finalize.FlashInferCutlassMoEPrepareAndFinalize]   |
 | MoEPrepareAndFinalizeNoEP<sup>5</sup>    | standard           | fp8,int8        | G,A,T                  | N     | Y                     | [`MoEPrepareAndFinalizeNoEP`][vllm.model_executor.layers.fused_moe.prepare_finalize.MoEPrepareAndFinalizeNoEP]                                                      |
 | BatchedPrepareAndFinalize<sup>5</sup> | batched            | fp8,int8        | G,A,T                  | N     | Y                     | [`BatchedPrepareAndFinalize`][vllm.model_executor.layers.fused_moe.fused_batched_moe.BatchedPrepareAndFinalize]                                               |
 
-1. All types: mxfp4, nvfp4, int4, int8, fp8
-2. A,T quantization occurs after dispatch.
-3. All quantization happens after dispatch.
-4. Controlled by different env vars (`VLLM_FLASHINFER_MOE_BACKEND` "throughput" or "latency")
-5. This is a no-op dispatcher that can be used to pair with any modular experts to produce a modular kernel that runs w/o dispatch or combine.  These cannot be selected via environment variable.  These are generally use for testing or adapting an expert subclass to the `fused_experts` API.
-6. This depends on the experts implementation.
+!!! info "Table key"
+    1. All types: mxfp4, nvfp4, int4, int8, fp8
+    2. A,T quantization occurs after dispatch.
+    3. All quantization happens after dispatch.
+    4. Controlled by different env vars (`VLLM_FLASHINFER_MOE_BACKEND` "throughput" or "latency")
+    5. This is a no-op dispatcher that can be used to pair with any modular experts to produce a modular kernel that runs w/o dispatch or combine.  These cannot be selected via environment variable.  These are generally use for testing or adapting an expert subclass to the `fused_experts` API.
+    6. This depends on the experts implementation.
 
-### Quantization format key
+    ---
 
-| Quantization formats        | Symbol |
-|-----------------------------|--------|
-| Grouped                     | G      |
-| Grouped w/block size N      | G(N)   |
-| Per activation token        | A      |
-| Per tensor                  | T      |
+    - G - Grouped
+    - G(N) - Grouped w/block size N
+    - A - Per activation token
+    - T - Per tensor
 
 Modular kernels are supported by the following `FusedMoEMethodBase` classes.
 
@@ -93,28 +83,29 @@ To be used with a particular `FusedMoEPrepareAndFinalize` sub-class, MoE kernels
 
 | Kernel                       | Input act. format | Quant. types    | Quant. format | Activation function                         | Apply Weight On Input | Modular | Source                                                                                                                                                                                                 |
 |------------------------------|-------------------|-----------------|---------------|---------------------------------------------|-----------------------|---------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| triton                       | standard          | all<sup>1</sup> | G,A,T         | silu, gelu, swigluoai, silu_no_mul, gelu_no_mul | Y                     | Y       | [`fused_experts`][vllm.model_executor.layers.fused_moe.fused_moe.fused_experts], [`TritonExperts`][vllm.model_executor.layers.fused_moe.fused_moe.TritonExperts]                                                                                                                  |
+| triton                       | standard          | all<sup>1</sup> | G,A,T         | silu, gelu,</br>swigluoai,</br>silu_no_mul,</br>gelu_no_mul | Y                     | Y       | [`fused_experts`][vllm.model_executor.layers.fused_moe.fused_moe.fused_experts],</br>[`TritonExperts`][vllm.model_executor.layers.fused_moe.fused_moe.TritonExperts]                                                                                                                  |
 | triton (batched)             | batched           | all<sup>1</sup> | G,A,T         | silu, gelu                                   | <sup>6</sup>          | Y       | [`BatchedTritonExperts`][vllm.model_executor.layers.fused_moe.fused_batched_moe.BatchedTritonExperts]                                                                                                                    |
-| deep gemm                    | standard, batched | fp8             | G(128),A,T    | silu, gelu                                   | <sup>6</sup>          | Y       | [`deep_gemm_moe_fp8`][vllm.model_executor.layers.fused_moe.deep_gemm_moe.deep_gemm_moe_fp8], [`DeepGemmExperts`][vllm.model_executor.layers.fused_moe.deep_gemm_moe.DeepGemmExperts], [`BatchedDeepGemmExperts`][vllm.model_executor.layers.fused_moe.batched_deep_gemm_moe.BatchedDeepGemmExperts]             |
-| cutlass_fp4                  | standard, batched | nvfp4           | A,T           | silu                                        | Y                     | Y       | [`cutlass_moe_fp4`][vllm.model_executor.layers.fused_moe.cutlass_moe.cutlass_moe_fp4], [`CutlassExpertsFp4`][vllm.model_executor.layers.fused_moe.cutlass_moe.CutlassExpertsFp4]                                                                                                          |
-| cutlass_fp8                  | standard, batched | fp8             | A,T           | silu, gelu                                   | Y                     | Y       | [`cutlass_moe_fp8`][vllm.model_executor.layers.fused_moe.cutlass_moe.cutlass_moe_fp8], [`CutlassExpertsFp8`][vllm.model_executor.layers.fused_moe.cutlass_moe.CutlassExpertsFp8], [`CutlasBatchedExpertsFp8`][vllm.model_executor.layers.fused_moe.cutlass_moe.CutlassBatchedExpertsFp8]                                                                               |
-| flashinfer                   | standard          | nvfp4,fp8       | T             | <sup>5</sup>                                | N                     | Y       | [`flashinfer_cutlass_moe_fp4`][vllm.model_executor.layers.fused_moe.flashinfer_cutlass_moe.flashinfer_cutlass_moe_fp4], [`FlashInferExperts`][vllm.model_executor.layers.fused_moe.flashinfer_cutlass_moe.FlashInferExperts]                                                                                    |
-| gpt oss triton               | batched           | N/A             | N/A           | <sup>5</sup>                                | Y                     | Y       | [`triton_kernel_fused_experts`][vllm.model_executor.layers.fused_moe.gpt_oss_triton_kernels_moe.triton_kernel_fused_experts], [`BatchedOAITritonExperts`][vllm.model_executor.layers.fused_moe.gpt_oss_triton_kernels_moe.BatchedOAITritonExperts]                                                                         |
-| deep gemm+triton<sup>2</sup> | standard, batched | all<sup>1</sup> | G(128),A,T    | silu, gelu                                   | <sup>6</sup>          | Y       | [`TritonOrDeepGemmExperts`][vllm.model_executor.layers.fused_moe.triton_deep_gemm_moe.TritonOrDeepGemmExperts], [`BatchedTritonOrDeepGemmExperts`][vllm.model_executor.layers.fused_moe.batched_triton_or_deep_gemm_moe.BatchedTritonOrDeepGemmExperts] |
-| marlin                       | standard          | <sup>3</sup>    | <sup>3</sup>  | silu, swigluoai                              | Y                     | N       | [`fused_marlin_moe`][vllm.model_executor.layers.fused_moe.fused_marlin_moe.fused_marlin_moe]                                                                                                                         |
-| trtllm                       | standard          | mxfp4,nvfp4     | G(16),G32)    | <sup>5</sup>                                | N                     | Y       | [`TrtLlmGenExperts`][vllm.model_executor.layers.fused_moe.trtllm_moe.TrtLlmGenExperts]                                                                                                                               |
+| deep gemm                    | standard,</br>batched | fp8             | G(128),A,T    | silu, gelu                                   | <sup>6</sup>          | Y       | [`deep_gemm_moe_fp8`][vllm.model_executor.layers.fused_moe.deep_gemm_moe.deep_gemm_moe_fp8],</br>[`DeepGemmExperts`][vllm.model_executor.layers.fused_moe.deep_gemm_moe.DeepGemmExperts],</br>[`BatchedDeepGemmExperts`][vllm.model_executor.layers.fused_moe.batched_deep_gemm_moe.BatchedDeepGemmExperts]             |
+| cutlass_fp4                  | standard,</br>batched | nvfp4           | A,T           | silu                                        | Y                     | Y       | [`cutlass_moe_fp4`][vllm.model_executor.layers.fused_moe.cutlass_moe.cutlass_moe_fp4],</br>[`CutlassExpertsFp4`][vllm.model_executor.layers.fused_moe.cutlass_moe.CutlassExpertsFp4]                                                                                                          |
+| cutlass_fp8                  | standard,</br>batched | fp8             | A,T           | silu, gelu                                   | Y                     | Y       | [`cutlass_moe_fp8`][vllm.model_executor.layers.fused_moe.cutlass_moe.cutlass_moe_fp8],</br>[`CutlassExpertsFp8`][vllm.model_executor.layers.fused_moe.cutlass_moe.CutlassExpertsFp8],</br>[`CutlasBatchedExpertsFp8`][vllm.model_executor.layers.fused_moe.cutlass_moe.CutlassBatchedExpertsFp8]                                                                               |
+| flashinfer                   | standard          | nvfp4,</br>fp8       | T             | <sup>5</sup>                                | N                     | Y       | [`flashinfer_cutlass_moe_fp4`][vllm.model_executor.layers.fused_moe.flashinfer_cutlass_moe.flashinfer_cutlass_moe_fp4],</br>[`FlashInferExperts`][vllm.model_executor.layers.fused_moe.flashinfer_cutlass_moe.FlashInferExperts]                                                                                    |
+| gpt oss triton               | batched           | N/A             | N/A           | <sup>5</sup>                                | Y                     | Y       | [`triton_kernel_fused_experts`][vllm.model_executor.layers.fused_moe.gpt_oss_triton_kernels_moe.triton_kernel_fused_experts],</br>[`BatchedOAITritonExperts`][vllm.model_executor.layers.fused_moe.gpt_oss_triton_kernels_moe.BatchedOAITritonExperts]                                                                         |
+| deep gemm+triton<sup>2</sup> | standard,</br>batched | all<sup>1</sup> | G(128),A,T    | silu, gelu                                   | <sup>6</sup>          | Y       | [`TritonOrDeepGemmExperts`][vllm.model_executor.layers.fused_moe.triton_deep_gemm_moe.TritonOrDeepGemmExperts],</br>[`BatchedTritonOrDeepGemmExperts`][vllm.model_executor.layers.fused_moe.batched_triton_or_deep_gemm_moe.BatchedTritonOrDeepGemmExperts] |
+| marlin                       | standard          | <sup>3</sup>    | <sup>3</sup>  | silu,</br>swigluoai                              | Y                     | N       | [`fused_marlin_moe`][vllm.model_executor.layers.fused_moe.fused_marlin_moe.fused_marlin_moe]                                                                                                                         |
+| trtllm                       | standard          | mxfp4,</br>nvfp4     | G(16),G(32)    | <sup>5</sup>                                | N                     | Y       | [`TrtLlmGenExperts`][vllm.model_executor.layers.fused_moe.trtllm_moe.TrtLlmGenExperts]                                                                                                                               |
 | pallas                       | standard          | N/A             | N/A           | silu                                        | N                     | N       | [`fused_moe`][vllm.model_executor.layers.fused_moe.moe_pallas.fused_moe]                                                                                                                                      |
 | iterative                    | standard          | N/A             | N/A           | silu                                        | N                     | N       | [`fused_moe`][vllm.model_executor.layers.fused_moe.moe_torch_iterative.fused_moe]                                                                                                                             |
 | rocm aiter moe               | standard          | fp8             | G(128),A,T    | silu, gelu                                   | Y                     | N       | [`rocm_aiter_fused_experts`][vllm.model_executor.layers.fused_moe.rocm_aiter_fused_moe.rocm_aiter_fused_moe_impl]                                                                                                             |
 | cpu_fused_moe                | standard          | N/A             | N/A           | silu                                        | N                     | N       | [`CPUFusedMOE`][vllm.model_executor.layers.fused_moe.cpu_fused_moe.CPUFusedMOE]                                                                                                                                 |
-| naive batched<sup>4</sup>    | batched           | int8,fp8        | G,A,T         | silu, gelu                                   | <sup>6</sup>          | Y       | [`NaiveBatchedExperts`][vllm.model_executor.layers.fused_moe.fused_batched_moe.NaiveBatchedExperts]                                                                                                    |
+| naive batched<sup>4</sup>    | batched           | int8,</br>fp8        | G,A,T         | silu, gelu                                   | <sup>6</sup>          | Y       | [`NaiveBatchedExperts`][vllm.model_executor.layers.fused_moe.fused_batched_moe.NaiveBatchedExperts]                                                                                                    |
 
-1. All types: mxfp4, nvfp4, int4, int8, fp8
-2. A dispatcher wrapper around triton and deep gemm experts.  Will select based on type + shape + quantization params
-3. uint4, uint8, fp8, fp4
-4. This is a naive implementation of experts that supports batched format. Mainly used for testing.
-5. The `activation` parameter is ignored and SwiGlu is used by default instead.
-6. Only handled by or supported when used with modular kernels.
+!!! info "Table key"
+    1. All types: mxfp4, nvfp4, int4, int8, fp8
+    2. A dispatcher wrapper around triton and deep gemm experts.  Will select based on type + shape + quantization params
+    3. uint4, uint8, fp8, fp4
+    4. This is a naive implementation of experts that supports batched format. Mainly used for testing.
+    5. The `activation` parameter is ignored and SwiGlu is used by default instead.
+    6. Only handled by or supported when used with modular kernels.
 
 ## Modular Kernel "families"
 
@@ -122,6 +113,6 @@ The following table shows "families" of modular kernels that are intended to wor
 
 | backend                      | `FusedMoEPrepareAndFinalize` subclasses                | `FusedMoEPermuteExpertsUnpermute` subclasses                                                                   |
 |------------------------------|--------------------------------------------------------|----------------------------------------------------------------------------------------------------------------|
-| deepep_high_throughput, pplx | `DeepEPHTPrepareAndFinalize`, `PplxPrepareAndFinalize` | `BatchedDeepGemmExperts`, `BatchedTritonExperts`, `BatchedTritonOrDeepGemmExperts`, `CutlassBatchedExpertsFp8` |
-| deepep_low_latency           | `DeepEPLLPrepareAndFinalize`                           | `DeepGemmExperts`, `TritonExperts`, `TritonOrDeepGemmExperts`, `CutlassExpertsFp8`                             |
+| deepep_high_throughput,</br>pplx | `DeepEPHTPrepareAndFinalize`,</br>`PplxPrepareAndFinalize` | `BatchedDeepGemmExperts`,</br>`BatchedTritonExperts`,</br>`BatchedTritonOrDeepGemmExperts`,</br>`CutlassBatchedExpertsFp8` |
+| deepep_low_latency           | `DeepEPLLPrepareAndFinalize`                           | `DeepGemmExperts`,</br>`TritonExperts`,</br>`TritonOrDeepGemmExperts`,</br>`CutlassExpertsFp8`                             |
 | flashinfer                   | `FlashInferCutlassMoEPrepareAndFinalize`               | `FlashInferExperts`                                                                                            |
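
(Assuming the patch is saved locally as fix.patch, it can be applied from the repository root with something along the lines of `git apply fix.patch`.)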

@bnellnm bnellnm requested a review from hmellor September 29, 2025 20:44
@hmellor (Member) left a comment

LGTM! Just a couple of tiny nits

bnellnm and others added 2 commits September 30, 2025 13:48
@hmellor hmellor enabled auto-merge (squash) September 30, 2025 18:34
@github-actions github-actions bot added the ready (ONLY add when PR is ready to merge/full CI is needed) label Sep 30, 2025
@hmellor hmellor merged commit fb610ae into vllm-project:main Sep 30, 2025
11 checks passed
@bnellnm bnellnm deleted the support-doc branch September 30, 2025 19:36
pdasigi pushed a commit to pdasigi/vllm that referenced this pull request Oct 2, 2025
yewentao256 pushed a commit that referenced this pull request Oct 3, 2025
tomeras91 pushed a commit to tomeras91/vllm that referenced this pull request Oct 6, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 10, 2025
lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025
alhridoy pushed a commit to alhridoy/vllm that referenced this pull request Oct 24, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025
rtourgeman pushed a commit to rtourgeman/vllm that referenced this pull request Nov 10, 2025
devpatelio pushed a commit to SumanthRH/vllm that referenced this pull request Nov 29, 2025