Commit 0050f49

nv-guomingz authored and dominicshanshan committed

[None][doc] add blackwell information into support matrix (NVIDIA#6740)

Signed-off-by: nv-guomingz <[email protected]>
Signed-off-by: Wangshanshan <[email protected]>

1 parent: 4c5faa5

File tree: 2 files changed, +5 additions, −2 deletions

docs/source/legacy/reference/support-matrix.md (1 addition, 0 deletions)

```diff
@@ -157,6 +157,7 @@ The following table shows the supported software for TensorRT-LLM.
   - [10.11](https://docs.nvidia.com/deeplearning/tensorrt/release-notes/index.html)
 * - Precision
   -
+  - Blackwell (SM100/SM120) - FP32, FP16, BF16, FP8, FP4, INT8, INT4
   - Hopper (SM90) - FP32, FP16, BF16, FP8, INT8, INT4
   - Ada Lovelace (SM89) - FP32, FP16, BF16, FP8, INT8, INT4
   - Ampere (SM80, SM86) - FP32, FP16, BF16, INT8, INT4[^smgte89]
```
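As a rough illustration of the precision rows added above, the following minimal Python sketch (a hypothetical helper, not part of the TensorRT-LLM API) encodes the support matrix as a lookup from CUDA compute capability to the precisions the table lists:

```python
# Hypothetical helper (illustration only, not TensorRT-LLM code): maps a GPU
# compute capability (major, minor) to the precisions listed in the support
# matrix above. Blackwell gains FP4 over Hopper/Ada; Ampere lacks FP8 and FP4.
SUPPORTED_PRECISIONS = {
    (10, 0): {"FP32", "FP16", "BF16", "FP8", "FP4", "INT8", "INT4"},  # Blackwell SM100
    (12, 0): {"FP32", "FP16", "BF16", "FP8", "FP4", "INT8", "INT4"},  # Blackwell SM120
    (9, 0):  {"FP32", "FP16", "BF16", "FP8", "INT8", "INT4"},         # Hopper SM90
    (8, 9):  {"FP32", "FP16", "BF16", "FP8", "INT8", "INT4"},         # Ada Lovelace SM89
    (8, 0):  {"FP32", "FP16", "BF16", "INT8", "INT4"},                # Ampere SM80
    (8, 6):  {"FP32", "FP16", "BF16", "INT8", "INT4"},                # Ampere SM86
}

def supports(cc: tuple, precision: str) -> bool:
    """Return True if the matrix lists `precision` for compute capability `cc`."""
    return precision in SUPPORTED_PRECISIONS.get(cc, set())
```

For example, `supports((10, 0), "FP4")` is true while `supports((9, 0), "FP4")` is false, matching the new Blackwell row.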

docs/source/overview.md (4 additions, 2 deletions)

```diff
@@ -25,8 +25,10 @@ TensorRT LLM delivers breakthrough performance on the latest NVIDIA GPUs:
 
 TensorRT LLM supports the latest and most popular LLM architectures:
 
-- **Language Models**: GPT-OSS, Deepseek-R1/V3, Llama 3/4, Qwen2/3, Gemma 3, Phi 4...
-- **Multi-modal Models**: LLaVA-NeXT, Qwen2-VL, VILA, Llama 3.2 Vision...
+### FP4 Support
+[NVIDIA B200 GPUs](https://www.nvidia.com/en-us/data-center/dgx-b200/), when used with TensorRT-LLM, enable seamless loading of model weights in the new [FP4 format](https://developer.nvidia.com/blog/introducing-nvfp4-for-efficient-and-accurate-low-precision-inference/#what_is_nvfp4), allowing you to automatically leverage optimized FP4 kernels for efficient and accurate low-precision inference.
+
+### FP8 Support
 
 TensorRT LLM strives to support the most popular models on **Day 0**.
```
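The FP4 format referenced in the new overview text is the 4-bit E2M1 floating-point element format used by NVFP4, whose representable magnitudes are 0, 0.5, 1, 1.5, 2, 3, 4, and 6. As a purely educational sketch (not TensorRT-LLM code; real NVFP4 kernels additionally apply per-block scale factors before rounding), round-to-nearest quantization onto the E2M1 grid can be written as:

```python
# Educational sketch of FP4 (E2M1) round-to-nearest quantization.
# E2M1 has 1 sign bit, 2 exponent bits, 1 mantissa bit; its non-negative
# representable magnitudes are the eight values below (max magnitude 6).
E2M1_VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def quantize_fp4(x: float) -> float:
    """Round x to the nearest representable E2M1 value, preserving sign."""
    magnitude = min(E2M1_VALUES, key=lambda v: abs(abs(x) - v))
    return magnitude if x >= 0 else -magnitude
```

For example, 0.7 rounds to 0.5 and values beyond the format's range saturate to ±6, which is why per-block scaling is essential in practice to keep weights inside the representable grid.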
