2 changes: 1 addition & 1 deletion benchmarks/single_node/dsr1_fp8_mi355x.sh
@@ -44,7 +44,7 @@ python3 -m sglang.launch_server \
--trust-remote-code \
--chunked-prefill-size 196608 \
--mem-fraction-static 0.8 --disable-radix-cache \
- --num-continuous-decode-steps 4 \
+ --num-continuous-decode-steps 8 \
--max-prefill-tokens 196608 \
--kv-cache-dtype fp8_e4m3 \
--cuda-graph-max-bs "$CONC" $EVAL_CONTEXT_ARGS > $SERVER_LOG 2>&1 &
6 changes: 6 additions & 0 deletions perf-changelog.yaml
@@ -1,3 +1,9 @@
+ - config-keys:
+ - dsr1-fp8-mi355x-sglang
+ description:
+ - "Tune --num-continuous-decode-steps 4 → 8 (+4.7% avg output throughput gain)"
+ pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1109

Check failure on line 6 in perf-changelog.yaml

Claude / Claude Code Review

perf-changelog.yaml entry has wrong pr-link and wrong location

The new perf-changelog.yaml entry has two issues: (1) **pr-link points to the wrong PR** — it references #1109 (an unrelated older PR) instead of this PR (#1243); (2) **the entry is prepended to the top of the file** instead of appended to the end, violating the explicit rule in AGENTS.md (line 161): "New entries MUST be appended to the END of the file — never insert in the middle or prepend." Please update the pr-link to https://github.com/SemiAnalysisAI/InferenceX/pull/1243 and move the new YA
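
The two checks described in this comment could be sketched roughly as follows. This is an illustrative helper, not part of the repository: the function name `check_changelog_entry` is hypothetical, it works on an already-parsed list of entries (loading the YAML is left out), and it assumes each config key appears in only one entry of the input.

```python
from typing import Any


def check_changelog_entry(
    entries: list[dict[str, Any]], config_key: str, pr_number: int
) -> list[str]:
    """Return the problems found for the entry that lists config_key.

    Hypothetical helper: `entries` is the parsed content of
    perf-changelog.yaml, in file order.
    """
    # Positions of entries whose config-keys list contains the key.
    matches = [
        i for i, entry in enumerate(entries)
        if config_key in entry.get("config-keys", [])
    ]
    if not matches:
        return [f"no entry found for {config_key}"]

    idx = matches[-1]
    problems = []

    # Rule quoted from AGENTS.md: new entries go at the END of the file.
    if idx != len(entries) - 1:
        problems.append("entry is not the last item in the file")

    # The pr-link should point at the PR that introduces the entry.
    link = str(entries[idx].get("pr-link", ""))
    if not link.rstrip("/").endswith(f"/pull/{pr_number}"):
        problems.append(f"pr-link does not point to PR #{pr_number}")

    return problems
```

Called with the entry as it appears in this diff (first in the file, linking to pull/1109) and `pr_number=1243`, the helper reports both problems; with the entry moved to the end and the link updated to pull/1243, it returns an empty list.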
lishuoshuo-amd marked this conversation as resolved.
- config-keys:
- 70b-fp8-*-vllm
description: