feat(qwen3.6): add Qwen3.6-35B-A3B-NVFP4 + MTP Atlas recipe by tbraun96 · Pull Request #2 · Avarok-Cybersecurity/atlas-recipes

tbraun96 · 2026-05-10T20:15:41Z

Summary

Adds the Atlas recipe for RedHatAI/Qwen3.6-35B-A3B-NVFP4 with the upstream-bundled MTP K=2 draft head. Mirrors the existing qwen3.5-35b-a3b-nvfp4 recipe; the only changes are the model id (the qwen3_5_moe arch is identical) and max_model_len, which is bumped to 131072 so that the full spark-arena-v2 depth sweep, up to a depth of 100k, fits.
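The context-length reasoning can be checked with a few lines of arithmetic. This is only a sketch: it assumes that "100k" means 100,000 tokens of depth, that the deepest cell uses the same pp=2048 and tg=128 as the headline cell, and that a cell needs depth + pp + tg tokens of context. The helper name `fits` is made up for illustration and is not part of sparkrun or Atlas.

```python
# max_model_len as set in this recipe (from the PR description).
MAX_MODEL_LEN = 131072


def fits(depth: int, prompt_tokens: int, gen_tokens: int,
         max_model_len: int = MAX_MODEL_LEN) -> bool:
    """Return True if one benchmark cell fits in the context window.

    Assumption: a cell needs depth + prompt + generated tokens of context.
    """
    return depth + prompt_tokens + gen_tokens <= max_model_len


# Assumption: "100k" read as 100,000 tokens; pp and tg taken from the headline cell.
deepest_depth = 100_000
needed = deepest_depth + 2048 + 128

print(fits(deepest_depth, 2048, 128))   # whether the deepest cell fits
print(MAX_MODEL_LEN - needed)           # headroom left, in tokens
```

Under these assumptions the deepest cell needs 102,176 tokens, which leaves some headroom below 131072.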

Result (live-validated on a single GB10)

sparkrun benchmark run recipes/qwen3.6/qwen3.6-35b-a3b-nvfp4-atlas.yaml --profile spark-arena-v2 --port 8888

Decode @ tg=128, pp=2048, depth=0, concurrency=1: 214.59 tok/s (4.66 ms TPOT)
vs current Spark Arena leaderboard #6 (Qwen3.6-35B-A3B-NVFP4 on vLLM): 77.07 tok/s → 2.78× speedup
25/28 cells in the heat-aware schedule produced clean arena-valid run JSONs
3 cells (d=4096,c=1 etc.) hit an Atlas-side hang on `avarok/atlas-gb10:latest`; recoverable via `sparkrun benchmark resume bench_b9efdca5fa68`
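As a sanity check, the headline figures above are consistent with each other. The sketch below recomputes TPOT and the speedup from the two quoted throughputs; it assumes TPOT is simply the reciprocal of decode throughput, which only holds for a single stream (concurrency=1). The function names are illustrative only.

```python
# Figures quoted in the PR description.
atlas_tok_s = 214.59  # Atlas decode, tg=128, pp=2048, depth=0, concurrency=1
vllm_tok_s = 77.07    # Spark Arena leaderboard #6, same model on vLLM


def tpot_ms(tok_per_s: float) -> float:
    """Time per output token in milliseconds, assuming a single stream."""
    return 1000.0 / tok_per_s


def speedup(new_tok_s: float, baseline_tok_s: float) -> float:
    """Throughput ratio of the new result over the baseline."""
    return new_tok_s / baseline_tok_s


print(f"TPOT: {tpot_ms(atlas_tok_s):.2f} ms")                # TPOT: 4.66 ms
print(f"Speedup: {speedup(atlas_tok_s, vllm_tok_s):.2f}x")   # Speedup: 2.78x
```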

Reproducibility

Image: `avarok/atlas-gb10:latest` (stock public)
Model: `RedHatAI/Qwen3.6-35B-A3B-NVFP4` (stock public)
Sparkrun: v0.2.31 (PyPI)
No engine patches required

Test plan

`sparkrun show recipes/qwen3.6/qwen3.6-35b-a3b-nvfp4-atlas.yaml` parses
`sparkrun benchmark run … --profile spark-arena-v2` produces consolidated.json with 25 cells
Result beats every NVFP4 entry on the current Spark Arena leaderboard

🤖 Generated with Claude Code

Mirrors the qwen3.5-35b-a3b-nvfp4 recipe but uses RedHatAI's NVFP4 quantization of Qwen3.6-35B-A3B (model_type=qwen3_5_moe). MTP K=2 draft head retained from upstream; KV cache stays NVFP4 for the full quant pipeline.

Live-validated via `sparkrun benchmark run --profile spark-arena-v2` on a single GB10: 25/28 cells produced clean arena-valid JSON in /workspace/.cache/sparkrun/benchmarks/bench_b9efdca5fa68/.

Headline: c=1 d=0 decode = 214.6 tok/s (4.66 ms TPOT), ~4x faster than the same-size Qwen3.6-35B-A3B-FP8 vLLM+MTP path.

Co-Authored-By: Azeez Ishaqui <debaterishaqui@gmail.com>

AzeezIsh merged commit 3c9161a into main May 11, 2026
1 check passed

AzeezIsh merged 1 commit into main from feat/qwen3.6-35b-nvfp4
