
Can't launch QwQ-32B with 'Parsing Reasoning Content' switch on. #3023

Open
tacnaci opened this issue Mar 10, 2025 · 1 comment · May be fixed by #3024

tacnaci commented Mar 10, 2025

System Info

cuda: 12.4, vllm 0.7.3, python 3.11.11, ubuntu 22.04.5

Running Xinference with Docker?

  • docker
  • pip install
  • installation from source

Version info

1.3.1

The command used to start Xinference

xinference-local --host 0.0.0.0 --port 9997 --auth-config auth_config.json

Reproduction

[Screenshots: the launch error shown when the 'Parsing Reasoning Content' switch is enabled]

It seems that no reason_parser has been registered for QwQ, so enabling the switch causes the model launch to fail.
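The failure pattern is consistent with a missing entry in a model-to-parser registry. The sketch below is purely illustrative (the dictionary, function, and key names are assumptions, not Xinference's actual internals); it shows how a lookup with no QwQ entry would raise at launch time, and how registering one would resolve it:

```python
# Hypothetical registry mapping model families to vLLM reasoning-parser names.
# All names here are illustrative, not Xinference's real code.
REASONING_PARSERS = {
    "deepseek-r1": "deepseek_r1",
    # No entry for "qwq-32b": enabling reasoning parsing would fail at launch.
}

def get_reasoning_parser(model_name: str) -> str:
    """Return the reasoning parser registered for a model, or raise."""
    try:
        return REASONING_PARSERS[model_name.lower()]
    except KeyError:
        raise ValueError(f"No reasoning parser registered for {model_name!r}")

# A fix could amount to registering a parser for QwQ; since QwQ emits
# <think>...</think> blocks like DeepSeek-R1, the R1 parser is a plausible choice:
REASONING_PARSERS["qwq-32b"] = "deepseek_r1"
```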

Expected behavior

With the 'Parsing Reasoning Content' switch turned on, QwQ-32B should start normally under vLLM.
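Until a fix lands in Xinference, one possible workaround is serving the model with vLLM directly. This sketch assumes vLLM 0.7.x's reasoning-output flags (`--enable-reasoning`, `--reasoning-parser`) and that QwQ's `<think>` output format is compatible with the `deepseek_r1` parser:

```shell
# Workaround sketch: serve QwQ-32B with vLLM's own reasoning parsing.
# Assumes vLLM >= 0.7.0, which introduced these flags.
vllm serve Qwen/QwQ-32B \
  --enable-reasoning \
  --reasoning-parser deepseek_r1 \
  --port 9997
```

The served OpenAI-compatible endpoint should then return `reasoning_content` separately from `content` in chat completions.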

@XprobeBot XprobeBot added the gpu label Mar 10, 2025
@XprobeBot XprobeBot added this to the v1.x milestone Mar 10, 2025
Contributor

qinxuye commented Mar 10, 2025

This is a known issue and will be fixed soon. @amumu96

@qinxuye qinxuye linked a pull request Mar 10, 2025 that will close this issue