[Doc][Misc] Fix msprobe_guide.md documentation issues #6965

wangxiyuan merged 1 commit into vllm-project:main
Conversation
- Remove unnecessary `cat` heredoc wrapper in the example configuration
- Fix duplicate chapter numbering (two sections numbered '2')
- Fix msprobe command from `msprobe graph_visualize` to `msprobe -f pytorch graph`
- Remove backward-related content since vLLM is inference-only

Closes vllm-project#6065
Signed-off-by: NJX-njx <3771829673@qq.com>
Code Review
This pull request provides valuable updates to the msprobe_guide.md documentation, including fixing section numbering, updating the msprobe command, and removing obsolete content related to the backward pass. These changes improve the accuracy and clarity of the guide. I've identified two instances in the JSON examples where a trailing comma was left after removing other properties, which makes the JSON invalid. I've included suggestions to correct these.
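Checks like the one this review performs by hand are easy to automate. Below is a minimal sketch (the snippet is an abbreviated, hypothetical version of the L0 example) showing that Python's strict `json` parser rejects the trailing comma left behind after the backward-related entries were removed:

```python
import json

# Abbreviated stand-in for the L0 dump.json example, keeping the trailing
# comma that remains after the backward-related entries were deleted.
snippet = """{
  "data": {
    "Module.conv2.Conv2d.forward.0": {"input": [], "output": []},
  }
}"""

try:
    json.loads(snippet)
    valid = True
except json.JSONDecodeError:
    valid = False

print("valid" if valid else "invalid")  # the trailing comma makes it invalid
```

Running documentation snippets through `json.loads` before publishing catches exactly this class of error.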
@@ -339,89 +333,16 @@ An L0 `dump.json` contains forward/backward I/O for modules together with parame
        }
      }
    },
This JSON example is invalid due to a trailing comma. Since the `Module.conv2.Conv2d.parameters_grad` and `Module.conv2.Conv2d.backward.0` properties were removed from the `data` object, the comma after the `Module.conv2.Conv2d.forward.0` object is no longer needed and makes the JSON invalid.
Suggested change:
- },
+ }
@@ -469,43 +390,6 @@ An L1 `dump.json` records forward/backward I/O for APIs. Using PyTorch's `relu`
      }
    ]
  },
Pull request overview
Updates the MSProbe debugging guide to resolve reported documentation issues and better align the workflow with vLLM’s inference-only usage.
Changes:
- Simplifies the example configuration section by removing the bash heredoc wrapper in favor of a plain JSON block.
- Fixes section numbering and updates the visualization command to `msprobe -f pytorch graph`.
- Removes backward/gradient-related references from the guide and the `dump.json` explanations.
```json
"scope": ["Module.conv1.Conv2d.forward.0", "Module.fc2.Linear.forward.0"]
"scope": ["Cell.conv1.Conv2d.forward.0", "Cell.fc2.Dense.backward.0"]
"scope": ["Cell.conv1.Conv2d.forward.0", "Cell.fc2.Dense.forward.0"]
"scope": ["Tensor.add.0.forward", "Functional.square.2.forward"]
```
The scope examples are inside a json code block but repeat the "scope" key three times, which is not valid JSON and may confuse readers trying to copy/paste. Consider turning these into separate code blocks (each showing a full snippet) or a single valid example with explanatory text outside the JSON block.
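To illustrate the point: each of those lines is a JSON *fragment*, not a document, and only parses once wrapped in an enclosing object. A hypothetical check (not part of msprobe):

```python
import json

fragment = '"scope": ["Tensor.add.0.forward", "Functional.square.2.forward"]'

# The bare key/value pair is rejected by a JSON parser...
try:
    json.loads(fragment)
    bare_ok = True
except json.JSONDecodeError:
    bare_ok = False

# ...but becomes a valid JSON object once wrapped in braces.
cfg = json.loads("{" + fragment + "}")
print(bare_ok, cfg["scope"])
```

Showing each alternative as a complete object, as the review suggests, would make the examples copy/paste-safe.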
@@ -86,10 +85,9 @@ cat <<'JSON' > /data/msprobe_config.json
    "summary_mode": "statistics"
  }
}
- JSON
  ```
After removing the heredoc example, the guide no longer shows what filename/path to save this JSON config to, but the later vllm serve command still references /data/msprobe_config.json. Please add a short instruction (e.g., “save as /data/msprobe_config.json”) or update the command path so the example is self-contained and consistent.
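A minimal sketch of the missing step. The keys and the temporary path here are illustrative assumptions; the guide's `vllm serve` command expects the file at `/data/msprobe_config.json`:

```shell
# Save the example config to the path the serve command will reference.
CONFIG="${TMPDIR:-/tmp}/msprobe_config.json"   # the guide uses /data/msprobe_config.json

cat > "$CONFIG" <<'EOF'
{
  "task": "statistics",
  "dump_path": "/tmp/msprobe_dump",
  "statistics": {
    "summary_mode": "statistics"
  }
}
EOF

# Sanity-check that the file parses as JSON before pointing vllm at it.
python3 -m json.tool "$CONFIG" > /dev/null && echo "config OK"
```

A one-line "save this as `/data/msprobe_config.json`" note above the JSON block in the guide would achieve the same consistency.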
│ │ │ ├── dump_tensor_data
│ │ │ │ ├── Tensor.permute.1.forward.pt
│ │ │ │ ├── Functional.linear.5.backward.output.pt   # Format: {api_type}.{api_name}.{call_count}.{forward/backward}.{input/output}.{arg_index}.
│ │ │ │ ├── Tensor.permute.1.forward.pt              # Format: {api_type}.{api_name}.{call_count}.forward.{input/output}.{arg_index}.
The example filename Tensor.permute.1.forward.pt does not match the documented format {api_type}.{api_name}.{call_count}.forward.{input/output}.{arg_index} (it’s missing the {input/output}.{arg_index} segments). Either update the example filename(s) to match the stated format, or adjust the format description to reflect the actual naming.
Suggested change:
- │ │ │ │ ├── Tensor.permute.1.forward.pt # Format: {api_type}.{api_name}.{call_count}.forward.{input/output}.{arg_index}.
+ │ │ │ │ ├── Tensor.permute.1.forward.input.0.pt # Format: {api_type}.{api_name}.{call_count}.forward.{input/output}.{arg_index}.
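The documented naming scheme can be unpacked mechanically, which makes the mismatch visible: the shorter filename simply doesn't have enough segments. A sketch with a hypothetical parser (not part of msprobe):

```python
def parse_dump_name(filename: str) -> dict:
    """Split {api_type}.{api_name}.{call_count}.{direction}.{input/output}.{arg_index}.pt."""
    stem = filename[:-3] if filename.endswith(".pt") else filename
    parts = stem.split(".")
    if len(parts) != 6:
        raise ValueError(f"{filename!r} does not match the documented format")
    api_type, api_name, call_count, direction, io, arg_index = parts
    return {
        "api_type": api_type,
        "api_name": api_name,
        "call_count": int(call_count),
        "direction": direction,
        "io": io,
        "arg_index": int(arg_index),
    }

print(parse_dump_name("Tensor.permute.1.forward.input.0.pt"))

# The filename flagged in the review lacks the {input/output} and
# {arg_index} segments, so it fails the same check.
try:
    parse_dump_name("Tensor.permute.1.forward.pt")
    short_ok = True
except ValueError:
    short_ok = False
```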
        }
      },
      "Module.conv2.Conv2d.parameters_grad": {
        "weight": [
          {
            "type": "torch.Tensor",
            "dtype": "torch.float32",
            "shape": [32, 16, 5, 5],
            "Max": 0.018550323322415352,
            "Min": -0.008627401664853096,
            "Mean": 0.0006675920449197292,
            "Norm": 0.26084786653518677,
            "requires_grad": false,
            "data_name": "Module.conv2.Conv2d.parameters_grad.weight.pt"
          }
        ],
        "bias": [
          {
            "type": "torch.Tensor",
            "dtype": "torch.float32",
            "shape": [32],
            "Max": 0.014914230443537235,
            "Min": -0.006656786892563105,
            "Mean": 0.002657240955159068,
            "Norm": 0.029451673850417137,
            "requires_grad": false,
            "data_name": "Module.conv2.Conv2d.parameters_grad.bias.pt"
          }
        ]
      },
      "Module.conv2.Conv2d.backward.0": {
        "input": [
          {
            "type": "torch.Tensor",
            "dtype": "torch.float32",
            "shape": [8, 32, 10, 10],
            "Max": 0.0015069986693561077,
            "Min": -0.001139344065450132,
            "Mean": 3.3215508210560074e-06,
            "Norm": 0.020567523315548897,
            "requires_grad": false,
            "data_name": "Module.conv2.Conv2d.backward.0.input.0.pt"
          }
        ],
        "output": [
          {
            "type": "torch.Tensor",
            "dtype": "torch.float32",
            "shape": [8, 16, 14, 14],
            "Max": 0.0007466732058674097,
            "Min": -0.00044813455315306783,
            "Mean": 6.814070275140693e-06,
            "Norm": 0.01474067009985447,
            "requires_grad": false,
            "data_name": "Module.conv2.Conv2d.backward.0.output.0.pt"
          }
        ]
      }
    }
  }
The L0 dump.json example is no longer valid JSON after removing the parameters_grad/backward entries: it now ends with a trailing comma and mismatched closing braces (}, then } …). Please update the closing braces (and remove the trailing comma) so the snippet is syntactically correct and copy/pasteable.
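A sketch of the shape the example should end with after the removal (keys abbreviated to the ones that remain) — this tail round-trips through a strict parser:

```python
import json

corrected = """{
  "data": {
    "Module.conv2.Conv2d.forward.0": {
      "input": [],
      "output": []
    }
  }
}"""

doc = json.loads(corrected)  # no trailing comma, braces balanced
print(sorted(doc["data"]))
```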
      ]
    },
    "Functional.relu.0.backward": {
      "input": [
        {
          "type": "torch.Tensor",
          "dtype": "torch.float32",
          "shape": [32, 16, 28, 28],
          "Max": 0.0001815402356442064,
          "Min": -0.00013352684618439525,
          "Mean": 0.00011915402356442064,
          "Norm": 0.007598237134516239,
          "requires_grad": false,
          "data_name": "Functional.relu.0.backward.input.0.pt"
        }
      ],
      "output": [
        {
          "type": "torch.Tensor",
          "dtype": "torch.float32",
          "shape": [32, 16, 28, 28],
          "Max": 0.0001815402356442064,
          "Min": -0.00012117840378778055,
          "Mean": 2.0098118724831693e-08,
          "Norm": 0.006532244384288788,
          "requires_grad": false,
          "data_name": "Functional.relu.0.backward.output.0.pt"
        }
      ]
    }
  }
}
The L1 dump.json example is now invalid JSON after removing the .backward section: it leaves a trailing comma after the Functional.relu.0.forward object and has extra closing braces. Please adjust the snippet so the data object and root object close cleanly without trailing commas.
Thank you
What this PR does / why we need it?

Fixes several documentation issues in the msprobe debugging guide as reported in vllm-project#6065:

1. **Remove unnecessary `cat` heredoc wrapper**: The example configuration section used a `cat <<'JSON'` bash wrapper around the JSON config. Simplified to a plain JSON code block.
2. **Fix duplicate chapter numbering**: Two sections were both numbered '2'. Renumbered sections sequentially (0-6).
3. **Fix msprobe command**: Changed `msprobe graph_visualize` to `msprobe -f pytorch graph` in section 5.2 Visualization.
4. **Remove backward-related content**: Since vLLM is inference-only (no training), removed all backward pass references, including backward tensor examples, parameter gradient examples, and backward descriptions from the dump.json explanations.

Does this PR introduce any user-facing change?

Documentation improvement only. No code changes.

How was this patch tested?

Manual review of the markdown file to verify all 4 issues from vllm-project#6065 are addressed.

Closes vllm-project#6065

- vLLM version: v0.16.0
- vLLM main: vllm-project/vllm@15d76f7

Signed-off-by: NJX-njx <3771829673@qq.com>