[spec] fix dp-attn + spec + trtllm-mla#14396

Closed
hnyls2002 wants to merge 5 commits into main from lsyin/fix-dp-spec-v2

@hnyls2002 (Collaborator) commented Dec 4, 2025

Completely fixing the issue requires:

  • Moving most of the metadata sync (currently in prepare_mlp_sync_batch) into Scheduler.prepare_mlp_sync_batch.
  • Possibly deprecating the skip_attn_backend_init API.
  • Refactoring ForwardMode into a more reliable status representation.

@github-actions github-actions bot added the blackwell SM100/SM120 label Dec 4, 2025
@hnyls2002
Collaborator Author

/tag-and-rerun-ci

@github-actions github-actions bot added the run-ci label Dec 4, 2025
@Fridge003
Collaborator

cc @rainj-me

dtype=q.dtype,
device=q.device,
)
q = self.pad_draft_extend_query(
Collaborator

After removing the pad_draft_extend_query call, maybe we should also delete the related code.

# Reshape output directly without slicing

if forward_batch.forward_mode.is_draft_extend(include_v2=True):
raw_out = self.unpad_draft_extend_output(
Collaborator

Same here.
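
For context on the two comments above, here is a minimal sketch of what a pad/unpad pair around an attention kernel typically does: a flat run of per-request query tokens is packed into a fixed-width grid, and the kernel output is later sliced back to the original lengths. The helper names `pad_ragged` and `unpad_ragged` are hypothetical and use plain Python lists; this is not the code of pad_draft_extend_query or unpad_draft_extend_output in this repository.

```python
from typing import List, Sequence


def pad_ragged(
    tokens: Sequence[float], seq_lens: Sequence[int], pad_value: float = 0.0
) -> List[List[float]]:
    """Pack a flat token list into a [batch, max_len] grid, padding short rows."""
    if sum(seq_lens) != len(tokens):
        raise ValueError("seq_lens must sum to the number of tokens")
    max_len = max(seq_lens, default=0)
    rows: List[List[float]] = []
    start = 0
    for n in seq_lens:
        row = list(tokens[start:start + n])
        rows.append(row + [pad_value] * (max_len - n))
        start += n
    return rows


def unpad_ragged(
    rows: Sequence[Sequence[float]], seq_lens: Sequence[int]
) -> List[float]:
    """Inverse of pad_ragged: keep the first seq_lens[i] entries of each row."""
    flat: List[float] = []
    for row, n in zip(rows, seq_lens):
        flat.extend(row[:n])
    return flat


# Three requests with 1, 3, and 2 draft tokens respectively.
flat_q = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
seq_lens = [1, 3, 2]

padded = pad_ragged(flat_q, seq_lens)
restored = unpad_ragged(padded, seq_lens)
```

If the padding call is dropped from the forward path, the matching unpadding step has nothing to undo, which is why both helpers would usually be removed together.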

else:
draft_logits_output, _ = self.draft_runner.forward(
- forward_batch, skip_attn_backend_init=True
+ forward_batch, skip_attn_backend_init=False
Collaborator

Skipping the attention backend init leads to a CUDA graph capture failure with the flashinfer backend.
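
As a rough illustration of why a "skip init" flag can be fragile, the toy sketch below models a runner whose forward call optionally skips rebuilding attention metadata. `ToyAttnBackend` and `ToyRunner` are hypothetical stand-ins, not SGLang classes; only the flag name skip_attn_backend_init is taken from the diff above, and its real semantics may differ.

```python
from typing import Dict, List, Optional


class ToyAttnBackend:
    """Hypothetical backend that caches per-batch metadata."""

    def __init__(self) -> None:
        self.metadata: Optional[Dict[str, int]] = None

    def init_forward_metadata(self, seq_lens: List[int]) -> None:
        self.metadata = {
            "batch_size": len(seq_lens),
            "max_seq_len": max(seq_lens),
        }


class ToyRunner:
    """Hypothetical runner; the flag shifts the init duty to the caller."""

    def __init__(self, backend: ToyAttnBackend) -> None:
        self.backend = backend

    def forward(
        self, seq_lens: List[int], skip_attn_backend_init: bool = False
    ) -> int:
        if not skip_attn_backend_init:
            self.backend.init_forward_metadata(seq_lens)
        if self.backend.metadata is None:
            raise RuntimeError("attention metadata was never initialized")
        # The "kernel" trusts whatever metadata is currently cached.
        return self.backend.metadata["batch_size"]


runner = ToyRunner(ToyAttnBackend())
fresh = runner.forward([4, 2])  # metadata built for 2 requests
stale = runner.forward([1, 1, 1], skip_attn_backend_init=True)  # reuses old metadata
```

In this toy model the second call silently runs with metadata sized for the previous batch, which is the kind of mismatch that makes such a flag a candidate for deprecation.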

@hnyls2002
Collaborator Author

This PR is closed due to #13115

@hnyls2002 hnyls2002 closed this Dec 7, 2025
@hnyls2002 hnyls2002 deleted the lsyin/fix-dp-spec-v2 branch December 7, 2025 13:58