@ooooo-create
Contributor

add unit tests for masked_per_token_quant

@paddle-bot
paddle-bot bot commented Sep 15, 2025

Thanks for your contribution!

Collaborator

@RichardWooSJTU left a comment

Thanks for your contribution!

from fastdeploy.model_executor.ops.gpu import masked_per_token_quant


def masked_per_token_quant_paddle(input_tensor, recv_expert_count, block_size):
Collaborator

rename to masked_per_token_quant_ref

Contributor Author

Thanks, fixed.

self.recv_expert_count = paddle.to_tensor([3, 2], dtype="int32")

# Get reference results from paddle implementation
self.quanted_x_paddle, self.quanted_scale_paddle = masked_per_token_quant_paddle(
Collaborator

rename _paddle to _ref

Collaborator

@RichardWooSJTU left a comment

LGTM

@RichardWooSJTU merged commit 2d64107 into PaddlePaddle:develop on Oct 13, 2025
25 of 28 checks passed