【Hackathon 9th No.20】add unit tests for masked_per_token_quant #4111
Thanks for your contribution!
RichardWooSJTU
left a comment
Thanks for the contribution!
from fastdeploy.model_executor.ops.gpu import masked_per_token_quant


def masked_per_token_quant_paddle(input_tensor, recv_expert_count, block_size):
rename to masked_per_token_quant_ref
Thx, updated.
self.recv_expert_count = paddle.to_tensor([3, 2], dtype="int32")

# Get reference results from paddle implementation
self.quanted_x_paddle, self.quanted_scale_paddle = masked_per_token_quant_paddle(
rename _paddle to _ref
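For readers unfamiliar with the pattern under review, here is a minimal sketch of what a `_ref`-style reference function with the signature `(input_tensor, recv_expert_count, block_size)` might look like. The body of the real function is not shown in this thread, so everything below is an assumption made for illustration: the name `masked_per_token_quant_ref_sketch`, the nested-list layout `[experts][tokens][hidden]`, the `QMAX` range, the small epsilon, and the choice to write zeros for masked tokens. It uses plain Python rather than paddle so that it runs without a GPU build.

```python
# Assumption: 448.0 is used as the symmetric quantization range (the largest
# finite FP8 E4M3 value). The thread does not state the actual target range.
QMAX = 448.0
EPS = 1e-4  # assumption: floor on the block maximum to avoid dividing by zero


def masked_per_token_quant_ref_sketch(input_tensor, recv_expert_count, block_size):
    """Hypothetical reference: block-wise per-token quantization with masking.

    input_tensor is a nested list shaped [experts][tokens][hidden].
    Only the first recv_expert_count[e] tokens of expert e are quantized;
    the remaining (padding) tokens get zeros, which is an assumption.
    """
    quanted, scales = [], []
    for expert, tokens in enumerate(input_tensor):
        valid = recv_expert_count[expert]
        q_tokens, s_tokens = [], []
        for t, row in enumerate(tokens):
            q_row, s_row = [], []
            for start in range(0, len(row), block_size):
                block = row[start:start + block_size]
                if t >= valid:
                    # Masked token: no real data was received for this slot.
                    q_row.extend([0.0] * len(block))
                    s_row.append(0.0)
                    continue
                # One scale per block, derived from the block's max magnitude.
                amax = max(abs(v) for v in block)
                scale = max(amax, EPS) / QMAX
                q_row.extend(max(-QMAX, min(QMAX, v / scale)) for v in block)
                s_row.append(scale)
            q_tokens.append(q_row)
            s_tokens.append(s_row)
        quanted.append(q_tokens)
        scales.append(s_tokens)
    return quanted, scales


# Two experts, three token slots each, hidden size 4; counts mirror the [3, 2]
# used in the test setup, so the last slot of expert 1 is masked.
x = [[[1.0, -2.0, 4.0, 0.5] for _ in range(3)] for _ in range(2)]
quanted_x_ref, quanted_scale_ref = masked_per_token_quant_ref_sketch(x, [3, 2], 2)
```

A unit test in this style would then compare the custom op's output against `quanted_x_ref` and `quanted_scale_ref`, typically only over the valid (unmasked) token slots.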
…create/FastDeploy into ut_masked_per_token_quant
RichardWooSJTU
left a comment
LGTM
add unit tests for masked_per_token_quant