[Inference] Refine global search optimization for cuBLASLt and apply it in INT8 GEMM.#65597
Merged
ming1753 merged 2 commits into PaddlePaddle:develop on Jul 4, 2024
Your PR was submitted successfully. Thank you for contributing to this open-source project!
int repeats = search_times_;

for (int loop = 0; loop < repeats; loop++) {
  status = dynload::cublasLtMatmul(handle,
Contributor
Should there be a warmup before the timed runs begin?
Contributor
Author
In our measurements, running with or without a warmup made no difference to the final performance.
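The warmup question above can be illustrated with a minimal host-side sketch. It is not the PR's code: the PR times `cublasLtMatmul` with CUDA events, whereas this sketch assumes a plain callable and `std::chrono`, and the helper name `MeanTimeMs` and its `warmup` parameter are hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper (not from the PR): runs `fn` for `warmup` untimed
// iterations, then times `repeats` iterations and returns the mean time
// per iteration in milliseconds.
inline double MeanTimeMs(const std::function<void()>& fn,
                         int repeats,
                         int warmup = 0) {
  assert(repeats > 0);
  for (int i = 0; i < warmup; ++i) {
    fn();  // untimed warmup runs
  }
  const auto start = std::chrono::steady_clock::now();
  for (int loop = 0; loop < repeats; ++loop) {
    fn();  // timed runs
  }
  const auto stop = std::chrono::steady_clock::now();
  const std::chrono::duration<double, std::milli> elapsed = stop - start;
  return elapsed.count() / repeats;
}
```

Comparing `MeanTimeMs(fn, repeats, 0)` against `MeanTimeMs(fn, repeats, 3)` for the same candidate is one way to check whether a warmup changes the ranking of algorithms on a given machine.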
}

template <typename InT, typename OutT>
void TestMatmulRun(cublasLtHandle_t handle,
Contributor
Author
This has been changed to RunAndMeasureAlgo.
yuanlehome
reviewed
Jul 2, 2024
@@ -0,0 +1,703 @@
/* Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved.
Comment on lines 15 to 17
#pragma once

#pragma once

namespace phi {
namespace funcs {
namespace cublaslt_internal {
Contributor
Is it necessary to add a new cublaslt_internal namespace?
Contributor
Author
This feature is disabled by default and is not planned to be exposed externally, so adding a namespace to mark that should be fine, right?
Comment on lines 481 to 496
TestMatmulRun(handle,
              matmul_desc,
              a_desc,
              b_desc,
              bias_desc,
              c_desc,
              alpha,
              beta,
              a,
              b,
              bias,
              c,
              params[i],
              start_event,
              stop_event,
              stream);
Contributor
The function checks for failure internally, but nothing here makes use of that result. Consider having TestMatmulRun return a bool indicating whether it failed, with corresponding handling logic here.
Contributor
Author
The handling is inside the function: when a failure is detected, the time is recorded as max.
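The approach described in this reply can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: it assumes each candidate's measured time is a `float`, that a failed run is recorded as `std::numeric_limits<float>::max()`, and the names `kFailedTime` and `SelectBestAlgo` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Assumed sentinel: a failed candidate has its time recorded as max.
constexpr float kFailedTime = std::numeric_limits<float>::max();

// Hypothetical selection step: returns the index of the fastest candidate,
// or -1 when the list is empty or every candidate failed.
inline int SelectBestAlgo(const std::vector<float>& times) {
  int best = -1;
  float best_time = kFailedTime;
  for (std::size_t i = 0; i < times.size(); ++i) {
    // A failed candidate equals kFailedTime, so the strict comparison
    // never selects it.
    if (times[i] < best_time) {
      best_time = times[i];
      best = static_cast<int>(i);
    }
  }
  return best;
}
```

With a sentinel like this, failed candidates drop out of the comparison without a separate bool return value. The caller still needs to handle the `-1` case, where no candidate succeeded.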
yuanlehome
approved these changes
Jul 4, 2024
PR Category
Inference
PR Types
New features
Description
Pcard-71500
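Extracts the cuBLASLt matmul global search algorithm from the FP8 code into a standalone header file and applies it to both FP8 and INT8 matmul.
Adds a new flag, FLAGS_enable_blaslt_global_search, which defaults to false (feature disabled).
When enabled, INT8 matmul uses cuBLASLt global search to find the best kernel and caches the result in "./paddle_cublaslt_cache". The first search takes somewhat longer; afterwards, identical matmuls reuse the cache and are not searched again.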
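The search-once, reuse-afterwards behavior described above can be sketched as a small persistent cache. Everything here is assumed for illustration: the class `AlgoCache`, its key derivation from the matmul dimensions, and the plain-text file layout are hypothetical and do not reflect the actual format of "./paddle_cublaslt_cache".

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <initializer_list>
#include <string>
#include <unordered_map>

// Hypothetical cache mapping a matmul problem key to the index of the
// algorithm chosen by a previous search.
class AlgoCache {
 public:
  // Combines the problem dimensions into one key (FNV-1a style mixing).
  static uint64_t Key(int64_t m, int64_t n, int64_t k) {
    uint64_t h = 1469598103934665603ULL;
    for (int64_t v : {m, n, k}) {
      h ^= static_cast<uint64_t>(v);
      h *= 1099511628211ULL;
    }
    return h;
  }

  // Returns true and writes *algo when the key was cached earlier.
  bool Find(uint64_t key, int* algo) const {
    auto it = map_.find(key);
    if (it == map_.end()) return false;
    *algo = it->second;
    return true;
  }

  void Insert(uint64_t key, int algo) { map_[key] = algo; }

  // Writes one "key algo" pair per line.
  bool Save(const std::string& path) const {
    std::ofstream out(path);
    if (!out) return false;
    for (const auto& kv : map_) out << kv.first << ' ' << kv.second << '\n';
    return static_cast<bool>(out);
  }

  // Reads pairs written by Save; entries already in memory are overwritten.
  bool Load(const std::string& path) {
    std::ifstream in(path);
    if (!in) return false;
    uint64_t key = 0;
    int algo = 0;
    while (in >> key >> algo) map_[key] = algo;
    return true;
  }

 private:
  std::unordered_map<uint64_t, int> map_;
};
```

In this sketch a caller would try `Find` first, run the search only on a miss, then `Insert` the result and `Save` it so that later runs with the same shape skip the search. A real key would likely also need to cover data types, layouts, and epilogue settings, since those affect which kernel is fastest.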