[Codegen][cuda-fp16] fallback to fp32 simulation when cuda arch < sm53 #4268
Conversation
I am not sure we really want to enable fp16 on devices that do not have native support; nevertheless, I think we can enable it. However, it would be great to separate this dep lib into a separate const string var that uses a string literal to include the files, so we won't have a long string in the content.
I agree, it would be better to put the code in a separate file instead of using string concatenation. The code itself is good.
src/codegen/codegen_cuda.cc
Outdated
@@ -50,6 +50,9 @@ void CodeGenCUDA::AddFunction(LoweredFunc f) {

std::string CodeGenCUDA::Finish() {
  if (enable_fp16_) {
    static constexpr const char* _cuda_half_t_def =
#include "literal/cuda_half_t.txt"
Consider still keeping it as a header file, so we can apply the license header and define a global constant.
@tqchen addressed.
This ensures fp16 can still be computed on old devices. The code is borrowed from mshadow.
An immediate use case is TVM op support in MXNet: MXNet is then able to run fp16 on devices that do not support fp16 natively.
@vinx13 @tqchen @Laurawly @reminisce Please advise.