[Codegen][cuda-fp16] fallback to fp32 simulation when cuda arch < sm53 #4268

Merged
merged 8 commits into from
Nov 10, 2019

yzhliu
Member

@yzhliu yzhliu commented Nov 7, 2019

This ensures fp16 can still be computed on older devices that lack native fp16 support, by falling back to fp32 simulation.
The code is borrowed from mshadow.

An immediate use case is TVM op support in MXNet: MXNet is able to run fp16 on devices that do not support fp16 natively.
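
For readers unfamiliar with the idea, here is a minimal host-side sketch of what "fp32 simulation" of fp16 can look like: the value is stored as 16 bits, and every arithmetic operation converts to float, computes, and converts back. This is not the mshadow code or the code added in this PR. The names `HalfSim`, `FloatToHalfBits`, and `HalfBitsToFloat` are hypothetical, and the rounding is simplified to round-to-nearest with ties away from zero.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Convert an IEEE 754 binary32 value to binary16 bits.
// Simplification: rounds to nearest, ties away from zero.
inline uint16_t FloatToHalfBits(float f) {
  uint32_t x;
  std::memcpy(&x, &f, sizeof(x));
  const uint32_t sign = (x >> 16) & 0x8000u;
  const uint32_t raw_exp = (x >> 23) & 0xFFu;
  uint32_t mant = x & 0x7FFFFFu;
  if (raw_exp == 0xFFu) {  // inf or NaN
    return static_cast<uint16_t>(sign | 0x7C00u | (mant ? 0x200u : 0u));
  }
  const int32_t exp = static_cast<int32_t>(raw_exp) - 127 + 15;  // rebias
  if (exp >= 31) return static_cast<uint16_t>(sign | 0x7C00u);   // overflow to inf
  if (exp <= 0) {  // result is a half subnormal or zero
    if (exp < -10) return static_cast<uint16_t>(sign);
    mant |= 0x800000u;  // make the implicit leading 1 explicit
    const uint32_t shift = static_cast<uint32_t>(14 - exp);
    uint32_t m = mant >> shift;
    if ((mant >> (shift - 1)) & 1u) ++m;
    return static_cast<uint16_t>(sign | m);
  }
  uint32_t h = sign | (static_cast<uint32_t>(exp) << 10) | (mant >> 13);
  if (mant & 0x1000u) ++h;  // a carry out of the mantissa bumps the exponent
  return static_cast<uint16_t>(h);
}

// Convert binary16 bits back to a binary32 value (always exact).
inline float HalfBitsToFloat(uint16_t h) {
  const uint32_t sign = static_cast<uint32_t>(h & 0x8000u) << 16;
  const uint32_t exp = (h >> 10) & 0x1Fu;
  uint32_t mant = h & 0x3FFu;
  uint32_t x;
  if (exp == 0) {
    if (mant == 0) {
      x = sign;  // signed zero
    } else {
      // Subnormal half: shift until the leading 1 reaches bit 10.
      uint32_t e = 0;
      mant <<= 1;
      while ((mant & 0x400u) == 0) { mant <<= 1; ++e; }
      x = sign | ((127u - 15u - e) << 23) | ((mant & 0x3FFu) << 13);
    }
  } else if (exp == 31) {
    x = sign | 0x7F800000u | (mant << 13);  // inf or NaN
  } else {
    x = sign | ((exp + 127u - 15u) << 23) | (mant << 13);
  }
  float f;
  std::memcpy(&f, &x, sizeof(f));
  return f;
}

// Hypothetical fp16 stand-in: 16-bit storage, fp32 arithmetic.
struct HalfSim {
  uint16_t bits;
  HalfSim() : bits(0) {}
  explicit HalfSim(float f) : bits(FloatToHalfBits(f)) {}
  explicit operator float() const { return HalfBitsToFloat(bits); }
};

inline HalfSim operator+(HalfSim a, HalfSim b) {
  return HalfSim(static_cast<float>(a) + static_cast<float>(b));
}
inline HalfSim operator*(HalfSim a, HalfSim b) {
  return HalfSim(static_cast<float>(a) * static_cast<float>(b));
}

// Small sanity check on a value whose fp16 encoding is well known.
inline void SanityCheck() {
  assert(FloatToHalfBits(1.0f) == 0x3C00);
}
```

The storage cost stays at 16 bits per element, while each operation pays for two conversions, which is the trade-off tqchen's comment below is weighing.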

@vinx13 @tqchen @Laurawly @reminisce Please advise.

@tqchen
Member

tqchen commented Nov 7, 2019

I am not sure we really want to enable fp16 on devices that do not have native support. Nevertheless, I think we can enable it. However, it would be great to move this dependency library into a separate const string variable that uses a string literal to include the files, so we don't have a long string inline in the code.

@tqchen tqchen added the "status: need update" label Nov 7, 2019
@vinx13
Member

vinx13 commented Nov 7, 2019

I agree it would be better to put the code in a separate file instead of using string concatenation. The code itself looks good.

@yzhliu
Member Author

yzhliu commented Nov 8, 2019

@vinx13 @tqchen please review again

@@ -50,6 +50,9 @@ void CodeGenCUDA::AddFunction(LoweredFunc f) {

std::string CodeGenCUDA::Finish() {
if (enable_fp16_) {
static constexpr const char* _cuda_half_t_def =
#include "literal/cuda_half_t.txt"
Member

Consider still keeping it as a header file, so that we can apply the license header, and define a global constant in it.

Member Author

@tqchen addressed.
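
A rough sketch of the shape this review thread converges on, for illustration only: the fallback definition lives in a global constant written as a raw string literal (which in a real tree would sit in its own header, under the license comment), and the codegen step emits it behind an architecture guard. The names `kCudaHalfFallbackDef` and `EmitPrelude` are hypothetical and not taken from the merged commit. The body of the literal is a placeholder, and the `>= 530` guard is an assumption based on the PR title (fallback when the CUDA arch is below sm53).

```cpp
#include <string>

// Hypothetical global constant holding the fallback definition.
// In practice this would be defined in a separate header file so the
// license header can be applied to it.
static constexpr const char* kCudaHalfFallbackDef = R"(
typedef unsigned short uint16_t;
// fp32-simulated half type goes here (elided in this sketch)
)";

// Hypothetical helper with the same role as a codegen Finish() step:
// prepend the fp16 prelude to the generated kernel source when needed.
inline std::string EmitPrelude(bool enable_fp16, const std::string& body) {
  std::string out;
  if (enable_fp16) {
    // Assumed guard: native half on sm53 and newer, simulated half otherwise.
    out += "#if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ >= 530)\n";
    out += "#include <cuda_fp16.h>\n";
    out += "#else\n";
    out += kCudaHalfFallbackDef;
    out += "#endif\n";
  }
  out += body;
  return out;
}
```

Compared with concatenating many short quoted strings, a raw string literal keeps the embedded CUDA source readable and diffable, and a named constant keeps `Finish()` itself short.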

@tqchen tqchen merged commit 801cf0e into apache:master Nov 10, 2019
@tqchen
Member

tqchen commented Nov 10, 2019

Thanks @yzhliu @vinx13 !


3 participants