[Codegen][cuda-fp16] fallback to fp32 simulation when cuda arch < sm53 #4268
Conversation
I am not sure we really want to enable fp16 on devices that do not have native support; nevertheless, I think we can enable it. However, it would be great to separate this dep lib into a separate const string var that uses a string literal to include the files, so we won't have a long string in the content.
I agree, it would be better to put the code in a separate file instead of using string concatenation. The code itself is good.
src/codegen/codegen_cuda.cc
Outdated
@@ -50,6 +50,9 @@ void CodeGenCUDA::AddFunction(LoweredFunc f) {

std::string CodeGenCUDA::Finish() {
  if (enable_fp16_) {
    static constexpr const char* _cuda_half_t_def =
#include "literal/cuda_half_t.txt"
Consider still keeping it as a header file, so we can apply the license header and define a global constant.
@tqchen addressed.
This ensures fp16 can still be computed on old devices. The code is borrowed from mshadow.
An immediate use case is TVM op support in MXNet: MXNet is then able to run fp16 on devices that do not support fp16 natively.
@vinx13 @tqchen @Laurawly @reminisce Please advise.