fp16 BLAS implementation patch #5
}
stream << "{" << std::endl;
stream.inc_tab();
These two branches for the kernel argument can probably be replaced by:
std::string abdtype = (sdtype == "half") ? "float" : sdtype;
Yes, I will modify it as you suggest.
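For reference, a minimal sketch of the single-expression form suggested above. The helper name `scalar_arg_dtype` is hypothetical, and the rule that only `"half"` is promoted to `"float"` is taken from the review comment rather than from the library sources.

```cpp
#include <string>

// Hypothetical helper (not part of the patch): choose the type name used for
// the scalar kernel arguments, i.e. the `abdtype` of the suggestion above.
// "half" is promoted to "float"; every other type name passes through.
inline std::string scalar_arg_dtype(const std::string& sdtype) {
    return (sdtype == "half") ? std::string("float") : sdtype;
}
```

Wrapping the expression in one function keeps the promotion rule in a single place instead of repeating it in each branch.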
- union ISAACAPI values_holder
+ struct ISAACAPI values_holder
  {
Why couldn't we keep a union here? Using a struct instead will multiply the size of the class by 4, which will increase the (already high) overhead associated with expression tree construction and parsing.
This is because of the newly introduced class half; using a struct avoids a compile error.
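To make the trade-off in this exchange concrete, here is a small sketch. It assumes the compile error comes from `half` having a user-provided constructor, which the thread does not confirm. The types `half_t`, `holder_union` and `holder_struct` are hypothetical stand-ins with fewer members than the real `values_holder`. Before C++11 a union could not contain such a member at all; from C++11 on it can, provided the union supplies its own constructor.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical 16-bit wrapper with a user-provided (non-trivial) constructor.
struct half_t {
    std::uint16_t bits;
    half_t() : bits(0) {}
};

// Union version: all members share storage, so the size is that of the
// largest member. Because half_t's default constructor is non-trivial, the
// union's implicit default constructor is deleted and one is written by hand.
union holder_union {
    std::int32_t int32;
    float        float32;
    double       float64;
    half_t       float16;
    holder_union() : float64(0.0) {}
};

// Struct version: every member gets its own storage, so the size grows with
// the number of members (plus padding).
struct holder_struct {
    std::int32_t int32;
    float        float32;
    double       float64;
    half_t       float16;
};

inline std::size_t union_size()  { return sizeof(holder_union); }
inline std::size_t struct_size() { return sizeof(holder_struct); }
```

Under these assumptions the union stays at the size of a `double`, while the struct is strictly larger, which is the overhead the reviewer is concerned about.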
case FLOAT_TYPE: { float t = v.float32; return t; } break;
case DOUBLE_TYPE: { double t = v.float64; return t; } break;
default: throw unknown_datatype(dtype_);
}
Wouldn't it work to just keep a macro here (if the union is not changed to a struct), even for the half-type?
The current code has to use the struct mentioned above, so I commented out the macro here.
}
}
}
else{
Again, I would prefer a single declaration of:
numeric_type abtype = (dtype == HALF_TYPE) ? FLOAT_TYPE : dtype;
over all these branches :)
I will follow your suggestion and modify them, thanks.
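The enum form of the same idea could be sketched as below. The enum here is a reduced stand-in for the library's `numeric_type`, and `scalar_type_for` is a hypothetical name. Only the promotion of `HALF_TYPE` to `FLOAT_TYPE` comes from the comment above.

```cpp
// Reduced stand-in for the library's numeric_type enum (the real one has
// more entries); declared here only so the sketch compiles on its own.
enum numeric_type { HALF_TYPE, FLOAT_TYPE, DOUBLE_TYPE };

// Hypothetical helper: the type used for the scalar arguments.
// HALF_TYPE is promoted to FLOAT_TYPE; other types are left unchanged.
constexpr numeric_type scalar_type_for(numeric_type dtype) {
    return (dtype == HALF_TYPE) ? FLOAT_TYPE : dtype;
}

// The rule can be checked at compile time.
static_assert(scalar_type_for(HALF_TYPE) == FLOAT_TYPE, "half promotes to float");
static_assert(scalar_type_for(DOUBLE_TYPE) == DOUBLE_TYPE, "double is unchanged");
```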
INSTANTIATEHALFOP(double)
}
To be honest, I don't understand the purpose of this class, as opposed to some uint16_t, since cl_mem is blind to the underlying datatype anyway.
This class adds half as a distinct data type rather than just a typedef of uint16_t. With it, the code can recognize "half" throughout the pipeline without compile errors.
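One way to read this reply is that a typedef of `uint16_t` cannot be told apart from an ordinary unsigned 16-bit integer in overloads or template specializations, whereas a dedicated class can. The sketch below illustrates that point only. `half_t`, `type_name` and the returned strings are hypothetical and are not taken from the patch.

```cpp
#include <cstdint>
#include <string>

// Hypothetical distinct 16-bit type. It has the same storage as uint16_t,
// but the compiler treats it as a different type.
struct half_t {
    std::uint16_t bits;
};

// Hypothetical trait mapping a C++ type to the type name used in kernels.
template <class T> struct type_name;
template <> struct type_name<std::uint16_t> { static std::string get() { return "ushort"; } };
template <> struct type_name<half_t>        { static std::string get() { return "half"; } };
template <> struct type_name<float>         { static std::string get() { return "float"; } };

static_assert(sizeof(half_t) == sizeof(std::uint16_t), "same storage size");
```

With a plain `typedef std::uint16_t half_t;` the second specialization would be a redefinition of the first, so "half" and "ushort" could not be distinguished by type.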
Change-Id: If50aacd9754dcea3ab333d6e832bb38a5e952c8a
Change-Id: I6040dbf2b4372dcb7304566242fb39d377681459
I have refined the code to remove half from the value scalar; please take a look. Thank you!
That looks good to me! Feel free to submit a PR in isaac's dev repository :)
@ptillet, will submit to your repo soon. @listenlink, thanks for your contribution.
Hi @gongzg,
This patch implements the fp16 functionality; the kernel functions, the BLAS API, and the tuning mechanism are all included.
It passes our entire clBLAS half test suite.
I have not added the half version of intelblas_gemm yet, because our intel_gemm implementation is still under review upstream; this keeps the PR a clean, independent patch for upstream to merge.