fp16 and bfloat16 precision for dot #655

Merged
daineAMD merged 6 commits into ROCm:develop from daineAMD:dotbf16 on Aug 22, 2019

Conversation

@daineAMD
Contributor

Summary of proposed changes:

  • add fp16 precision with fp32 accumulation for dot
  • add bfloat16 precision with fp32 accumulation for dot
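
To illustrate what "bfloat16 precision with fp32 accumulation" means for a dot product, here is a minimal host-side sketch. It is not the rocBLAS kernel: `bf16_sketch`, `float_to_bf16`, `bf16_to_float`, and `dot_bf16_sketch` are hypothetical names, the conversion truncates instead of rounding, and the result is assumed to be returned in bfloat16.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a bfloat16 type: the upper 16 bits of an
// IEEE-754 binary32 value. This is not rocBLAS's own bfloat16 type.
struct bf16_sketch
{
    std::uint16_t bits;
};

// Convert float -> bfloat16 by truncating the low 16 bits
// (a simplification; real implementations typically round).
inline bf16_sketch float_to_bf16(float f)
{
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof(u));
    return bf16_sketch{static_cast<std::uint16_t>(u >> 16)};
}

// Convert bfloat16 -> float by placing the 16 bits in the high half.
inline float bf16_to_float(bf16_sketch b)
{
    std::uint32_t u = static_cast<std::uint32_t>(b.bits) << 16;
    float         f;
    std::memcpy(&f, &u, sizeof(f));
    return f;
}

// Dot product with bfloat16 inputs and output, but a float accumulator.
inline bf16_sketch dot_bf16_sketch(const std::vector<bf16_sketch>& x,
                                   const std::vector<bf16_sketch>& y)
{
    assert(x.size() == y.size());
    float acc = 0.0f; // accumulate in fp32, not in bfloat16
    for(std::size_t i = 0; i < x.size(); ++i)
        acc += bf16_to_float(x[i]) * bf16_to_float(y[i]);
    return float_to_bf16(acc); // narrow only once, at the end
}
```

The wider accumulator matters because bfloat16 carries only 8 significant bits: 256 + 1 is not representable, so a running sum kept in bfloat16 would stall at 256 when adding ones. With the float accumulator above, the dot product of two length-512 vectors of ones comes out as exactly 512.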

daineAMD requested review from amcamd and leekillough on August 15, 2019 17:48

// Template to dispatch testing_gemm_ex for performance tests
// When Ti == void or complex, the test is marked invalid
// When Ti == void or complex or Ti == To == Tc == bfloat16, the test is marked invalid
template <typename Ti, typename To = Ti, typename Tc = To, typename = void>

Contributor

I do not see a test for Ti == complex. Can the comment be updated if there is no test for complex?

Contributor Author

@daineAMD daineAMD Aug 20, 2019

The gemm_ex and gemm_strided_batched_ex templates work by exclusion rather than inclusion, unlike the other templates. The exclusion of complex types was removed when complex gemm was added. I added an exclusion for bfloat16 types here because they are now permitted in type_dispatch.hpp but are not supported by gemm_ex. I changed the comment to reflect this in #f746ec6.
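
As a rough illustration of dispatch "by exclusion", the sketch below marks every type combination valid by default and uses a partial specialization to carve out the excluded cases. The names `gemm_ex_dispatch_sketch` and `bf16_tag` are hypothetical, and the exclusion condition (Ti == void, or Ti == To == Tc == bfloat16) is taken from the code comment quoted above; the actual rocBLAS test templates may be structured differently.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical tag standing in for a bfloat16 type.
struct bf16_tag
{
};

// Primary template: any combination not explicitly excluded is valid.
template <typename Ti, typename To = Ti, typename Tc = To, typename = void>
struct gemm_ex_dispatch_sketch
{
    static constexpr bool valid = true;
};

// Exclusion: Ti == void, or all three types are bfloat16.
template <typename Ti, typename To, typename Tc>
struct gemm_ex_dispatch_sketch<
    Ti,
    To,
    Tc,
    typename std::enable_if<std::is_void<Ti>::value
                            || (std::is_same<Ti, bf16_tag>::value
                                && std::is_same<To, bf16_tag>::value
                                && std::is_same<Tc, bf16_tag>::value)>::type>
{
    static constexpr bool valid = false;
};

// Compile-time checks of the behavior described above.
static_assert(gemm_ex_dispatch_sketch<float>::valid, "float is not excluded");
static_assert(!gemm_ex_dispatch_sketch<void>::valid, "void is excluded");
static_assert(!gemm_ex_dispatch_sketch<bf16_tag>::valid, "all-bfloat16 is excluded");
static_assert(gemm_ex_dispatch_sketch<bf16_tag, bf16_tag, float>::valid,
              "bfloat16 inputs with a float compute type are not excluded");
```

With this pattern, permitting a new type elsewhere (as bfloat16 was in type_dispatch.hpp) makes it valid here automatically, so an unsupported combination has to be excluded explicitly.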

Comment thread clients/include/rocblas_common.yaml Outdated
#############################################
Half bfloat single double complex real: &half_bfloat_single_double_complex_real_precisions
- *half_precision
- *bfa_precision

Contributor

I think bf16 is more descriptive than bfa. Is it possible to make this bf16?

Contributor Author

Good point, changed in #f746ec6.

@daineAMD
Contributor Author

All tests (quick, pre-checkin, nightly) pass on gfx900 and gfx906.

Contributor

@amcamd amcamd left a comment

LGTM

Contributor

@leekillough leekillough left a comment

LGTM

@daineAMD daineAMD merged commit 73a01cf into ROCm:develop Aug 22, 2019
@daineAMD daineAMD deleted the dotbf16 branch October 29, 2019 20:05
mlse-lib-jenkins pushed a commit that referenced this pull request Apr 26, 2021
Co-authored-by: Andrew Chapman <andrew.chapman@amd.com>
3 participants