Skip to content

complex gemm implementation#626

Closed
TonyYHsieh wants to merge 8 commits into
ROCm:developfrom
TonyYHsieh:complex_gemm_base_on_PR_616
Closed

complex gemm implementation#626
TonyYHsieh wants to merge 8 commits into
ROCm:developfrom
TonyYHsieh:complex_gemm_base_on_PR_616

Conversation

@TonyYHsieh
Copy link
Copy Markdown
Contributor

  1. complex gemm(_ex) and gemm_strided_batched(ex) implementation
  2. rocblas test and benchmark for complex

resolves #___

Summary of proposed changes:

Comment thread library/include/rocblas_bfloat16.h Outdated
Comment thread library/include/rocblas-functions.h Outdated
@amcamd amcamd requested a review from daineAMD July 30, 2019 15:14
Comment thread clients/include/testing_gemm.hpp Outdated
Comment thread clients/include/testing_gemm.hpp Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rocblas_isnan() should be applied to arg.beta rather than arg.get_beta<T>().

Copy link
Copy Markdown
Contributor Author

@TonyYHsieh TonyYHsieh Aug 1, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understand, I correct it on third commit "update complex change".
But, do we have to check nan for betai too??
It is only used in complex type.
Should we use another template function for is_nan check??

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is_nan should be defined for rocblas_float_complex and rocblas_double_complex, which will return true if either the real or imaginary part is NaN.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But since arg.beta and arg.betai are always floating-point types, you can use std::isnan(arg.beta) || std::isnan(arg.betai).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

understand. update it in commit "update complex change 2".

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

arg.get_beta<T>() should already consider whether rocblas_isnan(arg.beta) and return accordingly.

Copy link
Copy Markdown
Contributor Author

@TonyYHsieh TonyYHsieh Aug 1, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

understand. This is my mistake.
Update it on third commit "update complex change"

Comment thread clients/include/testing_gemv.hpp Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rocblas_isnan(arg.beta) should be left alone -- it is separate from arg.get_beta<T>().

Copy link
Copy Markdown
Contributor Author

@TonyYHsieh TonyYHsieh Aug 1, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understand, I will correct it on third commit "update complex change".
But, do we have to check nan for betai too??
It is only used in complex type.
Should we use another template function for is_nan check??

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rocblas_isnan(arg.beta) || rocblas_isnan(arg.betai). It does not matter that betai is only used during complex. It will default to 0 in non-complex cases.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

understand. update it in commit "update complex change 2".

Comment thread library/include/rocblas_bfloat16.h Outdated
@whchung
Copy link
Copy Markdown

whchung commented Jul 31, 2019

+@deven-amd +@jerryyin for awareness. CGEMM / ZGEMM in rocBLAS.

Comment thread library/src/blas3/Tensile/gemm.h Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might want a trailing comma, for consistency

Copy link
Copy Markdown
Contributor Author

@TonyYHsieh TonyYHsieh Aug 1, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understand.
I correct it on third commit "update complex change".

Comment thread library/src/blas_ex/rocblas_gemm_ex.hpp Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::is_same<T1,T2>{} can be used as shorthand instead of std::is_same<T1,T2>::value, but this is not critical.

Copy link
Copy Markdown
Contributor Author

@TonyYHsieh TonyYHsieh Aug 1, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understand, I correct it on third commit "update complex change".

Comment thread library/src/blas_ex/rocblas_gemm_ex.hpp Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unsigned(...) can be used instead of static_cast<unsigned int>(...) for brevity.

Copy link
Copy Markdown
Contributor Author

@TonyYHsieh TonyYHsieh Aug 1, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understand,
I correct it on third commit "update complex change".

Copy link
Copy Markdown
Contributor

@leekillough leekillough left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ignore the CI build failures, which are global. Saad is working on a fix for them.

I have verified that the full rocBLAS tests pass on GFX900 and GFX906.

I will look at the code some more, but I approve the changes, as long as:

  1. Complex values for alpha and beta are tested sufficiently.
  2. Complex values for A, B, C, D matrices are tested sufficiently.
  3. Most of the concerns I've raised in comments are addressed or downplayed, such as the isnan stuff with beta. Some of the suggestions can be addressed in a later refactoring change.

Copy link
Copy Markdown
Contributor

@amcamd amcamd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to review alpha and beta behavior when they have NaN values

@TonyYHsieh
Copy link
Copy Markdown
Contributor Author

TonyYHsieh commented Aug 6, 2019

Need to review alpha and beta behavior when they have NaN values

Library side : didn’t treat NaN as a special case, so NaN operate with other value will generate NaN.
This implementation looks same as cblas implementation.

Client side: set C as NaN if beta is NaN and set beta as 0 when beta is NaN
I follow previous NaN implementation when implement complex gemm.
So what is the expectation about NaN implementation??

@TonyYHsieh
Copy link
Copy Markdown
Contributor Author

TonyYHsieh commented Aug 6, 2019

Ignore the CI build failures, which are global. Saad is working on a fix for them.

I have verified that the full rocBLAS tests pass on GFX900 and GFX906.

I will look at the code some more, but I approve the changes, as long as:

Lee : 1. Complex values for alpha and beta are tested sufficiently.

Reply :We have add alpha and beta imaginary value in rocblas-test for complex gemm
Which matrix size and other parameter is same real type.
Do we need more other testing for complex gemm??

Lee: 2. Complex values for A, B, C, D matrices are tested sufficiently.

Reply: Same as item a. We have add complex test item in rocblas-test.
Which matrix size and other parameters is same real type.
All test item is pass.

Lee: 3. Most of the concerns I've raised in comments are addressed or downplayed, such as the isnan stuff with beta. Some of the suggestions can be addressed in a later refactoring change.

Reply :We have modified implementation according to your comment and update in other commits in this PR (do we need to squash all commits to one commit ??).
We only met issue on one comment. ”bfloat16 to double conversion”
We met build error when convert bfloat16 to double if we didn’t add double conversion.
So we have to add double conversion. Compile didn’t convert bfloat16 to float and to double when double(bfloat16{1.0}).

Comment thread clients/gtest/gemm_gtest.cpp Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would remove comments about "Complex is not supported yet".

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

understand. update it in commit "update complex change 2".

Comment thread clients/gtest/gemm_gtest.yaml Outdated
Comment thread clients/include/norm.hpp Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use C++14 for our internal code (not external C interfaces), so it's okay to use auto as a placeholder for function return type.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

understand. update it in commit "update complex change 2".

Comment thread clients/include/norm.hpp Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: Consolidate these functions into a smaller number of templates , varying them only by template arguments.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to wrap rocblas_half to a class form to get conversion operation from half to double.
I think this can be done later.

Comment thread clients/include/norm.hpp Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pointers to void are not allowed as template arguments. It should be changed to:
template <typename T, typename std::enable_if<!is_complex<T>, int>::type = 0>

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

understand. update it in commit "update complex change 2".

Comment thread clients/include/norm.hpp Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above:
template <typename T, typename std::enable_if<!is_complex<T>, int>::type = 0>

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

understand. update it in commit "update complex change 2".

Comment thread clients/include/testing_gemm.hpp Outdated
Comment thread clients/include/testing_gemm_ex.hpp Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may need to be changed to:

bool nantest = rocblas_isnan(arg.beta) || rocblas_isnan(arg.betai);

For non-complex types, betai is 0 and so rocblas_isnan(arg.betai) is false.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

understand. update it in commit "update complex change 2".

Comment thread library/src/blas3/Tensile/gemm.cpp Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment is a little off.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

understand. update it in commit "update complex change 2".

Comment thread library/src/blas_ex/rocblas_gemm_ex.hpp Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo -- COPMLEX should be COMPLEX

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

understand. update it in commit "update complex change 2".

Comment thread library/src/blas_ex/rocblas_gemm_ex.hpp Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same typo throughout file -- maybe do global search and replacce

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

understand. update it in commit "update complex change 2".

Comment thread clients/common/norm.cpp Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with moving norm.cpp to a header file norm.hpp.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for agreement :)

Copy link
Copy Markdown
Contributor

@leekillough leekillough left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once the changes I commented on are made, and the CI tests pass, I approve the changes.

@amdkila
Copy link
Copy Markdown
Contributor

amdkila commented Aug 7, 2019

@TonyYHsieh Your PR failed CI due to minor formatting differences on just 1 file (rocblas_gemm_ex.cpp), so I re-ran the formatter. It should pass the format check now.

TonyYHsieh and others added 8 commits August 8, 2019 18:45
1. complex gemm(_ex) and gemm_strided_batched(ex) implementation
2. rocblas test and benchmark for complex
1. fix typo
2. fix beta nan condition check
3. fix invalid style of template function
4. fix bench log
5. add more alpha/beta setting for complex rocblas-test
@TonyYHsieh TonyYHsieh closed this Aug 9, 2019
mlse-lib-jenkins pushed a commit that referenced this pull request Apr 14, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants