ROCm 2.9 merge develop into master #690
Merged
merge master into develop for ROCm 2.8
* Changes to CMake to support changes in Tensile
* Fixing transposed arguments.
* Adding missing dependency
* Adding additional required dependencies and CMake flags.
* Fixing transposed arguments.
1. complex gemm(_ex) and gemm_strided_batched(_ex) implementation
2. rocblas test and benchmark for complex
…all script (ROCm#646)
* Adding OpenMP and pthreads to rocBLAS cmake files
* added llvm to dockerfile
complex gemm implementation
add tuned logic files for resnet and inception sizes
* Remove \0s from xargs input, to avoid corrupted characters
a separate client package can be built optionally with make package_clients
Pinning Tensile version to allow for stable Jenkins runs
* Enable unit tests for gemv_batched and gemv_strided_batched
  - Add function templates for gemv_strided_batched and gemv_batched (in rocblas_gemv.hpp) to enable correct calls of these functions from other functions or from outside rocblas.
  - Add batch and stride checks and quick return in rocblas_gemv_batched.cpp and rocblas_gemv_strided_batched.cpp
  - Add unit tests testing_gemv_batched.hpp and testing_gemv_strided_batched.hpp
  - Add new class device_batch_vector in rocblas_vector.hpp. Needed for the batched case.
  - Add new template headers to rocblas.hpp
  - Add new template header and specializations for norm_check_general to work with the batched case (in norm.hpp and norm.cpp)
  - Add new template and specializations for unit_check_general to work with the batched case (in unit.hpp)
  - Add new arguments, stride_x and stride_y (needed to test gemv_strided_batched), in rocblas_arguments.hpp and rocblas_common.yaml. Set stride_x and stride_y defaults to zero in rocblas_common.yaml to correctly generate the tests of those functions that do not need these arguments
  - Include the new tests in client.cpp as well as a description of the new arguments
  - Add the new functions in rocblas_template.yaml to process YAML from log files
  - Add batched and strided_batched template test cases in gemv_gtest.cpp
  - Add new yaml test-data files gemv_batched_gtest.yaml and gemv_strided_batched_gtest.yaml
  - Include the new yaml files in rocblas_gtest.yaml
  - Add the new yaml files to the list of dependencies for rocblas_gtest.data in CMakeLists.txt
* Clang formatting
* Resolve merge conflicts
* clang formatting
* Correct bugs in gemv complex
* Updating to new Tensile cmake
* Updating to latest Tensile tag
arcturus sync1 with vega20 commit 8b8defc
* Addition of rocblas_half and rocblas_bfloat16 precisions for dot.
Switching to use BLIS as the CPU reference library and reduce test duration significantly
Added rot, rotg, rotm, rotmg and test code, real and complex.
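As background for the rot and rotg routines named in this commit, the real-valued Givens rotation can be sketched as below. This is a simplified illustration only: the names `Givens`, `make_givens`, and `apply_rot` are hypothetical, and the sketch does not reproduce the sign conventions or the reconstruction parameter `z` that BLAS rotg defines, nor the modified rotations of rotm and rotmg.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper type: rotation coefficients and the resulting norm.
struct Givens
{
    double c; // cosine
    double s; // sine
    double r; // sqrt(a*a + b*b)
};

// Compute c, s, r such that
//   [ c  s ] [a]   [r]
//   [-s  c ] [b] = [0]
// Simplified: r is always non-negative here, unlike BLAS rotg.
inline Givens make_givens(double a, double b)
{
    const double r = std::hypot(a, b);
    if(r == 0.0)
        return {1.0, 0.0, 0.0}; // identity rotation for the zero vector
    return {a / r, b / r, r};
}

// Apply the plane rotation to paired elements of x and y (unit increments):
//   x_i :=  c*x_i + s*y_i
//   y_i := -s*x_i + c*y_i
inline void apply_rot(std::vector<double>& x, std::vector<double>& y, double c, double s)
{
    assert(x.size() == y.size());
    for(std::size_t i = 0; i < x.size(); ++i)
    {
        const double xi = x[i];
        const double yi = y[i];
        x[i] = c * xi + s * yi;
        y[i] = c * yi - s * xi;
    }
}
```

Applying the rotation built from `(a, b)` to the pair itself moves the whole magnitude into the first component and zeroes the second, which is the property the tests for these routines would typically check.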
…s_abs() function (ROCm#678)
* Replace explicit conversion of bfloat16 to float and double with implicit conversion of bfloat16 to float
* Fix std::abs for rocblas_bfloat16
* Use rocblas_abs instead of std::abs where both __device__ and __host__ are required
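The idea behind this change, a 16-bit type that converts implicitly to float plus a dedicated absolute-value helper, can be illustrated with a host-only sketch. The type `bf16_sketch` and the function `abs_sketch` are hypothetical stand-ins and are not the rocblas_bfloat16 or rocblas_abs definitions; the float-to-bfloat16 conversion here truncates, which is an assumption chosen for brevity rather than the rounding mode rocBLAS uses.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a bfloat16 type: the upper 16 bits of an
// IEEE-754 binary32 value (1 sign bit, 8 exponent bits, 7 mantissa bits).
struct bf16_sketch
{
    std::uint16_t bits;

    // Assumption: conversion by truncation (drops the low 16 mantissa bits).
    static bf16_sketch from_float(float f)
    {
        std::uint32_t u;
        std::memcpy(&u, &f, sizeof u);
        return {static_cast<std::uint16_t>(u >> 16)};
    }

    // Implicit widening to float, so arithmetic needs no explicit casts.
    operator float() const
    {
        const std::uint32_t u = static_cast<std::uint32_t>(bits) << 16;
        float               f;
        std::memcpy(&f, &u, sizeof f);
        return f;
    }
};

// Absolute value that stays in the 16-bit type: clear the sign bit.
inline bf16_sketch abs_sketch(bf16_sketch v)
{
    return {static_cast<std::uint16_t>(v.bits & 0x7fffu)};
}
```

A helper of this kind avoids relying on `std::abs` overload resolution for a user-defined type, which is the sort of problem the commit describes for code compiled for both host and device.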
* Updating Tensile Tag to 714d394f34e9b4aedda10714e63a68a49db6b876 (Sep 3, 2019)
update logic file for new sizes
* Adding flag to install script to choose between blis and lapack for cpu reference test code
Thrown errors were lost due to bug in check_exit_code.
Add cgemm asm lite logic yaml
Adding ger_batch and ger_strided_batch w/ testing
update logic using flex batch tuning (everything passes locally)
amdkila approved these changes Sep 11, 2019
mlse-lib-jenkins pushed a commit that referenced this pull request May 21, 2021
* Removing g_ prefix.
* Adding w_ prefix.
No description provided.