Use vector load for HIP FP16 in Vec4T
Summary:
Before this diff, the HIP path issued four sequential scalar loads to read the half-precision (FP16) input in TBE's Vec4T. This diff replaces them with a single vector load of four halves.

Reviewed By: jspark1105

Differential Revision: D39267283

fbshipit-source-id: 089451de9b79a0219ae5aef9b41bbfcb292f8ce2
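For illustration only, the difference between the two load patterns can be sketched in plain host-side C++. This is not the code from this diff: `Half4Bits`, `load_scalar`, and `load_vector` are hypothetical names, and the FP16 values are represented by their raw 16-bit patterns rather than a real half type. A device implementation would presumably use a vector type and would generally need the source address to be 8-byte aligned.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for four FP16 values, kept as raw 16-bit patterns.
struct Half4Bits {
  uint16_t v[4];
};

// "Before" pattern: four separate 16-bit loads, one per element.
inline Half4Bits load_scalar(const uint16_t* p) {
  Half4Bits out;
  out.v[0] = p[0];
  out.v[1] = p[1];
  out.v[2] = p[2];
  out.v[3] = p[3];
  return out;
}

// "After" pattern: one 64-bit load that covers all four elements.
// memcpy expresses the wide load portably and without alignment assumptions;
// compilers typically lower it to a single 8-byte load.
inline Half4Bits load_vector(const uint16_t* p) {
  uint64_t packed;
  std::memcpy(&packed, p, sizeof(packed));
  Half4Bits out;
  std::memcpy(out.v, &packed, sizeof(packed));
  return out;
}
```

Both functions return the same four values; only the number of memory transactions differs, which is the point of the change described in the summary.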