Enable same padding for non-register sharing warp specialization #4395
This PR enforces the same padding rules for non-register sharing warp specialization.

* Replaced `std::unordered_set<ParallelType> warp_specialized_types_` with `std::optional<ParallelType> warp_specialized_parallel_type_` because only a single ParallelType is supported.
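To illustrate the container change, here is a minimal sketch of how a `std::optional<ParallelType>` member can encode the "at most one warp-specialized ParallelType" restriction. Only the member name `warp_specialized_parallel_type_` comes from this PR; the `ParallelType` enum values, the `WarpSpecializationInfo` class, and its methods are hypothetical stand-ins, not the actual code in the repository.

```cpp
#include <optional>
#include <stdexcept>

// Hypothetical stand-in for the project's ParallelType enum.
enum class ParallelType { TIDx, TIDy, TIDz };

// Hypothetical holder class; only the member name below is taken from the PR.
class WarpSpecializationInfo {
 public:
  // Record the warp-specialized type. A second, different type is rejected,
  // which is the invariant the old std::unordered_set could not express.
  void setWarpSpecializedType(ParallelType pt) {
    if (warp_specialized_parallel_type_.has_value() &&
        warp_specialized_parallel_type_.value() != pt) {
      throw std::logic_error(
          "only a single warp-specialized ParallelType is supported");
    }
    warp_specialized_parallel_type_ = pt;
  }

  // True if any ParallelType has been marked as warp specialized.
  bool hasWarpSpecialization() const {
    return warp_specialized_parallel_type_.has_value();
  }

  // True only if pt is the recorded warp-specialized type.
  // Comparing an empty optional against a value yields false.
  bool isWarpSpecialized(ParallelType pt) const {
    return warp_specialized_parallel_type_ == pt;
  }

 private:
  std::optional<ParallelType> warp_specialized_parallel_type_;
};
```

With a set, callers had to handle zero, one, or many entries; with an optional, the "many" case cannot be represented, so lookups reduce to a single comparison.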
A new restriction on warp specialization was added in #4395, so `ThunderRMSNormBwd` needs to be skipped temporarily.