cast local_scale_tensor to fp32 for precompute of fp8 dynamic scaling #713
Merged
msaroufim merged 4 commits into pytorch:main from crcrpar:cast_precompute_scale_to_fp32 on Aug 22, 2024
+8 -4