Skip to content

Conversation

@eqy
Copy link
Collaborator

@eqy eqy commented Feb 4, 2022

Followup to #1274, which left out (correct) documentation

Copy link
Collaborator

@crcrpar crcrpar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The generated docs look nice

image

@crcrpar crcrpar merged commit a786ca0 into NVIDIA:master Feb 7, 2022
.. math::
y = \frac{x - \mathrm{E}[x]}{ \sqrt{\mathrm{Var}[x] + \epsilon}} * \gamma + \beta
y = \frac{x}{\mathrm{RMS}[x]} * \gamma
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's missing epsilon and I checked it's in the code.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was left out as the syntax looked kind of involved... feel free to suggest a fix!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@crcrpar crcrpar added this to the 22.03 milestone Feb 22, 2022
hubertlu-tw pushed a commit to ROCm/apex that referenced this pull request Apr 15, 2022
jithunnair-amd pushed a commit to ROCm/apex that referenced this pull request Aug 5, 2022
* FusedRMSNorm/"T5LayerNorm" based on FusedLayerNorm (NVIDIA#1274)

* FusedRMSNorm based on FusedLayerNorm

* refactor duplicated kernels

* delete comments

* delete comments

* cleanup

* cleanup

* cleanup, fixed clobbering forward_affine_mixed_dtypes

* fix pybind naming and add MixedFused test

* undo skipping

* check elementwise_affine

* Update tests/L0/run_fused_layer_norm/test_fused_layer_norm.py

Oof, nice catch, thanks

Co-authored-by: Masaki Kozuki <masaki.kozuki.2014@gmail.com>

Co-authored-by: Masaki Kozuki <masaki.kozuki.2014@gmail.com>

* fix and generate docs for FusedRMSNorm (NVIDIA#1285)

* [FusedRMSNorm doc] document where epsilon is added (NVIDIA#1295)

* [FusedRMSNorm doc] add epsilon to formula

* correct

* better wording

* Fix some bugs

* Optimize HostRMSNormGradient and HostApplyRMSNorm for AMD GPUs

* Fix NaN issues in FusedRMSNorm

* Update test_fused_layer_norm.py

* Skip test_fused_layer_norm.TestAutocastFusedRMSNorm on ROCm

* Use at::cuda::warp_size() instead of at::cuda::getCurrentDeviceProperties()->warpSize

Co-authored-by: eqy <eddiey@nvidia.com>
Co-authored-by: Masaki Kozuki <masaki.kozuki.2014@gmail.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants