In the EncoderLayer of section 2.1.5, the add & norm step appears to normalize the original input first, feed the normalized tensor into the MHA/FFN sublayer, and then add the result back to the original input. Isn't that the pre-norm arrangement? The original Transformer uses post-norm, doesn't it?
I had the same question when reading this part. Personally I would implement it as: x = norm(x + attention(x, x, x, mask)); x = norm(x + ffn(x))
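A minimal sketch of that post-norm ordering (sublayer first, then residual add, then LayerNorm, as in the original paper). MultiHeadAttention and PositionwiseFeedForward here are assumed placeholder modules, not necessarily the exact classes from this repo:

```python
import torch.nn as nn

class PostNormEncoderLayer(nn.Module):
    """Post-norm encoder layer: x = LayerNorm(x + Sublayer(x))."""

    def __init__(self, d_model, n_head, d_ff, dropout=0.1):
        super().__init__()
        # Placeholder sublayer modules (assumed to exist in the codebase).
        self.attention = MultiHeadAttention(d_model, n_head)
        self.ffn = PositionwiseFeedForward(d_model, d_ff)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.dropout1 = nn.Dropout(dropout)
        self.dropout2 = nn.Dropout(dropout)

    def forward(self, x, mask=None):
        # Self-attention sublayer, then residual add, then normalize.
        x = self.norm1(x + self.dropout1(self.attention(x, x, x, mask)))
        # Feed-forward sublayer, then residual add, then normalize.
        x = self.norm2(x + self.dropout2(self.ffn(x)))
        return x
```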
Yes, I find the following implementation to be correct:
https://github.com/hyunwoongko/transformer/blob/master/models/blocks/encoder_layer.py