@breuderink ohh this is actually a trick from a Shazeer paper https://arxiv.org/pdf/2002.05202.pdf that should give an extra performance boost, but i should probably make it optional to stay faithful to the original paper
Reading the code I found the following implementation for the feed-forward MLP of the Perceiver IO:
I could not find any reference to a gated GELU in the PerceiverIO paper or in the code.
Is there a particular reason to use GEGLU instead of GELU?
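For readers following the thread, here is a minimal sketch of how GEGLU differs from a plain GELU activation. It is not the code from this repository. It assumes the gating described in the Shazeer paper linked in this thread: the output of the first linear layer is split into two halves, and one half is multiplied by the GELU of the other. The function names `gelu` and `geglu`, the use of plain Python lists, and the choice of which half acts as the gate are all illustrative assumptions.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def geglu(projected: list[float]) -> list[float]:
    """Gate the first half of `projected` with GELU of the second half.

    Assumption: `projected` is the output of the first linear layer,
    which must be twice as wide as the hidden size, because the
    activation halves the width.
    """
    if len(projected) % 2 != 0:
        raise ValueError("projected vector must have even length")
    half = len(projected) // 2
    values, gates = projected[:half], projected[half:]
    # Element-wise product of the value half and the activated gate half.
    return [v * gelu(g) for v, g in zip(values, gates)]


# Plain GELU keeps the width; GEGLU halves it.
hidden = [0.5, -1.0, 2.0, 0.0]
plain = [gelu(h) for h in hidden]   # 4 values
gated = geglu(hidden)               # 2 values
```

One practical consequence: to keep the same hidden width after the activation, the first linear layer has to produce twice as many features when GEGLU is used, which adds parameters compared with a plain GELU feed-forward block.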