
Additional positional Embeddings #296

Open · bonham79 opened this issue Dec 23, 2024 · 3 comments

@bonham79 (Collaborator)

@Adamits I'll implement this, but I want to check with you whether it's worthwhile, since my domain is speech:

What are your thoughts on adding new positional embeddings to the transformer models (particularly RoPE)? IIRC we're using the standard sinusoidal (sine/cosine) ones, but they're a bit old-fashioned nowadays. Do you know of arguments for or against?
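
For concreteness, this is the scheme I mean. Just a sketch of the standard sinusoidal encoding from Vaswani et al. (2017), not necessarily our exact code (the function name is a placeholder):

```python
import math
import torch

def sinusoidal_encoding(seq_len: int, dim: int, base: float = 10000.0) -> torch.Tensor:
    """Fixed sinusoidal positional encoding, shape (seq_len, dim); dim assumed even."""
    position = torch.arange(seq_len, dtype=torch.float).unsqueeze(1)  # (seq_len, 1)
    # Per-channel frequencies: base^(-2i/dim) for i = 0 .. dim/2 - 1.
    div_term = torch.exp(
        torch.arange(0, dim, 2, dtype=torch.float) * (-math.log(base) / dim)
    )
    pe = torch.zeros(seq_len, dim)
    pe[:, 0::2] = torch.sin(position * div_term)  # even channels: sine
    pe[:, 1::2] = torch.cos(position * div_term)  # odd channels: cosine
    # Added to the token embeddings before the first encoder/decoder layer.
    return pe
```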

@kylebgorman (Contributor)

+1. I was wondering about this too: do our characteristic problems have enough data for learned positional embeddings?

@Adamits (Collaborator) commented Dec 23, 2024

In general, I think the method we use is not considered the best one, but I have no idea to what extent it matters in our small-vocabulary character domains. It's actually a sort of interesting problem, since I would think that in problems with monotonic-ish alignments, position representations would be very important.

I definitely think we should implement alternatives. Funnily enough, a friend who works in this space and has used Yoyodyne for baselines texted me yesterday asking if we had RoPE-style embeddings, so it sounds like a great feature.

@bonham79 (Collaborator, Author)

OK, I'll implement them and we can see how they work off a fork. From some posts it looks like RoPE is ~10 LOC, so it's not the craziest implementation. If we see overfitting, we can try applying some context-window limiting or the like.

Main things I'm thinking of:

  • RoPE (see the sketch after this list)
  • Absolute embeddings
  • No positional embedding (NoPE) (some people have argued that PEs aren't really that important, so it seems like a worthwhile option)
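
For reference, here's roughly what the RoPE sketch would look like. This is just an illustration (function name and signature are placeholders, and it uses the "half-split" pairing of channels); in practice the rotation is applied per head to queries and keys inside attention, not added to the input embeddings:

```python
import torch

def apply_rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Rotary positional embedding (Su et al., 2021) on x of shape (batch, seq_len, dim).

    dim is assumed even; channel pairs (x[i], x[i + dim/2]) are rotated by a
    position-dependent angle.
    """
    _, seq_len, dim = x.shape
    half = dim // 2
    # Per-pair rotation frequencies: base^(-2i/dim) for i = 0 .. dim/2 - 1.
    inv_freq = 1.0 / (
        base ** (torch.arange(0, half, dtype=x.dtype, device=x.device) / half)
    )
    # Angle for each (position, frequency) pair: (seq_len, half).
    angles = torch.arange(seq_len, dtype=x.dtype, device=x.device).unsqueeze(1) * inv_freq
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., :half], x[..., half:]
    # 2-D rotation of each channel pair.
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)
```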

bonham79 self-assigned this Dec 23, 2024