@Adamits I'll implement this, but I want to check with you whether it's worthwhile, since my domain is speech:
What are your thoughts on adding new positional embeddings to the Transformer models (particularly RoPE)? IIRC we're using the standard sinusoidal (cosine) ones, but they're a bit old-fashioned nowadays. Do you know of any arguments for or against?
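For context, the "standard cosine ones" are the fixed sin/cos table from the original Transformer paper; a minimal sketch (not the actual Yoyodyne code) is something like:

```python
import math

import torch


def sinusoidal_embedding(max_len: int, d_model: int) -> torch.Tensor:
    """Builds the fixed sin/cos positional table from "Attention Is All You Need"."""
    position = torch.arange(max_len, dtype=torch.float).unsqueeze(1)  # (max_len, 1)
    div_term = torch.exp(
        torch.arange(0, d_model, 2, dtype=torch.float) * (-math.log(10000.0) / d_model)
    )
    pe = torch.zeros(max_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)  # even dims: sine
    pe[:, 1::2] = torch.cos(position * div_term)  # odd dims: cosine
    return pe  # added to the token embeddings, one row per position
```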
In general, I think the method we use is not considered the best one, but I have no idea how much it matters in our small-vocabulary character domains. It's actually an interesting problem, since I would think that in tasks with monotonic-ish alignments, position representations would be very important.
I definitely think we should implement alternatives. Funnily enough, a friend who works in this space and has used Yoyodyne for baselines texted me yesterday asking if we had RoPE-style embeddings, so it sounds like a great feature.
Kk, I'll implement them and we can see how they work off a fork. From some posts it looks like ~10 LOC, so not the craziest implementation (rough sketch after the list below). If we're seeing overfitting, we can try applying some context-window limiting or the like.
Main things I'm thinking:
- RoPE
- Absolute embeddings
- No embeddings (NoPE; some people have argued that PEs aren't really that important, so it seems like a worthwhile option)
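To give a sense of what the "~10 LOC" version looks like, here's a rough RoPE sketch in PyTorch (a hypothetical helper, not anything already in Yoyodyne): it rotates each (even, odd) pair of query/key dimensions by a position-dependent angle, so relative offsets fall out of the attention dot product.

```python
import torch


def rotary_embedding(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Applies RoPE to query or key vectors of shape (batch, seq_len, heads, head_dim).

    head_dim must be even; returns a tensor of the same shape.
    """
    _, seq_len, _, head_dim = x.shape
    # One frequency per 2D pair of feature dimensions.
    inv_freq = 1.0 / (
        base ** (torch.arange(0, head_dim, 2, dtype=x.dtype, device=x.device) / head_dim)
    )
    # Rotation angle for each (position, frequency) pair: (seq_len, head_dim // 2).
    angles = torch.arange(seq_len, dtype=x.dtype, device=x.device)[:, None] * inv_freq[None, :]
    cos = angles.cos()[None, :, None, :]  # broadcast over batch and heads
    sin = angles.sin()[None, :, None, :]
    x_even, x_odd = x[..., 0::2], x[..., 1::2]
    # Rotate each (even, odd) pair by its position-dependent angle.
    rotated = torch.stack(
        (x_even * cos - x_odd * sin, x_even * sin + x_odd * cos), dim=-1
    )
    return rotated.flatten(-2)
```

In this scheme the rotation would be applied to the query and key projections inside each attention layer (not to the values), replacing the additive embedding at the model input.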