[Feature request] Rotary positional embedding in cross attention #303

Closed · Aceticia opened this issue Dec 10, 2024 · 12 comments
@Aceticia (Contributor)

It's me again :)

It would be nice if cross-attention models could accept a context_pos kwarg, mirroring the behavior of pos when rotary_pos_emb=True. This doesn't make sense for encoder-decoder transformers in general, but for my MAE pretraining it does, because the encoder and decoder positions both refer to the same 1D space.

Considering that the current behavior in cross attention is to simply ignore positions and rotary positional embeddings, I propose keeping that behavior unless a context_pos is explicitly passed in, now that custom positions are supported. What do you think?
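Concretely, something like the sketch below is what I have in mind. The context_pos name and call signature here are just my proposal, not an existing API:

```python
import torch
from x_transformers import Encoder

# MAE-style setup: the decoder queries and the encoder outputs (context)
# live in the same 1D coordinate space, so both should be rotated by their true positions
layers = Encoder(
    dim = 512,
    depth = 2,
    heads = 8,
    cross_attend = True,     # cross attend to the encoder outputs
    rotary_pos_emb = True
)

x = torch.randn(1, 64, 512)          # embeddings of the masked tokens being decoded
context = torch.randn(1, 192, 512)   # encoder outputs for the visible tokens

pos = torch.randint(0, 256, (1, 64))           # positions of the decoded tokens
context_pos = torch.randint(0, 256, (1, 192))  # positions of the context tokens

# proposed: pos keeps rotating the self-attention queries/keys as it does today,
# while context_pos rotates the keys coming from the context
out = layers(x, context = context, pos = pos, context_pos = context_pos)
```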


lucidrains added a commit that referenced this issue Dec 10, 2024
@lucidrains (Owner)

@Aceticia hey Chris, yes indeed, I've actually come across this need a couple of times in the past, but wasn't sure whether the general public would need it.

Do you want to take a look at the latest commit and see if that works?

@Aceticia (Contributor, Author)

This looks good! Maybe the test could also cover passing a custom position for the self-attention part, just for completeness? That's the scenario I mostly use it in.

@lucidrains (Owner)

@Aceticia good idea, how about now?

@Aceticia (Contributor, Author)

Looks great! Speedy as always

@Aceticia (Contributor, Author)

Tested and it works perfectly. Thanks! Closing

@lucidrains (Owner)

wolfram

Need to do the Wolfram setup so I can send code to the cloud while walking the dog. Maybe in the near future with AR glasses + some gesture interface.

Aceticia reopened this Dec 10, 2024
@Aceticia (Contributor, Author)

@lucidrains I was just playing around with things and realized there is a new bug: if you turn on rotary positional embeddings and pass context_pos as input, the cross attender will fail, since there is no mem.

@lucidrains (Owner)

@Aceticia oh do you have the stack trace for that?

@Aceticia (Contributor, Author)

File "/gpfs/data/oermannlab/users/xl3942/.conda/envs/neuro/lib/python3.12/site-packages/x_transformers/x_transformers.py", line 1992, in forward
    maybe_mem = mems[0] # todo - handle edge case where different layers get different memory lengths. don't think this will ever come up but who knows
                ~~~~^^^
IndexError: list index out of range

Should be an easy fix?
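For reference, a minimal sketch of the kind of setup that hits this for me: a cross-attention-only stack, so there are no self-attention mems to index.

```python
import torch
from x_transformers import CrossAttender

# cross-attention-only stack: no self-attention layers, hence nothing to populate mems with
cross = CrossAttender(dim = 512, depth = 2, rotary_pos_emb = True)

x = torch.randn(1, 64, 512)
context = torch.randn(1, 192, 512)
context_pos = torch.randint(0, 256, (1, 192))

# with rotary turned on and context_pos given, the rotary position path reaches mems[0]
# and raises the IndexError above
out = cross(x, context = context, context_pos = context_pos)
```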

@lucidrains (Owner)

@Aceticia ohh, I think you have a network without any self-attention layers? Added a quick fix, as that should be valid anyway.
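(The gist is just to treat the no-self-attention case as valid rather than indexing into an empty mems list; the sketch below paraphrases that guard and is not the literal diff.)

```python
from typing import Optional
from torch import Tensor

def first_mem(mems: list[Tensor]) -> Optional[Tensor]:
    # a cross-attention-only stack has no self-attention layers, so `mems` may be empty;
    # returning None keeps that case valid instead of raising IndexError on mems[0]
    return mems[0] if len(mems) > 0 else None
```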

@Aceticia (Contributor, Author)

Yes, now it's good. Thanks!
