Implement standard attention and self-attention module #52

Closed
albertz opened this issue Nov 3, 2021 · 4 comments

Comments


albertz commented Nov 3, 2021

Via cum_concat etc.
rwth-i6/returnn#391
rwth-i6/returnn#656
rwth-i6/returnn#589
rwth-i6/returnn#590
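
A minimal sketch of what auto-regressive self-attention via cumulative concatenation could look like, in plain Python rather than RETURNN layers. The function names are hypothetical and not the API of this repository. The assumption is that cum_concat means accumulating keys and values step by step, so that each query only attends to frames up to its own position.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot_attention(query, keys, values):
    """Scaled dot-product attention for a single query vector.

    query: list of d floats
    keys: list of n key vectors, each of dim d
    values: list of n value vectors, each of dim d_v
    """
    scale = 1.0 / math.sqrt(len(query))
    energies = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(energies)
    dim_v = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim_v)]


def causal_self_attention(queries, keys, values):
    """Hypothetical helper: self-attention where step t sees only steps <= t.

    Keys and values are accumulated (cumulatively concatenated) one step
    at a time, which mimics what happens inside a recurrent loop.
    """
    acc_keys, acc_values, outputs = [], [], []
    for q, k, v in zip(queries, keys, values):
        acc_keys.append(k)
        acc_values.append(v)
        outputs.append(dot_attention(q, acc_keys, acc_values))
    return outputs
```

Standard (non-causal) attention is the same `dot_attention` call with the full key and value sequences passed in at once, for example from an encoder.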


albertz commented Nov 5, 2021

For this, I think we really should clarify the dim tags first (#17).

albertz added a commit that referenced this issue Nov 6, 2021
#52

This is still work in progress:

- The test is not really testing anything.
- Dim tags are not used consistently yet (#17).
- Normal attention is missing.
- Auto-regressive self-attention is missing.
albertz added a commit that referenced this issue Nov 6, 2021

albertz commented Nov 6, 2021

I implemented some first variants. They are not really tested yet, but we probably want to test them together with #55 or #58.

If anything is broken or incomplete, or needs to be adapted for #17, I think we should just open new issues.

albertz closed this as completed Nov 6, 2021

albertz commented Nov 7, 2021

Just for reference, this lacks positional encoding so far.

Some relevant discussion:
pytorch/pytorch#24826
allenai/allennlp#3398
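
For orientation, here is a rough sketch of one common choice, the fixed sinusoidal encoding from the original Transformer paper. The thread does not say which scheme is intended here, so this is only an illustration; the function name and the `base` parameter are hypothetical.

```python
import math


def sinusoidal_positional_encoding(num_positions, dim, base=10000.0):
    """Return a num_positions x dim table of sinusoidal position encodings.

    Even feature indices hold sin(pos / base^(i / dim)), the following odd
    index holds the cosine of the same angle.
    """
    assert dim % 2 == 0, "this sketch assumes an even feature dimension"
    table = []
    for pos in range(num_positions):
        row = []
        for i in range(0, dim, 2):
            angle = pos / (base ** (i / dim))
            row.append(math.sin(angle))
            row.append(math.cos(angle))
        table.append(row)
    return table
```

The resulting table would typically be added to the input embeddings before the first attention layer, since attention itself is invariant to the order of its keys and values.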


albertz commented Nov 7, 2021

See also: PyTorch MultiheadAttention, via pytorch/pytorch#18334
