Implementation of causal self-attention that closely follows 3Blue1Brown's "Attention in transformers, visually explained | Chapter 6, Deep Learning" video.
`attention.ipynb` contains the code sections that closely correspond to what 3B1B talks about in the video, in chronological order (implements causal self-attention).
`attention.py` has the code bundled into a nice class ready for use (does not implement causal attention).
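For reference, a class with this interface would typically look something like the sketch below. This assumes PyTorch and the scaled dot-product self-attention presented in the video; the actual contents of `attention.py` may differ in detail.

```python
import torch
import torch.nn as nn


class SelfAttention(nn.Module):
    """Minimal sketch of single-head scaled dot-product self-attention."""

    def __init__(self, d_in, d_out_kq, d_out_v):
        super().__init__()
        self.W_query = nn.Parameter(torch.rand(d_in, d_out_kq))
        self.W_key = nn.Parameter(torch.rand(d_in, d_out_kq))
        self.W_value = nn.Parameter(torch.rand(d_in, d_out_v))

    def forward(self, x):
        # x: (number of tokens, d_in)
        queries = x @ self.W_query   # (tokens, d_out_kq)
        keys = x @ self.W_key        # (tokens, d_out_kq)
        values = x @ self.W_value    # (tokens, d_out_v)

        # Scaled dot-product scores, normalized with a softmax over each row
        scores = queries @ keys.T
        attn_weights = torch.softmax(scores / keys.shape[-1] ** 0.5, dim=-1)

        # Each output row is a weighted sum of the value vectors
        return attn_weights @ values
```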
Single-head usage:

```python
import torch
import attention

d_in, d_out_kq, d_out_v = 120, 12, 120
x = torch.randn(6, d_in)  # example input: one embedding vector per token

s = attention.SelfAttention(d_in, d_out_kq, d_out_v)
s(x)
```

Output shape: `(number of tokens, token embedding dimension)`
Multi-head usage:

```python
import torch
import attention

d_in, d_out_kq, d_out_v, num_heads = 120, 12, 120, 4
x = torch.randn(6, d_in)  # example input: one embedding vector per token

m = attention.MultiHeadAttention(d_in, d_out_kq, d_out_v, num_heads)
m(x)
```

Output shape: `(num_heads, number of tokens, token embedding dimension)`
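Given that output shape, the heads are presumably run independently and their outputs stacked along a new leading dimension rather than concatenated. A minimal sketch, reusing the `SelfAttention` sketch above and again assuming PyTorch:

```python
import torch
import torch.nn as nn


class MultiHeadAttention(nn.Module):
    """Minimal sketch: num_heads independent attention heads, outputs stacked."""

    def __init__(self, d_in, d_out_kq, d_out_v, num_heads):
        super().__init__()
        self.heads = nn.ModuleList(
            [SelfAttention(d_in, d_out_kq, d_out_v) for _ in range(num_heads)]
        )

    def forward(self, x):
        # Each head maps (tokens, d_in) -> (tokens, d_out_v); stacking gives
        # (num_heads, tokens, d_out_v).
        return torch.stack([head(x) for head in self.heads], dim=0)
```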
- Implement causal self-attention in `attention.py` (one possible approach is sketched below).
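A possible way to add the causal mask, following the video's approach of hiding future tokens before the softmax (a sketch under the same scaled dot-product assumptions as above; `causal_attention_weights` is a hypothetical helper, not an existing function in `attention.py`):

```python
import torch


def causal_attention_weights(queries, keys):
    # queries, keys: (number of tokens, d_out_kq)
    num_tokens = queries.shape[0]
    scores = queries @ keys.T / keys.shape[-1] ** 0.5

    # Mask out the strict upper triangle so each token only attends to
    # itself and earlier tokens
    mask = torch.triu(torch.ones(num_tokens, num_tokens, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(mask, float("-inf"))

    return torch.softmax(scores, dim=-1)  # rows sum to 1 over allowed positions
```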