*After all this time, nobody has explained transformers better than this 👇
Transformers predict the next token…and a token can be a piece of almost anything. Text. Images. Audio. Code.
The secret? Self-attention.
Every token looks at every other token and asks:
“how relevant are you to me right now?”
“The cat sat on the mat because it was tired.”
What does “it” refer to? The cat. Not the mat.
Attention fi…*
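
Here is a rough sketch of that "how relevant are you to me right now?" question in code, using scaled dot-product attention. It is a toy under stated assumptions, not how any particular model is implemented: the 2-d vectors for "cat", "mat", and "it" are made up for illustration, and the same vectors are reused as queries, keys, and values, whereas a real transformer would derive the three from learned projections.

```python
import math


def softmax(scores):
    # Subtract the max before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def self_attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # One relevance score per token: "how relevant are you to me?"
        scores = [dot(q, k) / math.sqrt(d) for k in keys]
        # Turn scores into non-negative weights that sum to 1.
        weights = softmax(scores)
        # The output for this token is a weighted average of the values.
        out = [
            sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical embeddings, chosen by hand so that "it" sits close to "cat".
tokens = ["cat", "mat", "it"]
vectors = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]

# Assumption: reuse the raw vectors as queries, keys, and values.
outputs, weights = self_attention(vectors, vectors, vectors)

for name, w in zip(tokens, weights[2]):
    print(f"'it' attends to '{name}' with weight {w:.3f}")
```

With these hand-picked vectors, the row for "it" puts more weight on "cat" than on "mat". In a trained model nobody picks the vectors; they are learned so that the useful tokens end up scoring high.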