*After all this time, nobody has explained transformers better than this 👇

Transformers predict the next token… but for everything. Text. Images. Audio. Code.

The secret? Self-attention.

Every token looks at every other token and asks:

“how relevant are you to me right now?”

“The cat sat on the mat because it was tired.”

What does “it” refer to? The cat. Not the mat.

Attention fi…*
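
To make the "how relevant are you to me right now?" question concrete, here is a minimal sketch of scaled dot-product self-attention in plain Python. It is an illustration under stated assumptions, not how any particular model is built: the 2-d `embeddings` are hypothetical values hand-picked so that "it" points in roughly the same direction as "cat", and the learned query/key/value projections a real transformer would apply are left out (queries, keys, and values are all just the raw vectors).

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention with Q = K = V = vectors.

    Assumption: learned projection matrices are omitted to keep
    the sketch short.
    """
    dim = len(vectors[0])
    weights = []
    outputs = []
    for query in vectors:
        # "How relevant are you to me right now?" -> one score per token
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        row = softmax(scores)
        weights.append(row)
        # Each output is a relevance-weighted mix of all token vectors
        outputs.append(
            [sum(w * v[i] for w, v in zip(row, vectors)) for i in range(dim)]
        )
    return weights, outputs


# Hypothetical embeddings, made up for illustration only
tokens = ["cat", "mat", "it"]
embeddings = [[2.0, 0.0], [0.0, 2.0], [1.5, 0.2]]

weights, outputs = self_attention(embeddings)
it_row = dict(zip(tokens, weights[2]))  # how much "it" attends to each token
print(it_row)
```

With these made-up vectors, the weight "it" places on "cat" comes out larger than the weight it places on "mat". In a trained model the embeddings and projections are learned from data, so the pattern emerges from training instead of being set by hand.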
