What Are Transformer AI/ML Models and How Do They Work?
Great blog post on how transformers work. A transformer is built from several blocks, each with its own function, that work together to understand text and generate the next word. These blocks are the following:
Tokenizer : Turns words into tokens.
Embedding : Turns tokens into numbers (vectors).
Positional encoding : Adds order to the words in the text.
Transformer block : Guesses the next word. It is formed by an attention block and a feedforward block.
Attention : Adds context to the text.
Feedforward : A neural network inside the transformer block that helps guess the next word.
Softmax : Turns the scores into probabilities in order to sample the next word.
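The pipeline above can be sketched end to end in a few lines of numpy. This is a minimal, illustrative toy with random (untrained) weights, a made-up five-word vocabulary, and a single attention head; all names, sizes, and weight matrices here are assumptions for demonstration, not the blog's actual model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tokenizer: turn words into tokens (integer ids) via a toy vocabulary.
vocab = ["the", "cat", "sat", "on", "mat"]
token_ids = {w: i for i, w in enumerate(vocab)}

d_model = 8  # embedding size (illustrative)
seq = ["the", "cat", "sat"]
ids = np.array([token_ids[w] for w in seq])

# Embedding: turn token ids into vectors.
embedding = rng.normal(size=(len(vocab), d_model))
x = embedding[ids]  # shape (seq_len, d_model)

# Positional encoding: add sinusoids so word order is represented.
pos = np.arange(len(seq))[:, None]
i = np.arange(d_model)[None, :]
angle = pos / (10000 ** (2 * (i // 2) / d_model))
x = x + np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

# Attention: each position mixes in context from earlier positions.
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
q, k, v = x @ Wq, x @ Wk, x @ Wv
scores = q @ k.T / np.sqrt(d_model)
# Causal mask: a word may only attend to itself and earlier words.
mask = np.triu(np.ones_like(scores), k=1).astype(bool)
scores[mask] = -np.inf
attn_out = softmax(scores) @ v

# Feedforward: a small per-position neural network.
W1 = rng.normal(size=(d_model, 4 * d_model))
W2 = rng.normal(size=(4 * d_model, d_model))
h = np.maximum(attn_out @ W1, 0) @ W2  # ReLU in between

# Softmax: turn scores over the vocabulary into probabilities,
# from which the next word can be sampled or picked greedily.
logits = h @ embedding.T  # shape (seq_len, vocab_size)
probs = softmax(logits)
next_word = vocab[int(np.argmax(probs[-1]))]
```

With trained weights, `probs[-1]` would put high probability on a plausible continuation; here the weights are random, so the output word is meaningless, but every shape and step matches the block structure listed above.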
And, finally, post-training, which refines the model after its initial training.