AI Glossary
The complete dictionary of Artificial Intelligence
162 categories, 2,032 subcategories, 23,060 terms
Causal Masking
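Technique used in decoder self-attention that masks all future positions, so that the prediction for position i depends only on positions 1 to i. It is typically applied by setting the attention scores of future positions to negative infinity before the softmax, which preserves the autoregressive nature of generation.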
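As a rough illustration of the masking described above, the following NumPy sketch builds a lower-triangular mask and applies it to a matrix of raw attention scores before the softmax. The function names `causal_mask` and `masked_attention_weights` are hypothetical, and the sketch assumes a single attention head with a square score matrix (queries and keys covering the same positions).

```python
import numpy as np

def causal_mask(n):
    # Boolean (n, n) matrix: True where key position j <= query position i,
    # i.e. where attention is allowed.
    return np.tril(np.ones((n, n), dtype=bool))

def masked_attention_weights(scores):
    # scores: (n, n) raw attention scores, rows = queries, columns = keys.
    n = scores.shape[-1]
    # Future positions get -inf so they receive zero weight after the softmax.
    masked = np.where(causal_mask(n), scores, -np.inf)
    # Subtract the row maximum for numerical stability; each row keeps at
    # least its diagonal entry, so the maximum is always finite.
    masked = masked - masked.max(axis=-1, keepdims=True)
    weights = np.exp(masked)
    return weights / weights.sum(axis=-1, keepdims=True)
```

With uniform scores, each row spreads its weight evenly over the current and earlier positions only, and every entry above the diagonal is exactly zero.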
Output Projection
Final linear layer that maps each decoder representation to a vector of logits with one entry per vocabulary token. A softmax then turns these logits into a probability distribution over possible tokens at each output position.
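The projection and softmax described above can be sketched in NumPy as follows. This is a minimal example under stated assumptions: the function name `output_projection`, the weight matrix `W` of shape (d_model, vocab_size), and the bias `b` are illustrative choices, not taken from any specific model or library.

```python
import numpy as np

def output_projection(hidden, W, b):
    # hidden: (seq_len, d_model) decoder representations
    # W: (d_model, vocab_size) projection weights (hypothetical)
    # b: (vocab_size,) bias (hypothetical; some models omit it)
    logits = hidden @ W + b
    # Softmax over the vocabulary axis, shifted by the row maximum
    # for numerical stability.
    logits = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(logits)
    return probs / probs.sum(axis=-1, keepdims=True)
```

Each row of the result is a probability distribution over the vocabulary for one output position; taking the argmax of a row would correspond to greedy decoding at that position.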