AI Glossary
The complete dictionary of Artificial Intelligence
162 categories · 2,032 subcategories · 23,060 terms

Terms
Concatenation and Linear Projection
Final step of multi-head attention where the outputs of all heads are concatenated then linearly projected to restore the model dimension, thus merging information from different subspaces.
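A minimal NumPy sketch of this final step (all shapes and names here are illustrative, not tied to any particular library): the per-head outputs are concatenated along the feature axis and multiplied by an output projection matrix, conventionally written W_O, to restore the model dimension.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_heads, seq_len = 8, 2, 4
d_k = d_model // n_heads  # per-head dimension

# Hypothetical outputs of each attention head: (n_heads, seq_len, d_k)
head_outputs = rng.standard_normal((n_heads, seq_len, d_k))

# Concatenate heads along the feature axis: (seq_len, n_heads * d_k)
concat = np.concatenate([head_outputs[h] for h in range(n_heads)], axis=-1)

# Output projection W_O mixes information across the heads' subspaces
# and maps back to the model dimension: (seq_len, d_model)
W_O = rng.standard_normal((n_heads * d_k, d_model))
output = concat @ W_O
assert output.shape == (seq_len, d_model)
```

Note that without W_O the heads would remain independent blocks; the projection is what actually merges the subspaces.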
Causal Attention (Masked Self-Attention)
Self-attention variant used in decoders in which a mask prevents each token from attending to future positions, preserving the model's auto-regressive property.
Head Dimension (d_k)
Dimension of the query and key vectors in each attention head, typically obtained by dividing the model dimension by the number of heads; it determines the representational capacity of each head.
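The arithmetic is simple enough to show directly, using the values from the original Transformer configuration (d_model = 512, 8 heads):

```python
d_model = 512             # model (embedding) dimension
n_heads = 8               # number of attention heads
d_k = d_model // n_heads  # per-head dimension: 512 // 8 = 64

# Concatenating all heads restores the model dimension,
# so the split costs nothing in total width:
assert n_heads * d_k == d_model
```

Splitting the width across heads keeps the total parameter count comparable to single-head attention while letting each head specialize in a lower-dimensional subspace.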
Linearized Attention
Family of attention mechanisms that rewrite the attention calculation to avoid materializing the full attention matrix, allowing linear complexity relative to the sequence length.
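One common member of this family replaces the softmax with a non-negative feature map phi, so the attention output can be computed as phi(Q)·(phi(K)ᵀV) without ever forming the T×T matrix. The sketch below (illustrative; the ReLU-plus-epsilon feature map is just one simple choice) is algebraically identical to the quadratic kernel formulation but costs O(T·d²) instead of O(T²·d):

```python
import numpy as np

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0.0) + 1e-6):
    """Kernelized attention: the (d x d) summary phi(K)^T V is computed
    once, so cost grows linearly with sequence length T."""
    Qp, Kp = phi(Q), phi(K)
    KV = Kp.T @ V                         # (d, d): key/value summary
    Z = Kp.sum(axis=0)                    # (d,):  normalizer accumulator
    return (Qp @ KV) / (Qp @ Z)[:, None]  # (T, d) output

rng = np.random.default_rng(2)
T, d = 6, 4
Q, K, V = (rng.standard_normal((T, d)) for _ in range(3))
out = linear_attention(Q, K, V)
assert out.shape == (T, d)
```

Because KV and Z can be accumulated incrementally, this formulation also admits a recurrent, constant-memory mode for auto-regressive decoding.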