AI Glossary
A complete dictionary of Artificial Intelligence
Binary Mask
Matrix containing only 0 and 1 values where 1 indicates positions to keep and 0 those to mask, generally applied through element-wise multiplication before or after the attention softmax.
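A minimal sketch of this idea, assuming PyTorch (the entry names no particular library); the shapes and values are illustrative only:

```python
import torch

# Toy post-softmax attention weights for 4 query and 4 key positions.
attn = torch.softmax(torch.randn(4, 4), dim=-1)

# Binary mask: 1 = keep the interaction, 0 = mask it (here the last key is dropped).
mask = torch.tensor([[1, 1, 1, 0]] * 4, dtype=attn.dtype)

# Element-wise multiplication after the softmax zeroes out the masked positions.
masked_attn = attn * mask
```

Note that multiplying after the softmax leaves rows that no longer sum to 1; applying the mask before the softmax (see Softmax Mask below) avoids this.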
Triangular Causal Mask
Triangular matrix structure where elements above the diagonal are masked, so each position can attend only to itself and earlier positions, enforcing strict temporal (causal) dependency in transformer models for sequential tasks.
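A short PyTorch sketch (an assumed library choice) of building such a mask from an upper-triangular matrix:

```python
import torch

seq_len = 5
scores = torch.randn(seq_len, seq_len)  # raw query-key attention scores

# True strictly above the diagonal, i.e. at future (disallowed) key positions.
causal_mask = torch.triu(torch.ones(seq_len, seq_len), diagonal=1).bool()

# Future positions are pushed to -inf so the softmax gives them zero weight.
attn = torch.softmax(scores.masked_fill(causal_mask, float("-inf")), dim=-1)
```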
Variable Length Mask
Dynamic mask that adapts to variable sequence lengths in a batch, optimizing computation by ignoring irrelevant positions while preserving batch alignment.
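A hedged sketch of deriving such a mask from per-sequence lengths in a padded batch (PyTorch assumed; the lengths are made up):

```python
import torch

lengths = torch.tensor([6, 3, 4])        # true lengths of 3 padded sequences
max_len = int(lengths.max())

# Compare each position index to the sequence length: True = real token, False = padding.
positions = torch.arange(max_len).unsqueeze(0)   # shape (1, max_len)
valid = positions < lengths.unsqueeze(1)         # shape (batch, max_len)

# valid can now be broadcast into the attention computation, so padded positions
# are ignored while every sequence keeps the same max_len alignment in the batch.
```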
Key Padding Mask
Specific mask applied to keys in the attention mechanism to prevent padding tokens from influencing attention scores, typically added before the softmax operation.
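A minimal sketch, assuming PyTorch, of masking padded key columns additively before the softmax (torch.nn.MultiheadAttention exposes the same idea through its key_padding_mask argument):

```python
import torch

batch, q_len, k_len = 2, 4, 4
scores = torch.randn(batch, q_len, k_len)        # raw attention scores

# True where the key position is a padding token (second sequence has one padded key).
key_padding = torch.tensor([[False, False, False, False],
                            [False, False, False, True]])

# Broadcast over the query dimension; padded keys get -inf before the softmax,
# so no query can put attention mass on them.
scores = scores.masked_fill(key_padding.unsqueeze(1), float("-inf"))
attn = torch.softmax(scores, dim=-1)
```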
Query Mask
Mask applied to queries to restrict which positions can perform attention queries, used in specialized architectures requiring granular control of interactions.
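One possible (assumed, not canonical) way to realize this in PyTorch is to zero the output rows of masked query positions:

```python
import torch

q_len, k_len, d = 4, 4, 8
attn = torch.softmax(torch.randn(q_len, k_len), dim=-1)
values = torch.randn(k_len, d)
out = attn @ values                     # attended representation per query

# Query mask: 1 = this position may issue queries, 0 = it may not.
query_mask = torch.tensor([1.0, 1.0, 0.0, 1.0])

# Masked queries contribute no attended output.
out = out * query_mask.unsqueeze(-1)
```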
Value Mask
Mask applied to values after attention computation to filter out undesirable contributions, enabling fine-grained post-attention control of output representations.
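An illustrative sketch (PyTorch assumed) in which the mask removes selected value vectors from the attention-weighted sum:

```python
import torch

q_len, k_len, d = 4, 4, 8
attn = torch.softmax(torch.randn(q_len, k_len), dim=-1)   # attention weights
values = torch.randn(k_len, d)

# Value mask: 0 filters out a value vector's contribution, 1 keeps it.
value_mask = torch.tensor([1.0, 1.0, 1.0, 0.0])

# Zero the masked value vectors, then take the usual weighted sum.
out = attn @ (values * value_mask.unsqueeze(-1))
```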
Attention Weight Masking
Technique of applying a mask directly to the attention weights after the softmax to force certain contributions to zero, offering explicit control over information pathways.
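A small sketch of this technique in PyTorch (assumed), including an optional renormalization step:

```python
import torch

attn = torch.softmax(torch.randn(4, 4), dim=-1)   # post-softmax attention weights

# Zero selected query-key interactions directly in the weights.
weight_mask = torch.ones(4, 4)
weight_mask[:, 2] = 0.0                            # e.g. block all attention to key position 2
attn = attn * weight_mask

# Optional: renormalize so each query's weights sum to 1 again.
attn = attn / attn.sum(dim=-1, keepdim=True)
```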
Softmax Mask
Mask applied by adding a large negative value (typically -inf) to the attention scores before the softmax, ensuring that masked positions receive zero or near-zero probability after normalization.
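A minimal additive-mask sketch (PyTorch assumed) matching this description:

```python
import torch

scores = torch.randn(4, 4)                 # raw attention scores

# Additive mask: 0 for valid positions, -inf (or a large negative constant) for masked ones.
additive_mask = torch.zeros(4, 4)
additive_mask[:, 2] = float("-inf")        # mask key position 2 for every query

attn = torch.softmax(scores + additive_mask, dim=-1)
# attn[:, 2] is now (numerically) zero for every query.
```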
Logit Mask
Mask applied at the logit level (raw attention scores) to exclude certain interactions before softmax normalization, preserving the mathematical distribution over the valid scores.
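A brief sketch (PyTorch assumed) showing that masking at the logit level keeps a proper probability distribution over the remaining positions:

```python
import torch

logits = torch.randn(4, 4)                               # raw attention logits
valid = torch.tensor([[True, False, True, True]] * 4)    # which interactions are allowed

# Exclude invalid interactions before normalization.
attn = torch.softmax(logits.masked_fill(~valid, float("-inf")), dim=-1)

# Each row still sums to 1, distributed only over the valid positions.
print(attn.sum(dim=-1))
```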