AI Glossary
The complete dictionary of Artificial Intelligence
Attention Mechanism
Mechanism that lets a model weight the relative importance of the elements of a data sequence when computing each output.
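A minimal sketch of scaled dot-product attention in NumPy; the softmax helper, toy shapes, and random inputs are illustrative assumptions rather than a fixed API.

    import numpy as np

    def softmax(x, axis=-1):
        # Subtract the row max for numerical stability before exponentiating.
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def attention(Q, K, V):
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)        # query-key similarities
        weights = softmax(scores, axis=-1)     # one distribution per query
        return weights @ V                     # weighted sum of the values

    rng = np.random.default_rng(0)
    Q = rng.normal(size=(4, 8))                # 4 queries of dimension 8
    K = rng.normal(size=(6, 8))                # 6 keys
    V = rng.normal(size=(6, 8))                # 6 values
    print(attention(Q, K, V).shape)            # (4, 8)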
Self-Attention
Mechanism in which each element of a sequence attends to every element of the same sequence, itself included, yielding a context-aware representation of each position.
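A self-attention sketch under the same toy assumptions: queries, keys, and values are all projections of one sequence X, with random matrices standing in for learned parameters.

    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 16))               # one sequence of 5 tokens
    Wq, Wk, Wv = (rng.normal(size=(16, 16)) for _ in range(3))

    Q, K, V = X @ Wq, X @ Wk, X @ Wv           # three views of the same X
    weights = softmax(Q @ K.T / np.sqrt(16.0)) # every token vs. every token
    out = weights @ V
    print(out.shape)                           # (5, 16)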
Multi-Head Attention
Extension of attention that runs several attention heads in parallel, each in its own learned subspace, so that different heads can capture different types of relationships.
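A compact multi-head sketch; the head count, random projection matrices, and the slicing of the model dimension into per-head subspaces are illustrative choices.

    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def multi_head_attention(X, Wq, Wk, Wv, Wo, n_heads):
        d_model = X.shape[-1]
        d_head = d_model // n_heads
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        heads = []
        for h in range(n_heads):
            s = slice(h * d_head, (h + 1) * d_head)
            # Each head attends within its own low-dimensional subspace.
            w = softmax(Q[:, s] @ K[:, s].T / np.sqrt(d_head))
            heads.append(w @ V[:, s])
        # Concatenate the heads, then mix them with the output projection.
        return np.concatenate(heads, axis=-1) @ Wo

    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 16))
    Wq, Wk, Wv, Wo = (rng.normal(size=(16, 16)) for _ in range(4))
    print(multi_head_attention(X, Wq, Wk, Wv, Wo, n_heads=4).shape)  # (5, 16)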
Positional Encoding
Technique for incorporating the sequential position of elements into embeddings without using an RNN.
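A sketch of the sinusoidal positional encoding from the original Transformer paper; it assumes an even d_model, and the resulting matrix is added elementwise to the token embeddings.

    import numpy as np

    def positional_encoding(seq_len, d_model):
        pos = np.arange(seq_len)[:, None]                   # (seq_len, 1)
        i = np.arange(d_model // 2)[None, :]                # (1, d_model/2)
        angles = pos / np.power(10000.0, 2 * i / d_model)   # geometric wavelengths
        pe = np.zeros((seq_len, d_model))
        pe[:, 0::2] = np.sin(angles)                        # even dims: sine
        pe[:, 1::2] = np.cos(angles)                        # odd dims: cosine
        return pe            # added elementwise to the token embeddings

    print(positional_encoding(seq_len=50, d_model=16).shape)  # (50, 16)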
Encoder-Decoder Architecture
Fundamental structure of Transformers separating input processing (encoder) and output generation (decoder).
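A condensed sketch of the information flow, assuming away feed-forward layers, residual connections, normalization, and learned projections: the encoder self-attends over the source, and the decoder combines masked self-attention with cross-attention into the encoder output.

    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def attend(Q, K, V, mask=None):
        scores = Q @ K.T / np.sqrt(Q.shape[-1])
        if mask is not None:
            scores = np.where(mask, -np.inf, scores)  # bar masked positions
        return softmax(scores) @ V

    rng = np.random.default_rng(0)
    src = rng.normal(size=(7, 16))              # source-token embeddings
    tgt = rng.normal(size=(5, 16))              # target tokens generated so far

    memory = attend(src, src, src)              # encoder: self-attention
    causal = np.triu(np.ones((5, 5), dtype=bool), k=1)
    h = attend(tgt, tgt, tgt, mask=causal)      # decoder: masked self-attention
    out = attend(h, memory, memory)             # decoder: cross-attention
    print(out.shape)                            # (5, 16)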
Attention Scaling
Normalization of attention logits by the square root of the key dimensionality; without it, dot products grow with dimension and push the softmax into saturated regions where gradients become negligibly small, destabilizing training.
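A quick numeric check of why the scaling matters: for unit-variance random vectors the dot product has a standard deviation of roughly sqrt(d_k), so unscaled logits grow with dimension; the sample counts here are arbitrary.

    import numpy as np

    rng = np.random.default_rng(0)
    for d_k in (8, 64, 512):
        q = rng.normal(size=(2000, d_k))
        k = rng.normal(size=(2000, d_k))
        dots = (q * k).sum(axis=1)          # 2000 raw attention logits
        print(d_k, round(dots.std(), 2), round((dots / np.sqrt(d_k)).std(), 2))
        # unscaled std grows like sqrt(d_k); scaled std stays near 1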
Cross-Attention
Attention mechanism between two different sequences, used in translation and multimodal tasks.
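A cross-attention sketch: queries come from one sequence (here labeled as a decoder state) while keys and values come from another (an encoder output); the names, shapes, and random data are assumptions for illustration.

    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    rng = np.random.default_rng(0)
    decoder_state = rng.normal(size=(5, 16))   # 5 target-side tokens
    encoder_out = rng.normal(size=(9, 16))     # 9 source-side tokens

    Q = decoder_state                          # queries from one sequence
    K = V = encoder_out                        # keys/values from the other
    weights = softmax(Q @ K.T / np.sqrt(16.0))
    out = weights @ V                          # each target token reads the source
    print(out.shape)                           # (5, 16)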
Sparse Attention
Variant of attention computed only on a subset of positions to reduce computational complexity.
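A sketch of one simple sparse pattern, sliding-window (local) attention; note that this toy version still materializes the full score matrix for clarity, whereas a real implementation would compute only the allowed entries.

    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def local_attention(Q, K, V, window=2):
        n = Q.shape[0]
        scores = Q @ K.T / np.sqrt(Q.shape[-1])
        idx = np.arange(n)
        far = np.abs(idx[:, None] - idx[None, :]) > window
        scores[far] = -np.inf                  # keep only nearby positions
        return softmax(scores) @ V

    rng = np.random.default_rng(0)
    X = rng.normal(size=(8, 16))
    print(local_attention(X, X, X).shape)      # (8, 16)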
Attention Masks
Control mechanisms that exclude chosen positions from the attention computation to prevent information leakage, such as a causal mask that hides future tokens from a decoder.
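A causal-mask sketch: logits for future positions are set to -inf before the softmax, so their weights come out exactly zero; the sizes are toy assumptions.

    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    rng = np.random.default_rng(0)
    X = rng.normal(size=(4, 8))
    scores = X @ X.T / np.sqrt(8.0)
    future = np.triu(np.ones((4, 4), dtype=bool), k=1)  # strictly upper triangle
    scores[future] = -np.inf                            # hide later positions
    weights = softmax(scores)
    print(np.round(weights, 2))  # lower-triangular: row i only sees tokens 0..i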
Vision Transformers
Adaptation of the Transformer architecture to computer vision tasks by treating images as sequences of patches.
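A sketch of the Vision Transformer's first step, slicing an image into non-overlapping patches and flattening each into a token; the image and patch sizes are illustrative, and the subsequent linear projection and position embeddings are omitted.

    import numpy as np

    def patchify(img, patch):
        h, w, c = img.shape
        rows, cols = h // patch, w // patch
        img = img[:rows * patch, :cols * patch]          # crop any remainder
        x = img.reshape(rows, patch, cols, patch, c)
        x = x.transpose(0, 2, 1, 3, 4)                   # (rows, cols, p, p, c)
        return x.reshape(rows * cols, patch * patch * c) # one token per patch

    img = np.random.rand(32, 32, 3)
    tokens = patchify(img, patch=8)
    print(tokens.shape)  # (16, 192): a 16-token "sequence" for the Transformer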
Efficient Attention
Set of optimizations aimed at reducing the quadratic complexity of standard attention for longer sequences.
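One efficient-attention idea, linearized attention in the spirit of Katharopoulos et al. (2020): a feature map phi lets the matrix products be reordered so the cost drops from quadratic to linear in sequence length; this approximates, rather than reproduces, exact softmax attention.

    import numpy as np

    def phi(x):
        # elu(x) + 1, a positive feature map
        return np.where(x > 0, x + 1.0, np.exp(x))

    def linear_attention(Q, K, V):
        Qf, Kf = phi(Q), phi(K)
        kv = Kf.T @ V                    # (d, d) summary of all keys/values
        z = Qf @ Kf.sum(axis=0)          # per-query normalizer
        return (Qf @ kv) / z[:, None]    # O(n * d^2) instead of O(n^2 * d)

    rng = np.random.default_rng(0)
    Q, K, V = (rng.normal(size=(8, 16)) for _ in range(3))
    print(linear_attention(Q, K, V).shape)   # (8, 16)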
Hierarchical Attention
Multi-level attention structure capturing relationships at different hierarchical scales in the data.
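A two-level sketch in the spirit of the Hierarchical Attention Network (Yang et al., 2016): attention first pools word vectors into sentence vectors, then pools sentence vectors into a document vector; the learned attention queries are replaced by random stand-ins.

    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def attn_pool(X, query):
        # Weighted average of the rows of X, weighted by similarity to `query`.
        w = softmax(X @ query)
        return w @ X

    rng = np.random.default_rng(0)
    doc = [rng.normal(size=(n, 16)) for n in (5, 8, 3)]        # 3 sentences
    q_word, q_sent = rng.normal(size=16), rng.normal(size=16)

    sent_vecs = np.stack([attn_pool(s, q_word) for s in doc])  # words -> sentences
    doc_vec = attn_pool(sent_vecs, q_sent)                     # sentences -> document
    print(doc_vec.shape)                                       # (16,)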