AI Glossary
The Complete Dictionary of Artificial Intelligence
Query-Key-Value (QKV)
The three fundamental vector roles in an attention mechanism: the Query expresses what information a position is looking for, the Key identifies what information each position offers, and the Value carries the content that is actually extracted.
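A minimal sketch of how Q, K, and V are typically produced from the same input through three learned projection matrices (all names and dimensions below are illustrative, not fixed by the definition):

```python
import numpy as np

rng = np.random.default_rng(0)
T, d_model, d_k = 4, 8, 8   # sequence length, model width, head width (illustrative)

X = rng.normal(size=(T, d_model))        # token embeddings
W_q = rng.normal(size=(d_model, d_k))    # learned projection matrices
W_k = rng.normal(size=(d_model, d_k))
W_v = rng.normal(size=(d_model, d_k))

Q = X @ W_q   # what each position is looking for
K = X @ W_k   # what each position offers
V = X @ W_v   # the content that is actually mixed together
```

The three projections share the same input X; only the learned weights give them their different roles.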
Attention Weights
Normalized coefficients indicating the relative importance of each input element, typically obtained by applying softmax to the attention scores.
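A small sketch of the normalization step: each row of an (illustrative) raw score matrix becomes a probability distribution over input positions.

```python
import numpy as np

scores = np.array([[2.0, 0.5, -1.0],
                   [0.0, 1.0,  1.0],
                   [3.0, 3.0,  3.0]])   # illustrative raw attention scores

# Row-wise softmax: each row of weights sums to 1.
exp = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights = exp / exp.sum(axis=-1, keepdims=True)
```

Note that equal scores (last row) yield uniform weights of 1/3 each.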
Positional Encoding
Information added to embeddings to indicate the position of tokens in a sequence, compensating for the lack of recurrence in transformers.
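One common scheme is the sinusoidal encoding from the original Transformer; a sketch (sequence length and width chosen arbitrarily here):

```python
import numpy as np

def sinusoidal_encoding(seq_len, d_model):
    """PE[pos, 2i] = sin(pos / 10000^(2i/d)), PE[pos, 2i+1] = cos(same angle)."""
    pos = np.arange(seq_len)[:, None]         # (seq_len, 1)
    i = np.arange(0, d_model, 2)[None, :]     # (1, d_model/2)
    angles = pos / np.power(10000.0, i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

pe = sinusoidal_encoding(6, 8)
# The encoding is added (not concatenated) to the embeddings: X = embeddings + pe
```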
Causal Attention
Type of masked attention where each position can attend only to itself and earlier positions, primarily used in text generation tasks.
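The mask is typically applied before softmax by setting future positions to negative infinity; a sketch with illustrative uniform scores:

```python
import numpy as np

T = 4
scores = np.zeros((T, T))   # illustrative raw scores

# Strictly-upper-triangular mask: position i may not see positions j > i.
mask = np.triu(np.ones((T, T), dtype=bool), k=1)
scores = np.where(mask, -np.inf, scores)

exp = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights = exp / exp.sum(axis=-1, keepdims=True)
# Row 0 attends only to itself; row 1 splits weight between positions 0 and 1; etc.
```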
Attention Score
Raw value calculated between a query and a key before normalization, quantifying the relevance or compatibility between these two elements.
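In scaled dot-product attention, the raw score for one query-key pair is their dot product divided by the square root of the key dimension; a sketch with made-up vectors:

```python
import numpy as np

d_k = 4
q = np.array([1.0, 0.0, 1.0, 0.0])   # illustrative query
k = np.array([1.0, 1.0, 1.0, 0.0])   # illustrative key

# Raw (pre-softmax) score: scaled dot product.
score = (q @ k) / np.sqrt(d_k)   # 2 / sqrt(4) = 1.0
```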
Softmax Normalization
Activation function applied to attention scores to convert them into a probability distribution, ensuring that the sum of weights equals 1.
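A numerically stable implementation subtracts the maximum before exponentiating, which leaves the result unchanged but prevents overflow on large scores:

```python
import numpy as np

def softmax(x):
    # Shifting by the max is an exp-overflow guard; softmax is shift-invariant.
    e = np.exp(x - np.max(x))
    return e / e.sum()

w = softmax(np.array([1000.0, 1001.0, 1002.0]))  # would overflow without the shift
```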
Attention Head
Individual sub-mechanism in multi-head attention, where each head learns to focus on different aspects or relationships in the data.
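In practice the model width is usually split so each head operates on its own slice of the projected vectors; a shape-level sketch (dimensions illustrative):

```python
import numpy as np

T, d_model, n_heads = 4, 8, 2
d_head = d_model // n_heads
Q = np.arange(T * d_model, dtype=float).reshape(T, d_model)  # illustrative projection

# Reshape so each head sees its own d_head-wide slice, then run attention per head.
Q_heads = Q.reshape(T, n_heads, d_head).transpose(1, 0, 2)   # (n_heads, T, d_head)
```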
Attention Matrix
Two-dimensional representation of attention weights showing how each input element attends to all other elements in the sequence.
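Reading the matrix: row i is the distribution describing where token i looks, column j shows how much attention position j receives. A sketch with made-up weights:

```python
import numpy as np

weights = np.array([[0.7, 0.2, 0.1],
                    [0.1, 0.8, 0.1],
                    [0.3, 0.3, 0.4]])   # illustrative (T, T) attention matrix

# Which position does each token attend to most strongly?
strongest = weights.argmax(axis=-1)   # here, each token attends mostly to itself
```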
Kernel Attention
Approach that computes attention similarity through kernel functions in place of the standard softmax, allowing more general non-linear notions of compatibility between elements and, with suitable feature maps, attention that scales linearly with sequence length.
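One well-known instance is linear attention with the feature map phi(x) = elu(x) + 1; because the similarity factorizes as phi(q) . phi(k), the matrix products can be reassociated so the full (T, T) attention matrix is never formed. A sketch (names and dimensions illustrative):

```python
import numpy as np

def phi(x):
    # Positive feature map elu(x) + 1, as used in linear-attention transformers.
    return np.where(x > 0, x + 1.0, np.exp(x))

rng = np.random.default_rng(0)
T, d = 5, 4
Q, K, V = (rng.normal(size=(T, d)) for _ in range(3))

Qf, Kf = phi(Q), phi(K)
# Kernel trick: (Qf @ Kf.T) @ V == Qf @ (Kf.T @ V), but the right side is linear in T.
norm = Qf @ Kf.sum(axis=0)                  # per-row normalizer, shape (T,)
out = (Qf @ (Kf.T @ V)) / norm[:, None]
```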
Sparse Attention
Optimization reducing the number of computed attention connections by considering only the most relevant pairs, improving computational efficiency.
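One simple sparsity pattern is a local (banded) window, where each position attends only to itself and its nearest neighbors; a sketch with an illustrative window size:

```python
import numpy as np

T, window = 6, 1   # each position sees itself and neighbors within the window

idx = np.arange(T)
keep = np.abs(idx[:, None] - idx[None, :]) <= window   # banded sparsity pattern

scores = np.zeros((T, T))    # illustrative raw scores
scores[~keep] = -np.inf      # drop connections outside the band

exp = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights = exp / exp.sum(axis=-1, keepdims=True)
# Only 16 of the 36 possible connections survive here.
```

Patterns in real sparse-attention models (strided, block, global-plus-local) differ, but all reduce the quadratic number of pairs in the same spirit.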