Attention Mechanism - AI Glossary

Query-Key-Value (QKV)

The three vectors at the core of attention: the Query expresses what a position is looking for, the Key describes what each position offers for matching, and the Value carries the information that is actually retrieved.
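
As a rough illustration, the three vectors are commonly produced by multiplying the same token embedding by three separate weight matrices. The sketch below assumes that setup; the names `W_q`, `W_k`, `W_v`, the helper `matvec`, and the toy numbers are illustrative, not taken from any particular model.

```python
def matvec(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

# Hypothetical 2x2 projection weights; a real model learns these.
W_q = [[1.0, 0.0], [0.0, 1.0]]
W_k = [[0.0, 1.0], [1.0, 0.0]]
W_v = [[2.0, 0.0], [0.0, 2.0]]

embedding = [0.5, -1.0]  # one token embedding

query = matvec(W_q, embedding)  # what this token is looking for
key = matvec(W_k, embedding)    # what this token offers for matching
value = matvec(W_v, embedding)  # the content passed on when attended to
```

All three vectors come from the same embedding; only the projection differs, which is what lets one token play three roles.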

Attention Weights

Normalized coefficients indicating the relative importance of each input element, typically obtained after applying softmax to attention scores.
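
To show what the weights are used for, here is a minimal sketch that assumes the usual final step of attention: the output is the weighted average of the value vectors. The function name `weighted_sum` is a hypothetical helper.

```python
def weighted_sum(weights, values):
    """Combine value vectors using one row of attention weights."""
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]

# Two input elements; the second is judged three times as important.
weights = [0.25, 0.75]
values = [[4.0, 0.0], [0.0, 4.0]]
output = weighted_sum(weights, values)
```

Because the weights sum to 1, the output stays within the range spanned by the value vectors.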

Positional Encoding

Information added to embeddings to indicate the position of tokens in a sequence, compensating for the lack of recurrence in transformers.
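
One widely used scheme is the fixed sinusoidal encoding; learned position embeddings are another. The sketch below assumes the sinusoidal variant with the conventional base of 10000, and the function names are illustrative.

```python
import math

def sinusoidal_encoding(position, dim):
    """Return a dim-sized position vector built from sine/cosine pairs."""
    encoding = []
    for i in range(dim):
        # Each sine/cosine pair (indices 2k and 2k+1) shares one frequency.
        angle = position / (10000 ** ((i - i % 2) / dim))
        encoding.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return encoding

def add_position(embedding, position):
    """Add the position vector to a token embedding element-wise."""
    pe = sinusoidal_encoding(position, len(embedding))
    return [e + p for e, p in zip(embedding, pe)]
```

The same token embedding placed at two different positions yields two different input vectors, which is how the model can tell word order apart.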

Causal Attention

Type of masked attention in which each position can attend only to itself and to earlier positions, never to later ones; primarily used in text generation tasks.
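
A minimal sketch of the masking step, assuming the common convention that a position may see itself as well as everything before it, and that blocked scores are set to negative infinity so they receive zero weight after normalization. The function names are illustrative.

```python
def causal_mask(seq_len):
    """mask[i][j] is True when position i may attend to position j."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]

def apply_mask(scores, mask):
    """Replace disallowed scores with -inf so they get zero weight later."""
    return [
        [s if allowed else float("-inf") for s, allowed in zip(row, mask_row)]
        for row, mask_row in zip(scores, mask)
    ]

masked = apply_mask([[1.0, 2.0], [3.0, 4.0]], causal_mask(2))
```

The mask is lower-triangular: the first position sees only itself, and the last position sees the whole sequence.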

Attention Score

Raw value calculated between a query and a key before normalization, quantifying the relevance or compatibility between these two elements.
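
The definition does not fix a particular scoring function. The sketch below assumes the scaled dot product, one common choice, where the dot product is divided by the square root of the key dimension.

```python
import math

def attention_score(query, key):
    """Scaled dot product between one query and one key."""
    dot = sum(q * k for q, k in zip(query, key))
    return dot / math.sqrt(len(key))

score = attention_score([1.0, 0.0, 1.0, 0.0], [2.0, 3.0, 2.0, 5.0])
```

Here the dot product is 4 and the key dimension is 4, so the score is 4 / 2 = 2. The scaling keeps scores from growing with vector size.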

Softmax Normalization

Activation function applied to attention scores to convert them into a probability distribution, so that every weight is non-negative and the weights sum to 1.
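
A minimal sketch of the function for one row of scores. Subtracting the maximum before exponentiating is a standard numerical-stability trick and does not change the result.

```python
import math

def softmax(scores):
    """Convert raw scores into non-negative weights that sum to 1."""
    peak = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

weights = softmax([1.0, 2.0, 3.0])
```

Larger scores receive larger weights, and equal scores receive equal weights.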

Attention Head

Individual sub-mechanism in multi-head attention, where each head learns to focus on different aspects or relationships in the data.
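
As a sketch of the bookkeeping involved, the code below assumes the common layout in which a model-sized vector is cut into equal slices, one per head, and the per-head outputs are concatenated afterwards. The names `split_heads` and `merge_heads` are illustrative.

```python
def split_heads(vector, num_heads):
    """Split one model-sized vector into equal per-head slices."""
    if len(vector) % num_heads != 0:
        raise ValueError("vector length must be divisible by num_heads")
    head_dim = len(vector) // num_heads
    return [vector[h * head_dim:(h + 1) * head_dim] for h in range(num_heads)]

def merge_heads(head_outputs):
    """Concatenate per-head outputs back into one vector."""
    return [x for head in head_outputs for x in head]

heads = split_heads([1, 2, 3, 4, 5, 6], num_heads=3)
```

Each head then runs attention on its own slice independently, which is what allows the heads to specialize.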

Attention Matrix

Two-dimensional array of attention weights in which each row shows how strongly one input element attends to every element in the sequence.

Kernel Attention

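Approach that measures query-key similarity with a kernel function instead of the usual softmax over dot products, which can capture more complex non-linear relationships between elements and, with a suitable feature map, lets attention be computed in time linear in the sequence length.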
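
A minimal sketch for a single query, assuming the kernel is the dot product of feature-mapped vectors and that the feature map is elu(x) + 1, one choice used in linear-attention variants. Both the feature map and the function names are assumptions made for illustration.

```python
import math

def feature_map(vector):
    """elu(x) + 1: keeps every feature positive (an assumed choice)."""
    return [x + 1.0 if x > 0 else math.exp(x) for x in vector]

def kernel_attention(query, keys, values):
    """Attention output for one query using similarity phi(q) . phi(k)."""
    phi_q = feature_map(query)
    sims = [sum(a * b for a, b in zip(phi_q, feature_map(k))) for k in keys]
    total = sum(sims)
    weights = [s / total for s in sims]  # normalized by the sum, not softmax
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]

output = kernel_attention([0.0, 0.0], [[1.0, 0.0], [1.0, 0.0]], [[2.0, 0.0], [4.0, 0.0]])
```

With two identical keys the weights are equal, so the output is the plain average of the two value vectors. This direct form still visits every key; the efficiency gain in practice comes from reordering the sums, which this sketch omits.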

Sparse Attention

Optimization that computes attention for only a selected subset of query-key pairs, such as the most relevant pairs or a fixed pattern, rather than for all pairs, reducing compute and memory cost.
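
There are many ways to choose which pairs to keep. The sketch below assumes the simplest one, a local window in which each position attends only to neighbors within a fixed distance; the function names are illustrative.

```python
def local_window_mask(seq_len, window):
    """mask[i][j] is True when positions i and j are at most `window` apart."""
    return [[abs(i - j) <= window for j in range(seq_len)] for i in range(seq_len)]

def count_connections(mask):
    """Number of query-key pairs that would actually be computed."""
    return sum(cell for row in mask for cell in row)

mask = local_window_mask(4, window=1)
kept = count_connections(mask)
```

For 4 positions and a window of 1, 10 of the 16 possible pairs are kept. The saving grows with sequence length, since the number of kept pairs rises linearly rather than quadratically.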
