AI Glossary
The Complete Dictionary of Artificial Intelligence
Position Encoding
Technique that adds absolute or relative position information to input embeddings to allow the model to understand the order of tokens in a sequence.
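The original Transformer uses fixed sinusoidal position encodings added to the input embeddings. A minimal NumPy sketch of that scheme (function name and the base 10000 follow the standard formulation):

```python
import numpy as np

def sinusoidal_position_encoding(seq_len, d_model):
    # One row per position, one column per embedding dimension.
    pos = np.arange(seq_len)[:, None]          # (seq_len, 1)
    i = np.arange(d_model // 2)[None, :]       # (1, d_model/2)
    # Wavelengths form a geometric progression from 2*pi to 10000*2*pi.
    angles = pos / np.power(10000.0, 2 * i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)               # even dimensions: sine
    pe[:, 1::2] = np.cos(angles)               # odd dimensions: cosine
    return pe
```

The resulting matrix is simply added to the token embeddings, giving each position a unique, smoothly varying signature.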
Pre-Layer Normalization
Variant of layer normalization applied before attention and feed-forward sublayers, improving training stability in deep Transformers.
Post-Layer Normalization
Original configuration of layer normalization applied after attention and feed-forward sublayers, combined with residual connections.
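The difference between the two variants above is only where the normalization sits relative to the residual connection. A minimal NumPy sketch (the `sublayer` argument stands in for attention or the feed-forward network):

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalize each vector to zero mean and unit variance.
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def pre_ln_block(x, sublayer):
    # Pre-LN: normalize first, then apply the sublayer, then add the residual.
    return x + sublayer(layer_norm(x))

def post_ln_block(x, sublayer):
    # Post-LN: apply the sublayer, add the residual, then normalize the sum.
    return layer_norm(x + sublayer(x))
```

In the pre-LN variant the residual path carries the raw input unchanged, which is what stabilizes gradients in deep stacks; in the post-LN variant every block output passes through normalization.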
RMS Normalization
Efficient variant of layer normalization that rescales activations by their root mean square only, omitting mean subtraction, reducing computational cost while maintaining performance.
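Because RMSNorm skips the mean subtraction, it needs one fewer pass over the activations than standard layer normalization. A minimal NumPy sketch (the learnable `gain` vector follows the original RMSNorm formulation):

```python
import numpy as np

def rms_norm(x, gain=1.0, eps=1e-8):
    # Rescale by the root mean square; no mean is subtracted.
    rms = np.sqrt(np.mean(x ** 2, axis=-1, keepdims=True) + eps)
    return gain * x / rms
```

With `gain = 1`, the output of each vector has a root mean square of approximately 1.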
Encoder-Decoder Structure
Two-part architecture composed of an encoder processing the input sequence and a decoder generating the output sequence, foundation of the original Transformer model.
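The essential data flow is that the encoder produces a "memory" of the input, which the decoder consumes while generating the output. A deliberately toy NumPy sketch of that flow (the dense layers here are placeholders; a real Transformer uses attention and feed-forward sublayers):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # model dimension (illustrative)

# Toy weight matrices standing in for full Transformer layers.
W_enc = rng.normal(size=(d, d)) * 0.1
W_dec = rng.normal(size=(d, d)) * 0.1

def encoder(src):
    # Maps the input sequence to contextual representations (the "memory").
    return np.tanh(src @ W_enc)

def decoder(tgt, memory):
    # Conditions output generation on the encoder memory
    # (here via a crude additive mix instead of cross-attention).
    return np.tanh(tgt @ W_dec + memory.mean(axis=0))

src = rng.normal(size=(5, d))   # input sequence, length 5
tgt = rng.normal(size=(3, d))   # partially generated output, length 3
out = decoder(tgt, encoder(src))
```

The key structural point survives even in this toy: input and output sequences can have different lengths, and every decoder step can see the entire encoded input.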
LayerNorm Epsilon
Small constant added to the variance inside layer normalization before taking the square root, preventing division by zero when the activations have near-zero variance.
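The failure case epsilon guards against is a constant activation vector, whose variance is exactly zero. A minimal NumPy sketch:

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    # Without eps, a zero-variance input would divide by zero here.
    return (x - mu) / np.sqrt(var + eps)

x = np.ones(4)        # constant vector: variance is exactly 0
y = layer_norm(x)     # stays finite thanks to eps
```

Typical values such as 1e-5 are small enough not to disturb normalization of ordinary inputs while keeping the denominator strictly positive.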