Transformers
Layer Normalization
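Normalization technique applied within each layer to stabilize training: for every input sample, the activations are normalized across the feature dimension to zero mean and unit variance, then typically rescaled by a learned gain and bias.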
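As a minimal sketch of the idea, the function below normalizes a single feature vector in plain Python. The names `layer_norm`, `gamma`, `beta`, and `eps` are illustrative assumptions, not taken from any particular library, and the default `eps` value is only a common choice.

```python
import math

def layer_norm(x, gamma=None, beta=None, eps=1e-5):
    """Normalize one feature vector across its features, then scale and shift.

    gamma and beta stand in for the learned gain and bias (hypothetical names);
    they default to 1 and 0, which leaves the normalized values unchanged.
    """
    n = len(x)
    gamma = gamma if gamma is not None else [1.0] * n
    beta = beta if beta is not None else [0.0] * n

    # Statistics are computed over the features of this one sample,
    # not over a batch of samples.
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n  # biased (population) variance

    # eps keeps the division stable when the variance is close to zero.
    inv_std = 1.0 / math.sqrt(var + eps)
    return [g * (v - mean) * inv_std + b for v, g, b in zip(x, gamma, beta)]

activations = [1.0, 2.0, 3.0, 4.0]
normalized = layer_norm(activations)
```

Because the statistics come from a single sample, the result does not depend on batch size, which is one reason this form of normalization is commonly used in Transformer models.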