Encoders and Decoders
Layer Normalization
A training-stabilization technique that normalizes the activations at each position across the feature dimension, applied either before (pre-norm) or after (post-norm) each sub-layer in the transformer architecture.
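As a rough illustration of the definition above, the following pure-Python sketch normalizes a single position's feature vector and shows the two placements around a sub-layer. The function names (`layer_norm`, `post_norm`, `pre_norm`), the `eps` value, and the optional `gamma`/`beta` scale and shift parameters are assumptions chosen for illustration and are not taken from the entry itself.

```python
import math

def layer_norm(x, gamma=None, beta=None, eps=1e-5):
    """Normalize one position's feature vector to roughly zero mean, unit variance."""
    d = len(x)
    gamma = gamma if gamma is not None else [1.0] * d  # assumed learnable scale
    beta = beta if beta is not None else [0.0] * d     # assumed learnable shift
    mean = sum(x) / d
    var = sum((v - mean) ** 2 for v in x) / d          # biased (population) variance
    inv_std = 1.0 / math.sqrt(var + eps)               # eps guards against division by zero
    return [g * (v - mean) * inv_std + b for v, g, b in zip(x, gamma, beta)]

def post_norm(x, sublayer):
    """Post-norm placement: add the residual, then normalize."""
    return layer_norm([a + b for a, b in zip(x, sublayer(x))])

def pre_norm(x, sublayer):
    """Pre-norm placement: normalize the input, apply the sub-layer, add the residual."""
    return [a + b for a, b in zip(x, sublayer(layer_norm(x)))]
```

Here `sublayer` stands in for any function mapping a feature vector to a vector of the same length, such as an attention or feed-forward block. Statistics are computed per position over the features, so the result does not depend on other positions or on other examples in a batch.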