Feed-Forward Networks
Inner Layer Normalization
Layer normalization applied around the FFN sublayer of a Transformer: either to the sublayer input, before the FFN (pre-norm), or to the output after the residual addition (post-norm). The choice of variant affects training stability.
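
The two placements can be sketched in a few lines of plain Python. This is a minimal illustration and not a reference implementation: the function names (`layer_norm`, `ffn`, `pre_norm_block`, `post_norm_block`) and the weight layout are assumptions made for this example, the learned scale and shift of layer normalization are omitted, and the FFN has no biases.

```python
import math

def layer_norm(x, eps=1e-5):
    """Normalize one vector to zero mean and (approximately) unit variance.
    Learned scale/shift parameters are omitted in this sketch."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def ffn(x, w1, w2):
    """Two-layer position-wise feed-forward network with ReLU, no biases.
    Assumed layout: w1 has one row of input weights per hidden unit,
    w2 has one row of hidden weights per output unit."""
    hidden = [max(0.0, sum(xi * w for xi, w in zip(x, row))) for row in w1]
    return [sum(h * w for h, w in zip(hidden, row)) for row in w2]

def post_norm_block(x, w1, w2):
    # Post-norm: add the residual first, then normalize the sum.
    y = ffn(x, w1, w2)
    return layer_norm([a + b for a, b in zip(x, y)])

def pre_norm_block(x, w1, w2):
    # Pre-norm: normalize only the FFN input; the residual path carries x unchanged.
    y = ffn(layer_norm(x), w1, w2)
    return [a + b for a, b in zip(x, y)]
```

In the pre-norm variant the input `x` reaches the output through an unnormalized identity path, a property often cited as the reason it tends to train more stably in deep stacks. In the post-norm variant every block output is renormalized, so the residual signal passes through a normalization at each layer.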