Layer Normalization
Gradient Stability
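Property of layer normalization whereby gradient magnitudes tend to stay in a stable range during backpropagation, because each layer's activations are rescaled to zero mean and unit variance. This reduces the risk of exploding or vanishing gradients in deep transformers.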
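The effect can be illustrated with a toy experiment. The sketch below is a minimal NumPy example, not a transformer: it assumes a plain stack of random linear layers with no attention, residual connections, or learned gain and bias, and it uses a random vector as a stand-in for the upstream loss gradient. The helper names (`layer_norm`, `layer_norm_backward`, `input_grad_norm`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def layer_norm(x, eps=1e-5):
    """Normalize each row to zero mean and unit variance (no learned gain/bias)."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)


def layer_norm_backward(x, grad_out, eps=1e-5):
    """Gradient w.r.t. x, given the gradient w.r.t. layer_norm(x)."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    inv_std = 1.0 / np.sqrt(var + eps)
    x_hat = (x - mu) * inv_std
    g_mean = grad_out.mean(axis=-1, keepdims=True)
    gx_mean = (grad_out * x_hat).mean(axis=-1, keepdims=True)
    # Remove the components along the mean and along x_hat, then rescale by 1/std.
    return inv_std * (grad_out - g_mean - x_hat * gx_mean)


def input_grad_norm(depth, width, gain, use_ln, seed=0):
    """Norm of the gradient at the input of a toy stack of linear layers.

    Weights have std gain / sqrt(width), so gain != 1 mis-scales every layer.
    """
    rng = np.random.default_rng(seed)
    weights = [
        rng.normal(scale=gain / np.sqrt(width), size=(width, width))
        for _ in range(depth)
    ]
    h = rng.normal(size=(1, width))

    # Forward pass, keeping pre-normalization activations for the backward pass.
    pre_acts = []
    for W in weights:
        z = h @ W
        pre_acts.append(z)
        h = layer_norm(z) if use_ln else z

    # Backward pass, starting from a random stand-in for the loss gradient.
    grad = rng.normal(size=(1, width))
    for W, z in zip(reversed(weights), reversed(pre_acts)):
        if use_ln:
            grad = layer_norm_backward(z, grad)
        grad = grad @ W.T
    return float(np.linalg.norm(grad))


if __name__ == "__main__":
    for gain in (0.5, 1.0, 2.0):
        plain = input_grad_norm(depth=20, width=64, gain=gain, use_ln=False)
        normed = input_grad_norm(depth=20, width=64, gain=gain, use_ln=True)
        print(f"gain={gain}: without LN {plain:.3e}, with LN {normed:.3e}")
```

In this toy setting the gradient norm without layer normalization scales as `gain ** depth`, so it shrinks or grows exponentially with depth. With layer normalization, rescaling the weights leaves the gradient norm almost unchanged, because the division by the standard deviation in the backward pass cancels the scale of each weight matrix.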