Layer Normalization
Activation Distribution
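The distribution of activation values within a layer. Layer normalization rescales each sample's activations across the feature dimension to approximately zero mean and unit variance, before applying a learned scale and shift. This keeps the distribution stable during training, which facilitates convergence and optimization of transformer networks.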
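As a rough illustration of what a stable activation distribution means, the sketch below normalizes two activation vectors on very different scales and reports the mean and standard deviation of each result. It is a minimal pure-Python sketch, not a library implementation: the names `layer_norm` and `mean_and_std`, the sample values, and the `eps` default are assumptions chosen for this example, and the learned scale and shift parameters are left out.

```python
import math


def layer_norm(x, eps=1e-5):
    """Normalize one activation vector to roughly zero mean and unit variance.

    Simplified: the learned scale and shift parameters are omitted.
    """
    mean = sum(x) / len(x)
    # Population (biased) variance over the feature dimension.
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def mean_and_std(x):
    """Return the mean and population standard deviation of a vector."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return mean, math.sqrt(var)


# Hypothetical activations for a single sample, on two very different scales.
small = [0.1, -0.2, 0.05, 0.3]
large = [120.0, -310.0, 45.0, 500.0]

for name, acts in (("small", small), ("large", large)):
    m, s = mean_and_std(layer_norm(acts))
    print(f"{name}: mean={m:.4f}, std={s:.4f}")
```

Both vectors should come out with a mean close to 0 and a standard deviation close to 1, whatever the scale of the input. The small `eps` term only guards against division by zero when the variance is tiny.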