Layer Normalization
Pre-Layer Normalization
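Variant of layer normalization placement in which each attention and feed-forward sublayer normalizes its input before the sublayer runs, rather than normalizing after the residual addition. This improves training stability in deep Transformers.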
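The difference in ordering can be made concrete with a minimal sketch in plain Python. It assumes a layer norm without learned gain or bias, and `toy_sublayer` is a hypothetical stand-in for an attention or feed-forward sublayer; none of these names come from a particular library.

```python
import math

def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and unit variance (no learned gain/bias)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def pre_ln_block(x, sublayer):
    """Pre-LN: normalize the sublayer input; the residual path carries x unchanged."""
    return [a + b for a, b in zip(x, sublayer(layer_norm(x)))]

def post_ln_block(x, sublayer):
    """Post-LN (original Transformer ordering): normalize after the residual addition."""
    return layer_norm([a + b for a, b in zip(x, sublayer(x))])

def toy_sublayer(x):
    # Hypothetical stand-in for attention or feed-forward: scale each element.
    return [0.5 * v for v in x]

x = [1.0, 2.0, 3.0, 4.0]
pre = pre_ln_block(x, toy_sublayer)    # x + sublayer(LN(x))
post = post_ln_block(x, toy_sublayer)  # LN(x + sublayer(x))
```

In the pre-LN block the input `x` reaches the output through the residual path without being rescaled, whereas the post-LN block renormalizes the whole sum at every layer. That unmodified identity path is one commonly cited reason for the improved stability in deep stacks.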