Self-Attention
Softmax Normalization
Activation function that transforms attention scores into a probability distribution, ensuring that the attention weights sum to 1 for each position.
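A minimal sketch of this normalization, using NumPy (the function name `softmax` and the example scores are illustrative, not from the source):

```python
import numpy as np

def softmax(scores, axis=-1):
    # Subtract the row max before exponentiating; softmax is shift-invariant,
    # and this avoids overflow for large scores.
    shifted = scores - scores.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    # Divide by the sum so the weights along `axis` add up to 1.
    return exp / exp.sum(axis=axis, keepdims=True)

# Hypothetical attention scores for one query over four key positions.
scores = np.array([2.0, 1.0, 0.1, -1.0])
weights = softmax(scores)
print(weights)        # larger scores receive larger weights
print(weights.sum())  # sums to 1
```

In self-attention, softmax is applied row-wise to the score matrix, so each query position gets its own probability distribution over the key positions.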