Multi-Head Attention
Attention Distribution
Probability distribution over the elements of a sequence, produced by applying softmax to the attention scores. It indicates where the model 'looks', that is, how much weight it gives each element, when processing a specific element.
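As a minimal sketch of the idea, the snippet below turns a list of raw attention scores into such a distribution using softmax. The function name `attention_distribution` and the example scores are illustrative assumptions, not part of any particular library or model.

```python
import math


def attention_distribution(scores):
    """Convert raw attention scores into a probability distribution via softmax."""
    # Subtracting the maximum score keeps exp() numerically stable
    # and does not change the resulting probabilities.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw scores of one query element against four sequence elements.
scores = [2.0, 1.0, 0.1, -1.0]
weights = attention_distribution(scores)

# The weights are non-negative and sum to 1; the largest score gets the largest weight.
print(weights)
```

In multi-head attention, each head would compute its own distribution of this kind, so different heads can 'look' at different elements for the same position.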