Cross-Attention
Cross-Attention Regularization
Techniques that constrain cross-attention weights during training to encourage desirable properties such as sparsity, diversity, or temporal coherence. These constraints can improve model interpretability and generalization.
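As one illustration of the sparsity case, a common approach is to add an entropy penalty on the attention rows to the training loss. The sketch below is a minimal, framework-free version under stated assumptions: the function names (`entropy_penalty`, `regularized_loss`), the weight `lam`, and the toy scores are hypothetical and chosen only for illustration, not taken from any particular model or library.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def entropy_penalty(attn_rows, eps=1e-12):
    """Mean Shannon entropy of the attention rows (one row per query).

    Low entropy means a query concentrates its weight on few keys, so
    minimizing this term pushes the attention toward sparsity.
    """
    total = 0.0
    for row in attn_rows:
        total += -sum(p * math.log(p + eps) for p in row)
    return total / len(attn_rows)


def regularized_loss(task_loss, attn_rows, lam=0.01):
    """Hypothetical combined objective: task loss plus weighted penalty."""
    return task_loss + lam * entropy_penalty(attn_rows)


# Toy example: two queries attending over four keys.
peaked = softmax([8.0, 0.0, 0.0, 0.0])   # nearly one-hot
uniform = softmax([0.0, 0.0, 0.0, 0.0])  # maximally spread out

# The uniform row has the higher entropy, so it is penalized more.
print(entropy_penalty([peaked]), entropy_penalty([uniform]))
print(regularized_loss(1.0, [peaked, uniform], lam=0.1))
```

The same pattern would apply to the other properties named above by swapping the penalty term, for example a measure of overlap between heads for diversity, or a measure of change between consecutive steps for temporal coherence. The weight `lam` trades off the task loss against the constraint and would normally be tuned.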