Cross-Attention
Cross-Modal Attention
An extension of cross-attention in which the queries come from one modality and the keys and values come from another (for example, text, image, or audio). This lets multimodal models align and fuse information across the representations of different modalities.
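The following is a minimal single-head sketch in NumPy, not a reference implementation. It assumes text tokens supply the queries and image patches supply the keys and values. The function name `cross_modal_attention`, the projection matrices `w_q`, `w_k`, `w_v`, and all dimensions are hypothetical choices made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(text_tokens, image_patches, w_q, w_k, w_v):
    """Hypothetical single-head attention: text queries attend to image keys/values."""
    q = text_tokens @ w_q        # (n_text, d): queries from the text modality
    k = image_patches @ w_k      # (n_img, d): keys from the image modality
    v = image_patches @ w_v      # (n_img, d): values from the image modality
    # Scaled dot-product scores between every text token and every image patch.
    scores = q @ k.T / np.sqrt(q.shape[-1])   # (n_text, n_img)
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ v, weights               # fused features, alignment map

# Illustrative shapes only: 4 text tokens (dim 32), 9 image patches (dim 48).
rng = np.random.default_rng(0)
text_tokens = rng.normal(size=(4, 32))
image_patches = rng.normal(size=(9, 48))
d = 16  # shared attention dimension (assumed)
w_q = rng.normal(size=(32, d)) / np.sqrt(32)
w_k = rng.normal(size=(48, d)) / np.sqrt(48)
w_v = rng.normal(size=(48, d)) / np.sqrt(48)

fused, weights = cross_modal_attention(text_tokens, image_patches, w_q, w_k, w_v)
print(fused.shape, weights.shape)
```

Because the two modalities have different feature sizes, the separate projections map both into a shared dimension `d` before the dot product. Each row of `weights` can be read as a soft alignment of one text token over the image patches.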