Multimodal Translation
Multimodal Transformers
A Transformer architecture adapted to process several data modalities (text, image, audio) jointly, typically through cross-modal attention, in which tokens from one modality attend to tokens from another. These models map heterogeneous inputs into a shared representation so that a single network can process them together.
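To make the cross-modal attention idea concrete, here is a minimal single-head sketch in NumPy, in which text tokens act as queries over image tokens. It illustrates the general mechanism and is not the layout of any particular model: the function name `cross_modal_attention`, the projection matrices `w_q`, `w_k`, `w_v`, and all dimensions are assumptions chosen for the example, and the weights are random rather than learned.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(text_tokens, image_tokens, w_q, w_k, w_v):
    """Single-head cross-attention: text queries attend to image keys/values.

    text_tokens:  (n_text, d_text)  embeddings of one modality
    image_tokens: (n_img, d_image)  embeddings of the other modality
    w_q, w_k, w_v: projections into a shared d_head-dimensional space
    """
    q = text_tokens @ w_q                      # (n_text, d_head)
    k = image_tokens @ w_k                     # (n_img, d_head)
    v = image_tokens @ w_v                     # (n_img, d_head)
    scores = q @ k.T / np.sqrt(q.shape[-1])    # scaled dot-product, (n_text, n_img)
    weights = softmax(scores, axis=-1)         # each text token's distribution over image tokens
    return weights @ v, weights                # fused text features, attention map

# Hypothetical sizes: 5 text tokens (dim 16), 9 image patches (dim 32), head dim 8.
rng = np.random.default_rng(0)
d_text, d_image, d_head = 16, 32, 8
text_tokens = rng.normal(size=(5, d_text))
image_tokens = rng.normal(size=(9, d_image))
w_q = rng.normal(size=(d_text, d_head))
w_k = rng.normal(size=(d_image, d_head))
w_v = rng.normal(size=(d_image, d_head))

fused, weights = cross_modal_attention(text_tokens, image_tokens, w_q, w_k, w_v)
print(fused.shape, weights.shape)  # (5, 8) (5, 9)
```

Because the two modalities have different embedding widths, the separate projections are what bring them into a common space; each output row is a text token re-expressed as a weighted mix of image features.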