Multimodal QA
Modality-to-Modality Alignment
Learning process that matches segments of one modality (e.g., a sentence) with relevant segments of another (e.g., an image region).
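As a rough illustration of what the matching step can look like once both modalities are embedded in a shared vector space, the sketch below pairs each text segment with its most similar image region by cosine similarity. The embeddings, segment names, and the `align` helper are hypothetical toy values chosen for illustration; in a real system the vectors would come from learned encoders, and the learning process itself (training those encoders) is not shown here.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def align(text_segments, image_regions):
    """Map each text segment to the image region with the highest cosine similarity."""
    matches = {}
    for t_name, t_vec in text_segments.items():
        best = max(image_regions, key=lambda r: cosine(t_vec, image_regions[r]))
        matches[t_name] = best
    return matches

# Hypothetical toy embeddings (assumed to live in one shared space).
text_segments = {
    "a dog": [0.9, 0.1, 0.0],
    "a red ball": [0.1, 0.8, 0.2],
}
image_regions = {
    "region_0": [0.1, 0.9, 0.1],
    "region_1": [1.0, 0.0, 0.1],
}

matches = align(text_segments, image_regions)
print(matches)
```

Greedy nearest-neighbor matching is only one simple choice; other approaches, such as attention-based soft alignment, assign weights across all regions instead of picking a single best match.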