Modal Alignment
Grounding
The process of associating abstract concepts (often textual) with concrete elements in another modality (typically visual). Grounding establishes direct links between words and the specific image regions or objects they refer to.
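As a rough illustration of linking words to image regions, the sketch below assigns each word to the region whose embedding is most similar to it. It assumes that words and regions have already been embedded in a shared vector space. The function names (`cosine`, `ground`), the vectors, and the bounding boxes are all hypothetical and hand-made for the example, not taken from any real model or dataset.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def ground(word_vecs, regions):
    """Map each word to the bounding box of its most similar region."""
    links = {}
    for word, w_vec in word_vecs.items():
        best = max(regions, key=lambda r: cosine(w_vec, r["vec"]))
        links[word] = best["box"]
    return links

# Toy embeddings (assumed values, chosen by hand for illustration).
word_vecs = {"dog": [0.9, 0.1, 0.0], "ball": [0.1, 0.9, 0.1]}

# Each region has a box (x_min, y_min, x_max, y_max) and an embedding.
regions = [
    {"box": (10, 40, 120, 200), "vec": [0.8, 0.2, 0.1]},
    {"box": (150, 160, 190, 200), "vec": [0.0, 1.0, 0.2]},
]

links = ground(word_vecs, regions)
print(links)
```

Here "dog" is linked to the first box and "ball" to the second. A real system would typically learn the embeddings and may also decide that a word has no matching region at all, which this nearest-match sketch does not handle.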