Multimodal Translation
Audio-Visual Learning
A branch of machine learning that processes audio and video jointly to improve understanding of multimodal scenes. It exploits the natural correlation between sounds and visual events: a barking sound, for example, tends to co-occur with a dog visible in the frame.
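One common way to exploit this correlation is a contrastive objective that scores matching audio and video clips higher than mismatched ones. The entry above does not name a specific method, so the following is only a minimal sketch under assumptions: the function names, the temperature value, and the random vectors standing in for real encoder outputs are all hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def av_contrastive_loss(audio_emb, video_emb, temperature=0.1):
    """InfoNCE-style loss: row i of audio_emb is assumed to pair with row i of video_emb.

    Both inputs have shape (N, D). The temperature is an illustrative choice.
    """
    a = l2_normalize(audio_emb)
    v = l2_normalize(video_emb)
    logits = a @ v.T / temperature                       # (N, N) similarity matrix
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The diagonal holds the log-probability assigned to the true pairing.
    return -np.mean(np.diag(log_probs))


# Synthetic stand-ins for encoder outputs (assumption: no real data is used here).
rng = np.random.default_rng(0)
video = rng.normal(size=(8, 16))
audio_matched = video + 0.05 * rng.normal(size=(8, 16))  # correlated with video
audio_shuffled = audio_matched[::-1]                     # pairing deliberately broken

loss_matched = av_contrastive_loss(audio_matched, video)
loss_shuffled = av_contrastive_loss(audio_shuffled, video)
print(loss_matched < loss_shuffled)
```

In this toy setup the loss is lower when each audio vector is paired with its own video vector, which is the signal a real audio-visual model would learn from.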