Multi-Modal Diffusion
Cross-Modal Conditioning
A technique in which the generation of one modality is guided by information from another modality, for example generating an image from a text prompt or audio from an image.
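One common way to inject the guiding modality into a generator is cross-attention, where tokens of the modality being generated act as queries over embeddings of the conditioning modality. The sketch below is a minimal pure-Python illustration of that idea, not the method of any particular model. The function names (`softmax`, `cross_attention`) and the toy vectors are assumptions made for this example, and learned projection matrices are omitted for brevity.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention.

    queries: tokens of the modality being generated (e.g. image latents)
    keys, values: tokens of the conditioning modality (e.g. text embeddings)
    Returns one output vector per query, each a weighted mix of `values`.
    """
    d = len(keys[0])
    out_dim = len(values[0])
    outputs = []
    for q in queries:
        # Similarity between this query and every conditioning token.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys
        ]
        weights = softmax(scores)
        # Weighted sum of the conditioning values.
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(out_dim)]
        )
    return outputs


# Hypothetical toy data: 2 image-latent tokens attend over 3 text tokens.
image_tokens = [[1.0, 0.0], [0.0, 1.0]]
text_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
conditioned = cross_attention(image_tokens, text_tokens, text_tokens)
```

Each row of `conditioned` is a convex combination of the text vectors, so the image-side representation now carries information from the text. In a real model the queries, keys, and values would first pass through learned linear projections.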