Diffusion Models
Cross-Attention
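An attention mechanism that lets a diffusion model merge textual and visual information during denoising. In the usual text-to-image setup, queries are computed from the image features and keys and values from the text-encoder embeddings, so each image location can draw on the prompt tokens most relevant to it. This conditioning is crucial for semantic coherence in text-to-image generation.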
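The mechanism can be illustrated with a minimal single-head sketch in NumPy. It is not taken from any particular model: the function name `cross_attention`, the projection matrices `w_q`, `w_k`, `w_v`, and all dimensions are hypothetical, and the random weights stand in for learned parameters.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(image_tokens, text_tokens, w_q, w_k, w_v):
    """Single-head cross-attention: image tokens attend to text tokens."""
    q = image_tokens @ w_q                    # queries from image features, (n_img, d)
    k = text_tokens @ w_k                     # keys from text embeddings,   (n_txt, d)
    v = text_tokens @ w_v                     # values from text embeddings, (n_txt, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])   # scaled dot products, (n_img, n_txt)
    weights = softmax(scores, axis=-1)        # each image token's weights sum to 1
    return weights @ v, weights

# Illustrative sizes only (assumptions, not values from a real model).
rng = np.random.default_rng(0)
n_img, n_txt, d_img, d_txt, d = 16, 5, 8, 12, 4
image_tokens = rng.normal(size=(n_img, d_img))  # flattened spatial features
text_tokens = rng.normal(size=(n_txt, d_txt))   # prompt token embeddings
w_q = rng.normal(size=(d_img, d))
w_k = rng.normal(size=(d_txt, d))
w_v = rng.normal(size=(d_txt, d))

out, weights = cross_attention(image_tokens, text_tokens, w_q, w_k, w_v)
print(out.shape, weights.shape)
```

Each row of `weights` shows how strongly one image location attends to each prompt token. Production models typically use several heads plus an output projection, which this sketch omits.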