AI Glossary
The Complete Dictionary of Artificial Intelligence
Conditional GANs
Generative adversarial networks that incorporate conditional information to guide data generation according to specified attributes.
Multi-Modal VAEs
Variational autoencoders designed to learn shared latent representations between different data modalities.
Feature Fusion
Technique combining features extracted from different modalities into a unified enriched representation.
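As a minimal sketch of concatenation-based fusion (the simplest variant), assuming hypothetical pre-extracted feature vectors and dimensions chosen for illustration:

```python
import numpy as np

def fuse_features(image_feat: np.ndarray, text_feat: np.ndarray) -> np.ndarray:
    """Concatenation fusion: stack per-modality features into one vector."""
    return np.concatenate([image_feat, text_feat], axis=-1)

# Hypothetical pre-extracted features for one sample.
image_feat = np.random.rand(512)   # e.g. from a vision encoder
text_feat = np.random.rand(256)    # e.g. from a text encoder
fused = fuse_features(image_feat, text_feat)
print(fused.shape)  # (768,)
```

Real systems often replace plain concatenation with learned fusion layers, but the principle of merging per-modality features into one enriched representation is the same.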
Multi-Modal Transformers
Transformer architecture adapted to process multiple types of data simultaneously through cross-attention mechanisms.
CLIP
Model pre-trained on image-text pairs using contrastive learning to align visual and textual representations.
Multi-Modal Diffusion
Diffusion generation process coordinating multiple modalities through a shared latent space.
Co-Generation
Simultaneous generation of data in multiple modalities, ensuring consistency and synchronization between them.
Joint Encoding
Method encoding different modalities in the same vector space to capture their semantic relationships.
Cross-Decoders
Decoding architecture using one modality as input to generate another modality in a coherent manner.
Multi-Modal Attention
Attention mechanism weighting the importance of relationships between different modalities during processing.
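A minimal numpy sketch of the cross-attention computation underlying such mechanisms, with hypothetical shapes (3 text tokens attending over 5 image patches):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys, values):
    """Queries from one modality attend over keys/values from another."""
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)   # (n_q, n_kv) relevance scores
    weights = softmax(scores, axis=-1)       # rows sum to 1
    return weights @ values                  # (n_q, d_v) fused output

# Hypothetical example: text tokens querying image-patch features.
text_q = np.random.rand(3, 64)
img_k = np.random.rand(5, 64)
img_v = np.random.rand(5, 32)
out = cross_attention(text_q, img_k, img_v)
print(out.shape)  # (3, 32)
```

Each text token's output is a weighted mix of image-patch values, with the attention weights expressing how strongly the modalities relate.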
Shared Latent Space
Common vector representation where different modalities are projected to facilitate their interactions.
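A sketch of the projection idea, assuming hypothetical linear maps (random here, learned in practice) from each modality's native dimension into one shared 128-d space:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "learned" projection matrices (random for illustration):
# each maps a modality-specific feature into the same 128-d shared space.
W_image = rng.standard_normal((512, 128))
W_text = rng.standard_normal((300, 128))

def project(feat, W):
    """Project a modality-specific feature into the shared latent space."""
    z = feat @ W
    return z / np.linalg.norm(z)  # unit norm: cosine similarity = dot product

z_img = project(rng.standard_normal(512), W_image)
z_txt = project(rng.standard_normal(300), W_text)
similarity = float(z_img @ z_txt)  # comparable despite different input dims
```

Once both modalities live in the same space, a single dot product measures how semantically related an image and a caption are.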
Coordinated Synthesis
Generation of multi-modal data where each modality is produced in coordination with others.
Text-to-Image Models
Systems generating images from textual descriptions while maintaining semantic coherence.
Audio-to-Visual Models
Systems transforming audio signals into synchronized and coherent visual representations.
Temporal Consistency
Property ensuring the coherence of generated data over time in multi-modal sequences.
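One simple way to quantify this property, sketched here under the assumption that each generated frame has an embedding vector, is the mean cosine similarity between consecutive frames:

```python
import numpy as np

def temporal_consistency(frames: np.ndarray) -> float:
    """Mean cosine similarity between consecutive frame embeddings.

    Values near 1 indicate smooth, temporally coherent generation."""
    a, b = frames[:-1], frames[1:]
    cos = np.sum(a * b, axis=-1) / (
        np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1)
    )
    return float(cos.mean())

# Hypothetical sequence of 10 frame embeddings drifting slowly.
base = np.random.rand(64)
frames = np.stack([base + 0.01 * t for t in range(10)])
print(round(temporal_consistency(frames), 3))  # close to 1.0
```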
Audio-Video Synchronization
Precise temporal alignment between generated audio and video tracks to ensure their coherence.
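Misalignment between the two tracks can be estimated by cross-correlating their activity envelopes; the sketch below assumes hypothetical per-frame activity signals (e.g. audio energy vs. motion magnitude):

```python
import numpy as np

def estimate_lag(audio_env: np.ndarray, video_env: np.ndarray) -> int:
    """Estimate how many frames the video lags the audio by locating
    the peak of their cross-correlation; 0 means the tracks are in sync."""
    a = audio_env - audio_env.mean()
    v = video_env - video_env.mean()
    corr = np.correlate(v, a, mode="full")
    return int(np.argmax(corr)) - (len(a) - 1)

# Hypothetical activity envelopes: an onset at frame 10 in the audio
# appears at frame 13 in the video, i.e. the video lags by 3 frames.
audio = np.zeros(50); audio[10] = 1.0
video = np.zeros(50); video[13] = 1.0
print(estimate_lag(audio, video))  # 3
```

The estimated lag can then be used to shift one track so that the generated audio and video line up.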
Modal Alignment Metrics
Quantitative indicators evaluating the quality of semantic alignment between different generated modalities.
Multi-Modal Zero-Shot Transfer
Ability of models to generalize to new modality combinations without specific training.
Multi-Modal Contrastive Learning
Training method that maximizes similarity between positive modal pairs and minimizes that of negative pairs.
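This objective can be sketched as a symmetric CLIP-style loss over a batch of aligned pairs; the batch size, dimensions, and embeddings below are hypothetical:

```python
import numpy as np

def log_softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))

def contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric contrastive loss over a batch of aligned pairs.

    Matching image/text pairs (the diagonal of the similarity matrix)
    are pulled together; all other pairings in the batch are negatives."""
    img = img_emb / np.linalg.norm(img_emb, axis=-1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=-1, keepdims=True)
    logits = img @ txt.T / temperature          # (batch, batch) similarities
    diag = np.arange(logits.shape[0])
    loss_i2t = -log_softmax(logits, axis=1)[diag, diag].mean()
    loss_t2i = -log_softmax(logits, axis=0)[diag, diag].mean()
    return (loss_i2t + loss_t2i) / 2

# Hypothetical batch of 4 loosely aligned embedding pairs.
rng = np.random.default_rng(1)
img = rng.standard_normal((4, 32))
txt = img + 0.1 * rng.standard_normal((4, 32))
print(float(contrastive_loss(img, txt)))  # low loss: matched pairs dominate
```

Gradient descent on this loss pushes matched pairs together and mismatched pairs apart in the shared embedding space.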