AI Glossary
The complete dictionary of Artificial Intelligence
CLIP (Contrastive Language-Image Pre-training)
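Model from OpenAI that jointly trains an image encoder and a text encoder with a contrastive objective on 400 million image-text pairs, so that matching images and captions land close together in a shared embedding space.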
ALIGN (A Large-scale ImaGe and Noisy-text Embedding)
Alternative to CLIP trained on a noisy dataset of 1.8 billion image and alt-text pairs collected from the Internet with only minimal filtering, relying on scale to compensate for the noise.
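SimCLR (A Simple Framework for Contrastive Learning of Visual Representations)
Contrastive method that treats two strongly augmented views of the same image as a positive pair and the other images in the batch as negatives, computing the loss on the output of a small projection head.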
MoCo (Momentum Contrast)
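Technique that keeps a queue of keys encoded in previous batches to serve as negatives, and encodes them with a key encoder updated as a momentum (exponential moving) average of the query encoder, so that many negatives are available without a very large batch.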
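The two ingredients named above, the momentum update and the queue, can be sketched in a few lines. This is a minimal illustration on plain Python lists, not the reference implementation: the function names, the toy parameter values, and the queue size are assumptions made for the example, and real encoders hold tensors rather than lists of floats.

```python
from collections import deque


def momentum_update(key_params, query_params, m=0.999):
    """Move each key-encoder parameter a small step toward the query encoder.

    m close to 1 makes the key encoder change slowly between steps.
    """
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]


def enqueue_keys(queue, new_keys):
    """Add the newest keys; a deque with maxlen drops the oldest ones itself."""
    queue.extend(new_keys)
    return queue


# Hypothetical toy values, chosen only to make the arithmetic easy to follow.
query_params = [1.0, 2.0]
key_params = [0.0, 0.0]
key_params = momentum_update(key_params, query_params, m=0.9)

queue = deque(maxlen=4)  # assumed queue size for the example
enqueue_keys(queue, [[0.1], [0.2], [0.3]])
enqueue_keys(queue, [[0.4], [0.5]])  # the oldest key, [0.1], is evicted
```

Only the query encoder would receive gradients in this setup; the key encoder changes solely through `momentum_update`.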
BYOL (Bootstrap Your Own Latent)
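Approach that needs no negative samples: an online network is trained to predict a target network's representation of another augmented view of the same image, and the target network is a momentum (exponential moving) average of the online network.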
InfoNCE Loss
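Loss function central to contrastive learning, based on noise contrastive estimation: a cross-entropy that asks the model to pick the positive sample out of a set of negatives from their temperature-scaled similarity scores.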
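For a single anchor, the loss can be written as L = -log( exp(s_pos / t) / sum_j exp(s_j / t) ), where the sum runs over the positive and all negatives and t is the temperature. The sketch below computes it for precomputed similarity scores; the function name, the default temperature, and the input format are assumptions for illustration.

```python
import math


def info_nce(sim_pos, sim_negs, temperature=0.1):
    """InfoNCE loss for one anchor, given similarity scores.

    sim_pos  -- similarity between the anchor and its positive
    sim_negs -- similarities between the anchor and each negative
    """
    logits = [sim_pos / temperature] + [s / temperature for s in sim_negs]
    # Subtract the maximum before exponentiating for numerical stability.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(
        sum(math.exp(value - max_logit) for value in logits)
    )
    # Cross-entropy with the positive at index 0.
    return log_sum_exp - logits[0]
```

The loss is small when the positive scores well above every negative and grows as any negative approaches or overtakes it.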
Triplet Loss
Contrastive method using triplets (anchor, positive, negative) to learn discriminative representations: the anchor must be closer to the positive than to the negative by at least a margin.
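A common form of this loss is max(0, d(anchor, positive) - d(anchor, negative) + margin). The sketch below assumes Euclidean distance and a margin of 1.0; both are choices made for the example, and other distances and margins are used in practice.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Zero once the negative is farther than the positive by at least margin."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)
```

A triplet that already satisfies the margin contributes zero loss and therefore no gradient, which is why the choice of triplets matters in training.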
Cross-Modal Retrieval
A main application of multimodal contrastive models: retrieving data in one modality using a query from another, for example finding images from a text description.
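Once both modalities share an embedding space, retrieval reduces to ranking candidates by similarity to the query. The sketch below ranks image embeddings against a text embedding by cosine similarity. The file names and vectors are made up for illustration; in a real system they would come from trained image and text encoders.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query_embedding, candidates, k=2):
    """Return the names of the k candidates most similar to the query."""
    ranked = sorted(
        candidates.items(),
        key=lambda item: cosine(query_embedding, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Hypothetical image embeddings and a hypothetical text-query embedding.
image_embeddings = {
    "dog.jpg": [0.9, 0.1, 0.0],
    "cat.jpg": [0.1, 0.9, 0.0],
    "car.jpg": [0.0, 0.1, 0.9],
}
text_query = [0.8, 0.2, 0.1]  # stands in for an encoded "a photo of a dog"
top = retrieve(text_query, image_embeddings, k=2)
```

Swapping the roles, with an image embedding as the query and text embeddings as candidates, gives image-to-text retrieval with the same function.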
Multimodal Data Augmentation
Modality-specific augmentation techniques, coordinated across modalities, that create robust positive pairs.
Vision Transformers in Contrastive Learning
Use of Transformer architectures as image encoders in contrastive learning to obtain powerful visual representations.
Hard Negative Mining
Strategy that identifies the most difficult negative samples, those the model finds hardest to distinguish from the positive, and uses them to improve contrastive learning.
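One simple way to pick hard negatives is to rank the negatives by similarity to the anchor and keep the top few. The sketch below does that with dot products, assuming the embeddings are already unit-normalized so that the dot product equals cosine similarity; the function name and the value of `k` are illustrative.

```python
def mine_hard_negatives(anchor, negatives, k=2):
    """Return the indices of the k negatives most similar to the anchor.

    Assumes unit-normalized embeddings, so the dot product is the cosine.
    """

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    order = sorted(
        range(len(negatives)),
        key=lambda i: dot(anchor, negatives[i]),
        reverse=True,
    )
    return order[:k]
```

For example, with anchor `[1.0, 0.0]` and negatives `[[0.0, 1.0], [0.8, 0.6], [-1.0, 0.0], [0.6, 0.8]]`, the similarities are 0.0, 0.8, -1.0 and 0.6, so the two hardest negatives are at indices 1 and 3.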
Temperature Scaling
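Division of the similarity scores by a temperature parameter before the softmax in contrastive loss functions. The temperature controls how concentrated the resulting distribution is: lower values sharpen it and put more weight on the hardest negatives.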
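The effect of the temperature is easy to see by applying a softmax to the same similarity scores at two different temperatures. The scores and temperature values below are arbitrary choices for illustration.

```python
import math


def softmax_with_temperature(similarities, temperature):
    """Softmax over similarity scores divided by the temperature."""
    scaled = [s / temperature for s in similarities]
    # Subtract the maximum before exponentiating for numerical stability.
    highest = max(scaled)
    exps = [math.exp(s - highest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


sims = [0.9, 0.5, 0.1]
sharp = softmax_with_temperature(sims, 0.05)  # low temperature: peaked
flat = softmax_with_temperature(sims, 1.0)    # high temperature: spread out
```

At the low temperature nearly all of the probability mass goes to the highest score, while at temperature 1.0 the three scores receive much more even weights.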
Multimodal Fusion Strategies
Different approaches for combining information from multiple modalities, applied either before or after the contrastive objective.
Self-Supervised Pre-training
Use of contrastive learning to pre-train models on unlabeled data, without supervised annotations.
Contrastive Learning for Audio-Text
Extension of contrastive methods to audio-text pairs for applications such as transcription and audio search.