Multimodal Contrastive Learning

Subcategories

CLIP (Contrastive Language-Image Pre-training)

Dual-encoder architecture that learns a shared embedding space for images and text through contrastive training on 400 million image-text pairs.

6 terms

ALIGN (A Large-scale ImaGe and Noisy-text Embedding)

Dual-encoder approach similar to CLIP, trained on a noisy dataset of 1.8 billion image and alt-text pairs collected from the web with only minimal filtering.

12 terms

SimCLR (A Simple Framework for Contrastive Learning of Visual Representations)

Contrastive learning method that builds positive pairs from two strongly augmented views of the same image and applies the loss to the output of a small projection head.

8 terms

MoCo (Momentum Contrast)

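Technique that keeps a queue of encoded keys as negative samples and updates the key encoder as a momentum (moving-average) copy of the query encoder, so the pool of negatives can be much larger than a single batch.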

11 terms
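
The queue and momentum update described for MoCo can be illustrated with a minimal sketch in plain Python. The names `momentum_update` and `NegativeQueue`, the toy weights, and the coefficient `m=0.9` are assumptions made for illustration; real implementations operate on network tensors rather than lists.

```python
from collections import deque

def momentum_update(query_params, key_params, m=0.999):
    """Move each key-encoder weight a small step toward the query encoder."""
    return [m * k + (1.0 - m) * q for q, k in zip(query_params, key_params)]

class NegativeQueue:
    """Fixed-size FIFO of encoded keys; the oldest keys are dropped first."""

    def __init__(self, max_size):
        self.keys = deque(maxlen=max_size)

    def enqueue(self, batch_keys):
        # Appending past max_size silently evicts the oldest entries.
        self.keys.extend(batch_keys)

    def negatives(self):
        return list(self.keys)

# Toy weights (hypothetical): the key encoder lags behind the query encoder.
query_params = [1.0, 2.0]
key_params = [0.0, 0.0]
key_params = momentum_update(query_params, key_params, m=0.9)

# Two batches of encoded keys; the queue holds only the 4 most recent.
queue = NegativeQueue(max_size=4)
queue.enqueue([[0.1, 0.9], [0.8, 0.2], [0.5, 0.5]])
queue.enqueue([[0.3, 0.7], [0.6, 0.4]])
```

Because the queue is decoupled from the batch, the number of negatives is set by `max_size` rather than by how many samples fit in one training step.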

BYOL (Bootstrap Your Own Latent)

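Approach that removes the need for negative samples: an online network is trained to predict the output of a target network whose weights are a momentum (exponential moving average) copy of the online weights.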

7 terms
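
As a rough sketch of how BYOL can train without negatives, the loss below compares the normalized online prediction with the normalized target projection, which equals 2 minus twice their cosine similarity. The function names and example vectors are assumptions for illustration; the target vector is treated as a constant, and the moving-average update of the target weights is not shown.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def byol_loss(online_prediction, target_projection):
    """Squared distance between unit vectors: 2 - 2 * cosine similarity."""
    p = normalize(online_prediction)
    z = normalize(target_projection)
    return 2.0 - 2.0 * sum(a * b for a, b in zip(p, z))

aligned = byol_loss([1.0, 0.0], [2.0, 0.0])      # same direction
orthogonal = byol_loss([1.0, 0.0], [0.0, 1.0])   # unrelated directions
opposite = byol_loss([1.0, 0.0], [-1.0, 0.0])    # opposite directions
```

The loss involves only a positive pair (two views of the same input), so no negative samples enter the computation.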

InfoNCE Loss

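Loss function at the core of many contrastive methods, derived from noise-contrastive estimation: the model must pick the positive sample out of a set of negatives, and the loss is the cross-entropy of that choice.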

4 terms
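
A minimal sketch of InfoNCE for a single query, assuming dot-product similarity and a temperature parameter; the function name `info_nce` and the example vectors are illustrative, not taken from any particular library.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def info_nce(query, positive, negatives, temperature=0.1):
    """Cross-entropy of picking the positive among positive + negatives."""
    logits = [dot(query, positive) / temperature]
    logits += [dot(query, n) / temperature for n in negatives]
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
    return log_sum_exp - logits[0]

query = [1.0, 0.0]
positive = [1.0, 0.0]
negatives = [[0.0, 1.0], [-1.0, 0.0]]

good = info_nce(query, positive, negatives, temperature=1.0)
# Swapping roles: the least similar vector is treated as the "positive".
bad = info_nce(query, negatives[1], [positive, negatives[0]], temperature=1.0)
# When every candidate is equally similar, the loss equals log(number of candidates).
uninformative = info_nce(query, positive, [positive, positive], temperature=1.0)
```

The loss is low when the query is more similar to its positive than to any negative, and it rises as negatives become harder to distinguish from the positive.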

Triplet Loss

Loss defined on triplets (anchor, positive, negative) that pulls the anchor closer to the positive than to the negative by at least a margin, yielding discriminative representations.

7 terms
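
The triplet objective can be sketched in a few lines, assuming Euclidean distance and a margin of 1.0; both choices and the function names are assumptions for illustration.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Zero once the negative is at least `margin` farther away than the positive."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

anchor = [0.0, 0.0]
near = [0.0, 1.0]   # distance 1 from the anchor
far = [3.0, 4.0]    # distance 5 from the anchor

satisfied = triplet_loss(anchor, positive=near, negative=far)
violated = triplet_loss(anchor, positive=far, negative=near)
```

A triplet that already satisfies the margin contributes no loss, which is one reason the choice of negatives matters in practice.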

Cross-Modal Retrieval

Key application: retrieving items in one modality with a query from another, for example finding images that match a text description.

8 terms
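
Once both modalities live in a shared embedding space, retrieval reduces to a nearest-neighbor search. The sketch below ranks a toy image gallery by cosine similarity to a text query; the embeddings, file names, and the `retrieve` helper are all hypothetical stand-ins for the outputs of trained encoders.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_embedding, gallery, top_k=2):
    """Return the ids of the top_k gallery items most similar to the query."""
    ranked = sorted(
        gallery,
        key=lambda item_id: cosine(query_embedding, gallery[item_id]),
        reverse=True,
    )
    return ranked[:top_k]

# Hypothetical image embeddings, as if produced by an image encoder.
image_gallery = {
    "dog.jpg": [0.9, 0.1, 0.0],
    "cat.jpg": [0.1, 0.9, 0.0],
    "car.jpg": [0.0, 0.1, 0.9],
}
# Hypothetical text embedding, as if produced by a text encoder.
text_query = [0.8, 0.3, 0.0]

results = retrieve(text_query, image_gallery)
```

The same function works in the other direction (image query, text gallery) because both encoders map into the same space.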

Multimodal Data Augmentation

Augmentation techniques applied in a coordinated way across modalities, so that the augmented samples still form valid and robust positive pairs.

20 terms

Vision Transformers in Contrastive Learning

Use of Vision Transformer (ViT) encoders as the image backbone in contrastive learning to obtain strong visual representations.

9 terms

Hard Negative Mining

Strategy that selects the negatives the model finds hardest to tell apart from the positive, that is, the negatives most similar to the anchor, so that training focuses on informative examples.

6 terms
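
One simple way to mine hard negatives, sketched here under the assumption that similarity is a dot product, is to keep the `k` negatives that score highest against the anchor. The helper name `hardest_negatives` and the vectors are illustrative.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def hardest_negatives(anchor, negatives, k=2):
    """Return the k negatives with the highest similarity to the anchor."""
    return sorted(negatives, key=lambda n: dot(anchor, n), reverse=True)[:k]

anchor = [1.0, 0.0]
negatives = [[0.9, 0.1], [-1.0, 0.0], [0.0, 1.0], [0.7, 0.3]]

hard = hardest_negatives(anchor, negatives, k=2)
```

Easy negatives such as `[-1.0, 0.0]` are already far from the anchor and contribute little gradient, so they are the first to be discarded.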

Temperature Scaling

Temperature parameter that divides the similarity scores before the softmax in a contrastive loss; lower values make the resulting distribution more sharply peaked.

4 terms
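
The effect of the temperature can be seen by applying a softmax to the same similarity scores at two settings. This is a small sketch; the scores and the two temperature values are arbitrary choices for illustration.

```python
import math

def softmax_with_temperature(similarities, temperature):
    """Divide scores by the temperature, then apply a numerically stable softmax."""
    scaled = [s / temperature for s in similarities]
    max_scaled = max(scaled)
    exps = [math.exp(s - max_scaled) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

similarities = [0.9, 0.5, 0.1]

sharp = softmax_with_temperature(similarities, temperature=0.1)
flat = softmax_with_temperature(similarities, temperature=1.0)
```

At the lower temperature almost all probability mass goes to the highest-scoring candidate, while at the higher temperature the mass is spread more evenly across candidates.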

Multimodal Fusion Strategies

Approaches for combining information from multiple modalities, either before or after the contrastive objective is applied.

12 terms

Self-Supervised Pre-training

Use of contrastive learning to pre-train models on unlabeled data, without supervised annotations.

1 term

Contrastive Learning for Audio-Text

Extension of contrastive methods to audio-text pairs, for applications such as transcription and audio search.

15 terms
