AI Glossary

The complete Artificial Intelligence dictionary

162 categories · 2,032 subcategories · 23,060 terms

Teacher Model

Large and complex pre-trained neural model that serves as a knowledge source to train a more compact model through the distillation process.


Student Model

Smaller neural model that learns to imitate the behavior of the teacher model, benefiting from its generalizations while being more computationally efficient.


Soft Targets

Output probabilities from the teacher model before applying the argmax function, containing information about inter-class relationships that hard labels don't capture.


Temperature Scaling

Technique of dividing the logits by a temperature parameter to soften the probability distribution and reveal inter-class relationships during distillation.
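A minimal sketch of temperature scaling in plain Python (function and variable names are illustrative, not from any specific library):

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax over logits divided by a temperature parameter.

    temperature = 1 is the standard softmax; temperature > 1 softens
    the distribution, exposing the relative similarity between classes.
    """
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [6.0, 2.0, 1.0]          # raw teacher outputs for 3 classes
hard = softmax(logits, temperature=1.0)
soft = softmax(logits, temperature=4.0)
```

At temperature 1 the top class absorbs almost all the probability mass; at temperature 4 the distribution flattens, so the runner-up classes become visible to the student.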


Hard Targets

Traditional ground truth labels (one-hot encoded) used together with soft targets to maintain prediction accuracy during distillation.


Dark Knowledge

Subtle information contained in the teacher model's output probabilities that reveals similarities between classes and is not present in hard labels.


Distillation Loss

Combined loss function that measures both the divergence between soft predictions of the student and teacher, and accuracy with respect to hard labels.
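A compact sketch of the combined loss, following the common formulation (KL divergence on temperature-softened outputs plus cross-entropy on the hard label, mixed by a weight alpha; the T² factor follows the gradient-scale argument in Hinton et al.). The function names and default values are illustrative assumptions:

```python
import math

def softmax(logits, T=1.0):
    # temperature-scaled softmax (max-subtracted for stability)
    m = max(z / T for z in logits)
    exps = [math.exp(z / T - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(student_logits, teacher_logits, true_class,
                      T=4.0, alpha=0.5):
    # Soft component: KL divergence between the teacher's and the
    # student's temperature-softened distributions, scaled by T^2.
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    soft = sum(pt * math.log(pt / ps) for pt, ps in zip(p_t, p_s)) * T * T
    # Hard component: cross-entropy against the ground-truth label.
    hard = -math.log(softmax(student_logits)[true_class])
    return alpha * soft + (1 - alpha) * hard
```

When the student's logits match the teacher's, the soft component vanishes and only the hard-label term remains.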


Feature Distillation

Variant of distillation where the student learns to reproduce the teacher's intermediate representations (features) rather than just the final predictions.
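A minimal sketch of the idea, assuming the two feature vectors have already been aligned to the same length (real implementations typically insert a learned projection layer when the student's layer width differs from the teacher's):

```python
def feature_distillation_loss(student_features, teacher_features):
    """Mean squared error between intermediate representations,
    pushing the student's hidden features toward the teacher's."""
    assert len(student_features) == len(teacher_features)
    return sum((s - t) ** 2
               for s, t in zip(student_features, teacher_features)) / len(student_features)
```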


Relational Knowledge Distillation

Approach where the student learns the structural relationships between training samples preserved by the teacher, beyond individual predictions.
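A sketch of the distance-wise variant of this idea: the student is penalized when the pairwise distance structure among its sample embeddings differs from the teacher's. The RKD paper uses a Huber loss; plain MSE is used here for brevity, and all names are illustrative:

```python
import math

def pairwise_distances(embeddings):
    # Euclidean distance between every pair of sample embeddings.
    n = len(embeddings)
    return [[math.dist(embeddings[i], embeddings[j]) for j in range(n)]
            for i in range(n)]

def relational_distance_loss(student_embs, teacher_embs):
    """Match the *structure* of distances between samples rather than
    the embeddings themselves. Distances are normalized by their mean
    so the absolute scales of the two embedding spaces do not matter."""
    def normalized(embs):
        d = pairwise_distances(embs)
        flat = [x for row in d for x in row if x > 0]
        mean = sum(flat) / len(flat)
        return [[x / mean for x in row] for row in d]
    ds, dt = normalized(student_embs), normalized(teacher_embs)
    n = len(ds)
    return sum((ds[i][j] - dt[i][j]) ** 2
               for i in range(n) for j in range(n)) / (n * n)
```

A student whose embeddings are a uniformly scaled copy of the teacher's incurs zero loss, since only the relational structure is compared.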


Self-Knowledge Distillation

Technique where a model self-distills by using its own knowledge at different training stages or different branches to improve its performance.


Multi-Teacher Distillation

Strategy using multiple teacher models to transfer diversified knowledge to a single student, combining their respective expertise.


Online Distillation

Method where teacher and student models are trained simultaneously, allowing dynamic and adaptive knowledge transfer during the learning process.


Zero-Shot Knowledge Distillation

Approach that allows distilling knowledge from a teacher without requiring training data, using only the pre-trained model weights.


Attention-Based Distillation

Specific technique where the student learns to reproduce the teacher's attention maps, thus transferring knowledge about the important parts of the input data.


Structural Knowledge Distillation

Method that preserves the teacher's structure and architecture in the student, maintaining the relationships between layers and original information flows.


Progressive Knowledge Distillation

Multi-step strategy where an intermediate model serves as a teacher for the final student, allowing a smooth transition of knowledge.


Knowledge Purification

Process of filtering noisy or incorrect knowledge from the teacher before distillation, ensuring a higher quality knowledge transfer to the student.


Heterogeneous Knowledge Distillation

Approach where teacher and student have different architectures (CNN to Transformer, for example), requiring specific adaptation techniques for knowledge transfer.
