
Słownik AI

A complete artificial intelligence glossary

162 categories · 2,032 subcategories · 23,060 terms

Teacher Model

Large and complex pre-trained neural model that serves as a knowledge source to train a more compact model through the distillation process.


Student Model

Smaller neural model that learns to imitate the behavior of the teacher model, benefiting from its generalizations while being more computationally efficient.


Soft Targets

Output probabilities from the teacher model before applying the argmax function, containing information about inter-class relationships that hard labels don't capture.
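A small illustration (the class names and logit values are made up) of how a soft target retains class-similarity information that the one-hot hard label discards:

```python
import numpy as np

# Hypothetical teacher logits for one image, over classes [dog, wolf, car].
logits = np.array([6.0, 4.0, -2.0])

# Soft target: the full softmax distribution over classes.
soft = np.exp(logits - logits.max())
soft /= soft.sum()

# Hard target: one-hot encoding of the argmax class.
hard = np.eye(len(logits))[np.argmax(logits)]

# soft assigns "wolf" noticeably more mass than "car", reflecting that a
# dog looks more like a wolf than a car; hard = [1, 0, 0] loses that signal.
```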


Temperature Scaling

Technique of dividing logits by a temperature parameter to soften the probability distribution and reveal inter-class relationships during distillation.
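A minimal sketch of temperature scaling; the logit values and the temperature are illustrative, not recommendations:

```python
import numpy as np

def softmax_with_temperature(logits, T=1.0):
    # Dividing logits by T > 1 flattens the distribution, exposing the
    # relative similarities between the non-argmax classes.
    z = np.asarray(logits, dtype=float) / T
    z = z - z.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

logits = [8.0, 2.0, 1.0]
sharp = softmax_with_temperature(logits, T=1.0)  # nearly one-hot
soft = softmax_with_temperature(logits, T=4.0)   # softened for distillation
```

At T=1 almost all mass sits on the top class; at T=4 the secondary classes become visible, which is exactly what the student is trained to match.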


Hard Targets

Traditional ground truth labels (one-hot encoded) used together with soft targets to maintain prediction accuracy during distillation.


Dark Knowledge

Subtle information contained in the teacher model's output probabilities that reveals similarities between classes and is not present in hard labels.


Distillation Loss

Combined loss function that measures both the divergence between soft predictions of the student and teacher, and accuracy with respect to hard labels.
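A NumPy sketch of this combined loss, assuming the standard formulation (a KL-divergence term at temperature T weighted against cross-entropy on the hard label); `T` and `alpha` are illustrative hyperparameters:

```python
import numpy as np

def softmax(z, T=1.0):
    z = np.asarray(z, dtype=float) / T
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, label, T=4.0, alpha=0.7):
    # Soft term: KL divergence from the teacher's tempered distribution to
    # the student's, scaled by T^2 to keep gradient magnitudes comparable.
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    soft_term = np.sum(p_t * (np.log(p_t) - np.log(p_s)))
    # Hard term: cross-entropy of the student's T=1 prediction vs. the label.
    hard_term = -np.log(softmax(student_logits)[label])
    return alpha * T**2 * soft_term + (1 - alpha) * hard_term
```

When the student's logits match the teacher's exactly, the KL term vanishes and only the weighted hard-label term remains.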


Feature Distillation

Variant of distillation where the student learns to reproduce the teacher's intermediate representations (features) rather than just the final predictions.
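A toy sketch of the idea, assuming a learned linear projection aligns the student's narrower features with the teacher's; the random arrays here merely stand in for real activations and a trained projector:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical intermediate activations for one input:
teacher_feat = rng.normal(size=512)   # teacher's wider hidden layer
student_feat = rng.normal(size=128)   # student's narrower hidden layer

# A projection matrix (learned in practice, random here) maps the student's
# features into the teacher's feature space before comparing them.
W = rng.normal(size=(512, 128)) * 0.05
projected = W @ student_feat

# Feature-distillation loss: mean squared error between the projected
# student features and the teacher's features.
feat_loss = np.mean((projected - teacher_feat) ** 2)
```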


Relational Knowledge Distillation

Approach where the student learns the structural relationships between training samples preserved by the teacher, beyond individual predictions.


Self-Knowledge Distillation

Technique where a model self-distills by using its own knowledge at different training stages or different branches to improve its performance.


Multi-Teacher Distillation

Strategy using multiple teacher models to transfer diversified knowledge to a single student, combining their respective expertise.
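One simple way to combine teachers is to average their probability distributions into a single soft target; the logits below are illustrative stand-ins for three real teachers:

```python
import numpy as np

def softmax(z):
    z = np.asarray(z, dtype=float)
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# Logits from three hypothetical teachers for the same input.
teacher_logits = [np.array([3.0, 1.0, 0.0]),
                  np.array([2.5, 1.5, 0.2]),
                  np.array([3.2, 0.8, -0.5])]

# Simplest combination: average the teachers' distributions and use the
# result as the student's soft target (weighted averages are also common).
ensemble_target = np.mean([softmax(l) for l in teacher_logits], axis=0)
```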


Online Distillation

Method where teacher and student models are trained simultaneously, allowing dynamic and adaptive knowledge transfer during the learning process.


Zero-Shot Knowledge Distillation

Approach that allows distilling knowledge from a teacher without requiring training data, using only the pre-trained model weights.


Attention-Based Distillation

Specific technique where the student learns to reproduce the teacher's attention maps, thus transferring knowledge about the important parts of the input data.
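A sketch of attention transfer, assuming spatial attention maps computed as the channel-wise sum of squared activations (one common recipe); the feature maps are random stand-ins for real network activations:

```python
import numpy as np

def attention_map(feature_map):
    # Collapse the channel dimension: sum of squared activations per spatial
    # location, then L2-normalize so maps of different scales are comparable.
    a = (feature_map ** 2).sum(axis=0).ravel()
    return a / np.linalg.norm(a)

rng = np.random.default_rng(1)
teacher_fm = rng.normal(size=(64, 8, 8))  # channels x height x width
student_fm = rng.normal(size=(16, 8, 8))  # fewer channels, same spatial grid

# Attention-transfer loss: distance between the normalized maps. The student
# is trained to make this small, mimicking where the teacher "looks".
at_loss = np.linalg.norm(attention_map(student_fm) - attention_map(teacher_fm))
```

Note the channel counts differ but the spatial grids match, so the maps are directly comparable after normalization.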


Structural Knowledge Distillation

Method that preserves the teacher's structure and architecture in the student, maintaining the relationships between layers and original information flows.


Progressive Knowledge Distillation

Multi-step strategy where an intermediate model serves as a teacher for the final student, allowing a smooth transition of knowledge.


Knowledge Purification

Process of filtering noisy or incorrect knowledge from the teacher before distillation, ensuring a higher quality knowledge transfer to the student.


Heterogeneous Knowledge Distillation

Approach where teacher and student have different architectures (CNN to Transformer, for example), requiring specific adaptation techniques for knowledge transfer.
