Knowledge Distillation
Softmax temperature
Scaling parameter that divides the logits before the softmax function is applied. Values above 1 soften the resulting probability distribution, revealing the relationships between classes that the teacher model has learned.
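A minimal sketch of the idea in plain Python, assuming the common formulation in which each logit is divided by the temperature T before the softmax. The function name `softmax_with_temperature` and the example logits are hypothetical, chosen only for illustration.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Return softmax(logits / temperature) as a list of probabilities."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Divide every logit by the temperature before normalizing.
    scaled = [z / temperature for z in logits]
    # Subtract the maximum for numerical stability; the result is unchanged.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical teacher logits for three classes (illustrative values only).
teacher_logits = [5.0, 2.0, 0.5]

sharp = softmax_with_temperature(teacher_logits, temperature=1.0)
soft = softmax_with_temperature(teacher_logits, temperature=4.0)

print("T=1:", [round(p, 3) for p in sharp])
print("T=4:", [round(p, 3) for p in soft])
```

With T = 1 nearly all of the probability mass sits on the top class. With T = 4 the class ranking is unchanged, but the lower-ranked classes receive visibly more mass, which is the softened signal a student model can learn from.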