Contrastive interpretability
Prediction Inversion Explanation
A process that flips a model's prediction (e.g., from 'rejected' to 'accepted') and analyzes the feature changes required to do so. The resulting changes explain what the original instance lacks to receive the other outcome.
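
As a rough illustration only, the idea can be sketched with a toy model. Everything in this sketch is an assumption made for the example and does not come from the definition: the linear scoring model, its weights and bias, the feature names, the non-negativity constraint, and the greedy search strategy. Practical methods typically use more careful optimization and feasibility constraints.

```python
from typing import Dict, Optional

# Hypothetical linear credit model (illustrative weights, not from any real system).
# A score >= 0 means 'accepted'; otherwise 'rejected'.
WEIGHTS: Dict[str, float] = {"income": 0.5, "debt": -0.8, "years_employed": 0.3}
BIAS = -4.0


def predict(x: Dict[str, float]) -> str:
    """Return the model's decision for one instance."""
    score = BIAS + sum(WEIGHTS[name] * x[name] for name in WEIGHTS)
    return "accepted" if score >= 0 else "rejected"


def invert_prediction(
    x: Dict[str, float],
    target: str = "accepted",
    step: float = 0.5,
    max_steps: int = 200,
) -> Optional[Dict[str, float]]:
    """Greedily nudge features until the prediction equals `target`.

    Returns the per-feature changes (new value minus original value),
    or None if no inversion is found within `max_steps`.
    Assumption: feature values must stay non-negative.
    """
    direction = 1.0 if target == "accepted" else -1.0
    current = dict(x)

    def changes() -> Dict[str, float]:
        return {k: current[k] - x[k] for k in x if current[k] != x[k]}

    for _ in range(max_steps):
        if predict(current) == target:
            return changes()
        best_name, best_value, best_gain = None, 0.0, 0.0
        for name, weight in WEIGHTS.items():
            if weight == 0:
                continue
            # Move the feature in the direction that pushes the score toward the target.
            delta = step if direction * weight > 0 else -step
            new_value = current[name] + delta
            if new_value < 0:
                continue  # infeasible under the non-negativity assumption
            gain = abs(weight) * step
            if gain > best_gain:
                best_name, best_value, best_gain = name, new_value, gain
        if best_name is None:
            return None  # no feasible move remains
        current[best_name] = best_value
    return changes() if predict(current) == target else None


applicant = {"income": 4.0, "debt": 3.0, "years_employed": 2.0}
explanation = invert_prediction(applicant)
```

Here `explanation` maps each changed feature to the amount it had to move for the decision to flip, which is one way to read off "what the original instance lacks". The greedy rule favors the feature with the largest effect per step, so the result is one possible inversion, not necessarily the smallest or the most realistic one.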