AI Glossary
The Complete Artificial Intelligence Dictionary
Counterfactual Analysis
Interpretability method that identifies the minimal changes to an instance's features needed to change the model's prediction to a desired output, creating hypothetical 'what if' scenarios.
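A minimal sketch of such a search, assuming a toy two-feature credit model and a greedy single-feature step (the model, feature names, and step size are all illustrative assumptions, not any standard implementation):

```python
def score(x):
    """Toy credit model score over two illustrative features."""
    return 0.6 * x["income"] + 0.4 * x["savings"]

def model(x):
    """Approve (1) when the score exceeds 0.5, else reject (0)."""
    return 1 if score(x) > 0.5 else 0

def find_counterfactual(instance, step=0.05, max_iters=200):
    """Greedy counterfactual search: at each step, apply the single-feature
    change that most increases the score, until the prediction flips."""
    cf = dict(instance)
    for _ in range(max_iters):
        if model(cf) == 1:
            return cf  # the 'what if' instance that flips the decision
        candidates = [{**cf, f: cf[f] + step} for f in cf]
        cf = max(candidates, key=score)
    return None  # no counterfactual found within the budget

original = {"income": 0.4, "savings": 0.3}   # rejected by the model
cf = find_counterfactual(original)
# cf differs from `original` only in the features that had to change
```

Real counterfactual methods additionally penalize the distance to the original instance so the returned change is as small as possible; this greedy loop only illustrates the core idea.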
Explanation by Opposition
Technique that explains a prediction by contrasting it with another plausible prediction, highlighting the features that justify why the model chose one option over another.
Anchor Rules
Local explanations in the form of conditional rules that 'anchor' a model's prediction around a specific instance, ensuring the prediction remains unchanged within a neighborhood defined by these rules.
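The defining property of an anchor can be checked empirically: perturb every feature the rule does not fix and measure how often the prediction survives. A sketch under a toy model and hypothetical feature ranges (all names and ranges are assumptions for illustration):

```python
import random

def model(x):
    """Toy model: predict 1 when both conditions hold."""
    return 1 if x["age"] >= 30 and x["income"] >= 50 else 0

def anchor_precision(rule, instance, n_samples=1000, seed=0):
    """Estimate how often the prediction stays unchanged when the
    features NOT fixed by the rule are resampled at random."""
    rng = random.Random(seed)
    base = model(instance)
    hits = 0
    for _ in range(n_samples):
        sample = {
            "age": instance["age"] if "age" in rule else rng.randint(18, 70),
            "income": instance["income"] if "income" in rule else rng.randint(10, 120),
        }
        hits += model(sample) == base
    return hits / n_samples

x = {"age": 35, "income": 80}
# Fixing both features pins the prediction; fixing only 'age' does not.
full_rule_precision = anchor_precision({"age", "income"}, x)
partial_rule_precision = anchor_precision({"age"}, x)
```

Anchor-finding algorithms search for the shortest rule whose estimated precision exceeds a chosen threshold; the sketch above only evaluates a candidate rule.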
Pairwise Contrastive Explanation
Approach that generates explanations by directly comparing two distinct instances, often a correct and an incorrect prediction, to isolate the decisive factors in the model's decision.
Multiple Counterfactual Scenarios
Generation of a set of varied counterfactual explanations for a single prediction, offering a richer view of the alternative paths by which a different outcome could be reached.
Actionable Counterfactual Space
Set of possible modifications to an instance's features that are both feasible in the real world and relevant to the user, constraining the generation of counterfactual explanations to plausible scenarios.
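One simple way to enforce such a space is to filter candidate counterfactuals through explicit feasibility constraints. A sketch with two illustrative constraints (the constraints, features, and 20% cap are assumptions, not a standard rule set):

```python
def feasible(original, candidate):
    """Actionability constraints (illustrative): age cannot decrease,
    and income can rise by at most 20%."""
    return (candidate["age"] >= original["age"]
            and candidate["income"] <= original["income"] * 1.2)

def actionable_counterfactuals(original, candidates):
    """Keep only the candidate counterfactuals inside the actionable space."""
    return [c for c in candidates if feasible(original, c)]

x = {"age": 40, "income": 100}
cands = [{"age": 35, "income": 110},   # infeasible: age decreased
         {"age": 40, "income": 115},   # feasible
         {"age": 41, "income": 140}]   # infeasible: income jump too large
actionable = actionable_counterfactuals(x, cands)
```

In practice such constraints are often built into the counterfactual search itself rather than applied as a post-filter, but the filtering view makes the concept concrete.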
Diagnosis by Difference
Method that analyzes discrepancies in a model's behavior between two contexts or datasets (e.g., before and after drift) to understand changes in its decision logic.
Prototypical Case Explanation
Technique that explains a prediction by contrasting it with the most representative case (prototype) of the predicted class, highlighting features that bring the instance closer to or further from this prototype.
Comparative Sensitivity Analysis
Evaluation that measures how a model's prediction for a given instance reacts to variations in its features, comparing this reaction to that of other instances or to a reference case.
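A finite-difference sketch of this comparison, assuming a toy nonlinear model (the model and feature names are illustrative):

```python
def score(x):
    """Toy model score: nonlinear in 'dose', linear in 'weight'."""
    return x["dose"] ** 2 + 0.1 * x["weight"]

def sensitivity(x, feature, eps=1e-4):
    """Finite-difference sensitivity of the score to one feature."""
    perturbed = {**x, feature: x[feature] + eps}
    return (score(perturbed) - score(x)) / eps

a = {"dose": 0.2, "weight": 1.0}
b = {"dose": 2.0, "weight": 1.0}
# Instance b sits on a steeper part of the curve, so its prediction
# reacts far more strongly to the same 'dose' perturbation than a's does.
```

Comparing `sensitivity(a, "dose")` with `sensitivity(b, "dose")` reveals that the model's local behavior differs between the two instances even though the perturbation is identical.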
Model Substitution Explanation
Approach that interprets a complex (black-box) model's decision by comparing it to that of a simpler, interpretable model (e.g., a decision tree) trained on the same local neighborhood.
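A minimal sketch of the local-surrogate idea, assuming a one-dimensional black box and an ordinary least-squares line as the interpretable substitute (everything here is an illustrative stand-in, not the LIME implementation):

```python
import math
import random

def black_box(x):
    """Stand-in for an opaque model (here, a sigmoid over one feature)."""
    return 1.0 / (1.0 + math.exp(-(3 * x - 1)))

def local_surrogate_slope(x0, radius=0.1, n=500, seed=0):
    """Fit y = a*x + b by least squares on points sampled around x0;
    the slope `a` summarizes the black box's local behavior."""
    rng = random.Random(seed)
    xs = [x0 + rng.uniform(-radius, radius) for _ in range(n)]
    ys = [black_box(x) for x in xs]
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var

# Near x0 = 1/3 the sigmoid is steepest, and the fitted slope
# approximates its local derivative there.
slope = local_surrogate_slope(1 / 3)
```

The surrogate is only trusted inside the sampled neighborhood; its global behavior can differ arbitrarily from the black box's.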
Norm Deviation Map
Visualization that highlights the features of an instance that deviate significantly from the 'normal' or expected data distribution, thereby explaining why its prediction is atypical.
Contrastive Case Reasoning
Explanation methodology inspired by legal reasoning, where a decision is justified by comparing it to similar past cases with different outcomes, in order to isolate the decisive factor.
Exclusion Criterion Explanation
Technique that explains why a model did not choose a certain class by identifying the features that actively led to the exclusion of that option in the decision process.
Contrastive Decision Boundary Analysis
Examination of the features that define the boundary between two classes predicted by the model, focusing on instances near this boundary to understand the tipping factors.
Prediction Inversion Explanation
Process that involves reversing a model's prediction (e.g., from 'rejected' to 'accepted') and analyzing the necessary feature changes, providing an explanation of what the original instance lacks.
Attribution Profile Comparison
Method that compares feature importance profiles (generated by SHAP, LIME, etc.) between two instances or groups of instances to reveal subtle differences in the model's logic.
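A minimal sketch of such a comparison, assuming per-feature attribution values (e.g., SHAP values) are already available as plain dictionaries; the feature names and numbers are illustrative:

```python
def compare_attributions(attr_a, attr_b):
    """Rank features by the gap between two attribution profiles,
    largest absolute difference first."""
    feats = set(attr_a) | set(attr_b)
    diffs = {f: attr_a.get(f, 0.0) - attr_b.get(f, 0.0) for f in feats}
    return sorted(diffs.items(), key=lambda kv: abs(kv[1]), reverse=True)

# Hypothetical attribution profiles for two loan applicants.
approved = {"income": 0.8, "debt": -0.1, "age": 0.05}
rejected = {"income": 0.2, "debt": -0.6, "age": 0.07}
ranked = compare_attributions(approved, rejected)
# 'income' and 'debt' drive the diverging decisions; 'age' barely differs.
```

The same comparison applied to group-averaged profiles can surface systematic differences in the model's logic between subpopulations.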
Adverse Scenario Explanation
Generation of a 'worst-case' scenario that, although close to the original instance, leads to a very unfavorable prediction, helping to understand the robustness and weaknesses of the model's decision.