Interpretability Evaluation Metrics

Fidelity

The extent to which an explanation faithfully reflects the model's actual behavior. It is commonly evaluated by checking whether the explanation's predictions match those of the model on perturbed data.
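As a rough illustration of the perturbation check described in this entry, the sketch below estimates local fidelity as the agreement rate between a model and a simpler surrogate on noisy copies of one input. The functions `black_box`, `surrogate`, and `fidelity`, along with the Gaussian noise scale, are hypothetical choices made for this example and are not part of any particular library.

```python
import random


def black_box(x):
    # Hypothetical model: predicts 1 when a weighted sum crosses a threshold.
    return 1 if 2.0 * x[0] - 1.0 * x[1] > 0.5 else 0


def surrogate(x):
    # Hypothetical explanation: a simpler rule that only looks at feature 0.
    return 1 if x[0] > 0.5 else 0


def fidelity(model, explanation, x, n_samples=500, scale=0.3, seed=0):
    """Fraction of perturbed points near x where explanation and model agree."""
    rng = random.Random(seed)
    agree = 0
    for _ in range(n_samples):
        # Perturb every feature with Gaussian noise (an assumed scheme).
        z = [xi + rng.gauss(0.0, scale) for xi in x]
        agree += int(model(z) == explanation(z))
    return agree / n_samples


score = fidelity(black_box, surrogate, x=[0.6, 0.2])
print(f"local fidelity: {score:.2f}")
```

A score of 1.0 would mean the surrogate reproduces the model on every sampled point. The result depends on the perturbation scheme, so scores are only comparable under the same sampling setup.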

Comprehensibility

A subjective or objective measure of how easily a human can understand an explanation, often related to the complexity of the explanation model (e.g., number of rules, depth of a tree).

Sufficiency

The ability of a subset of features, identified by an explanation, to maintain the model's original prediction, indicating that these features are sufficient to justify the decision.
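One way to check sufficiency, sketched under stated assumptions: keep only the features the explanation selected, replace the rest with a baseline value, and see whether the prediction survives. The toy `model`, the all-zero baseline, and the helper names are hypothetical and chosen only for illustration.

```python
def model(x):
    # Hypothetical linear scorer turned into a binary classifier.
    weights = [3.0, 0.1, -0.2, 2.0]
    return 1 if sum(w * xi for w, xi in zip(weights, x)) > 2.0 else 0


def keep_only(x, kept, baseline):
    # Replace every feature outside `kept` with its baseline value.
    return [xi if i in kept else baseline[i] for i, xi in enumerate(x)]


def is_sufficient(model, x, kept, baseline):
    """True if the kept features alone reproduce the original prediction."""
    return model(keep_only(x, kept, baseline)) == model(x)


x = [1.0, 1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0, 0.0]  # assumed "feature absent" value

print(is_sufficient(model, x, kept={0, 3}, baseline=baseline))  # True
print(is_sufficient(model, x, kept={1, 2}, baseline=baseline))  # False
```

The choice of baseline matters: zeros, feature means, or sampled values can each give a different verdict for the same feature subset.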

Necessity

Evaluates whether removing a feature (or set of features) that the explanation identifies as important significantly changes the model's prediction.
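A minimal sketch of a necessity check, assuming a model that outputs a probability: ablate the flagged features by setting them to a baseline and measure how far the probability falls. The logistic `predict_proba`, the zero baseline, and `necessity_drop` are hypothetical names used only for this example.

```python
import math


def predict_proba(x):
    # Hypothetical logistic model returning P(class = 1).
    weights = [3.0, 0.1, -0.2, 2.0]
    z = sum(w * xi for w, xi in zip(weights, x)) - 2.0
    return 1.0 / (1.0 + math.exp(-z))


def necessity_drop(predict, x, removed, baseline):
    """Drop in predicted probability after ablating the `removed` features."""
    ablated = [baseline[i] if i in removed else xi for i, xi in enumerate(x)]
    return predict(x) - predict(ablated)


x = [1.0, 1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0, 0.0]  # assumed "feature absent" value

drop_important = necessity_drop(predict_proba, x, {0, 3}, baseline)
drop_unimportant = necessity_drop(predict_proba, x, {1, 2}, baseline)
print(f"ablating features 0 and 3: {drop_important:.3f}")
print(f"ablating features 1 and 2: {drop_unimportant:.3f}")
```

A large drop suggests the ablated features were necessary for the prediction, while a drop near zero suggests they were not. What counts as "significant" is a threshold the evaluator has to choose.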

Causal Inference Score (CIS)

A metric quantifying an explanation's ability to identify actual causal relationships rather than mere correlations, by testing the effects of interventions on variables.

Explanation Robustness

Measures the variation in explanations when the model or input data undergo adversarial attacks or noise, assessing the interpretation's resistance to manipulation.
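The variation described here can be approximated, for the random-noise case only, by comparing the explanation at an input with explanations at noisy copies of it. In the sketch below, the gradient-times-input `attribution` for a toy tanh model and the `instability` helper are assumptions made for illustration; adversarial perturbations would need a dedicated search and are not covered.

```python
import math
import random


def attribution(x):
    # Hypothetical explainer: gradient-times-input for f(x) = tanh(w . x).
    w = [1.5, -0.5, 0.8]
    z = sum(wi * xi for wi, xi in zip(w, x))
    grad_scale = 1.0 - math.tanh(z) ** 2  # derivative of tanh at z
    return [grad_scale * wi * xi for wi, xi in zip(w, x)]


def instability(explain, x, noise=0.05, n_samples=200, seed=0):
    """Mean Euclidean distance between the explanation at x and at noisy copies.

    Lower values indicate a more robust explanation.
    """
    rng = random.Random(seed)
    base = explain(x)
    total = 0.0
    for _ in range(n_samples):
        z = [xi + rng.gauss(0.0, noise) for xi in x]
        total += math.dist(base, explain(z))
    return total / n_samples


value = instability(attribution, [0.4, 0.2, -0.1])
print(f"mean attribution shift under noise: {value:.4f}")
```

Because the value scales with the magnitude of the attributions, it is often more informative to compare it across explainers on the same model than to read it in isolation.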

Feature Coherence

Evaluates whether the features deemed important by an explanation are semantically or logically coherent with each other, enhancing the plausibility of the overall explanation.

Selectivity Rate

An indicator measuring the proportion of features or rules used by an explanation relative to the total available, favoring parsimonious explanations.
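As a simple sketch of this proportion, assuming the explanation is a list of per-feature attribution scores and that a feature counts as "used" when its absolute score exceeds a small threshold (both assumptions of this example):

```python
def selectivity_rate(attributions, threshold=1e-6):
    """Share of features whose attribution magnitude exceeds the threshold."""
    used = sum(1 for a in attributions if abs(a) > threshold)
    return used / len(attributions)


# Hypothetical attribution vector: 3 of 8 features carry non-zero weight.
rate = selectivity_rate([0.42, 0.0, -0.31, 0.0, 0.0, 0.07, 0.0, 0.0])
print(rate)  # 0.375
```

Lower values correspond to more parsimonious explanations.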

Relevance Function

A mathematical function that quantifies the contribution of a feature or set of features to the model's final prediction, serving as the basis for many interpretability metrics.
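One simple instance of such a function, given here only as a sketch, is occlusion relevance: the relevance of feature i is the change in model output when that feature alone is replaced by a baseline value. The linear `model`, the zero baseline, and the name `occlusion_relevance` are hypothetical choices for this example; other relevance functions (gradient-based or Shapley-based, for instance) are defined differently.

```python
def model(x):
    # Hypothetical linear model returning a real-valued score.
    weights = [2.0, -1.0, 0.5]
    return sum(w * xi for w, xi in zip(weights, x))


def occlusion_relevance(model, x, baseline):
    """Relevance of feature i = f(x) - f(x with feature i set to its baseline)."""
    full = model(x)
    scores = []
    for i in range(len(x)):
        occluded = list(x)
        occluded[i] = baseline[i]
        scores.append(full - model(occluded))
    return scores


scores = occlusion_relevance(model, [1.0, 2.0, 4.0], baseline=[0.0, 0.0, 0.0])
print(scores)  # [2.0, -2.0, 2.0]
```

For a linear model with a zero baseline, each score reduces to weight times feature value, which makes the output easy to verify by hand.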

Inter-Annotator Agreement

A statistical measure (e.g., Cohen's kappa for two raters) assessing the level of consensus among different human experts on the quality or correctness of an explanation, indicating how reliable these subjective judgments are.
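For two raters, Cohen's kappa compares the observed agreement p_o with the agreement p_e expected by chance: kappa = (p_o - p_e) / (1 - p_e). The sketch below computes it from scratch; the rating labels are invented for illustration, and returning 1.0 when chance agreement is already total is a convention assumed here, since kappa is undefined in that case.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    n = len(rater_a)
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = set(freq_a) | set(freq_b)
    expected = sum(freq_a[l] * freq_b[l] for l in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # assumed convention: both raters used one identical label
    return (observed - expected) / (1.0 - expected)


# Hypothetical judgments of eight explanations by two experts.
a = ["good", "good", "bad", "good", "bad", "bad", "good", "bad"]
b = ["good", "bad", "bad", "good", "bad", "good", "good", "bad"]
kappa = cohens_kappa(a, b)
print(kappa)  # 0.5
```

Here the raters agree on 6 of 8 items (p_o = 0.75) while chance agreement is 0.5, giving kappa = 0.5. With more than two raters, a generalization such as Fleiss' kappa is typically used instead.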

Confirmation Bias

A metric evaluating whether an explanation merely reinforces the user's pre-existing beliefs rather than revealing how the model actually behaves, measuring the risk of fallacious interpretations.

Discriminative Power

The ability of an explanation to clearly distinguish features that positively influence the prediction from those that negatively influence it, improving interpretation clarity.

Global Fidelity

Evaluates an explanation's ability to faithfully represent the model's overall behavior across the entire data space, often at the expense of local accuracy.

Counterfactual Score

A metric assessing the quality of a counterfactual explanation based on the minimal perturbation required to change the model's prediction and the plausibility of the generated scenario.
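There is no single standard formula for this score. The sketch below shows one possible scoring rule under stated assumptions: a counterfactual earns a non-zero score only if it flips the prediction and stays inside assumed plausible feature ranges, and closer counterfactuals (smaller L1 distance) score higher. The toy `model`, the range-based plausibility check, and the 1 / (1 + distance) form are all hypothetical choices for this example.

```python
def model(x):
    # Hypothetical binary classifier.
    return 1 if 0.8 * x[0] + 0.5 * x[1] > 1.0 else 0


def counterfactual_score(model, x, x_cf, feature_ranges):
    """Score in [0, 1]: zero if invalid or implausible, else 1 / (1 + L1 distance)."""
    flipped = model(x_cf) != model(x)
    # Crude plausibility proxy: every feature stays within an allowed range.
    plausible = all(lo <= v <= hi for v, (lo, hi) in zip(x_cf, feature_ranges))
    if not (flipped and plausible):
        return 0.0
    distance = sum(abs(a - b) for a, b in zip(x, x_cf))
    return 1.0 / (1.0 + distance)


x = [0.5, 0.5]
ranges = [(0.0, 3.0), (0.0, 3.0)]  # assumed plausible feature ranges

near = counterfactual_score(model, x, [1.0, 0.5], ranges)
far = counterfactual_score(model, x, [2.0, 2.0], ranges)
unchanged = counterfactual_score(model, x, [0.6, 0.5], ranges)
print(round(near, 3), far, unchanged)  # 0.667 0.25 0.0
```

Both the near and far candidates flip the prediction, but the near one changes the input less and therefore scores higher. The third candidate leaves the prediction unchanged and scores zero.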

Semantic Depth

Measures the level of abstraction of an explanation, quantifying whether it is based on low-level features (e.g., pixels) or on higher-level concepts (e.g., objects, ideas) that are more intelligible.
