Feature Importance

Metric quantifying the influence of each predictive variable on the performance of a Random Forest model, calculated either from the average impurity reduction the variable produces or from the drop in performance when its values are randomly permuted.

Gini Importance

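Method for evaluating variable importance as the total decrease in Gini impurity accumulated across all nodes where the variable is used to split, with each decrease typically weighted by the share of samples reaching the node and averaged over the trees of the forest.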

Mean Decrease Impurity

Technique measuring the importance of a variable by the impurity reduction (Gini or entropy) it yields at the nodes where it is the splitting variable, averaged over all trees in the forest.
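
One common way to write this down is sketched below. The notation is an assumption introduced here for illustration: a node t receives N_t of the N training samples, i(t) is its impurity, t_L and t_R are its children, v(t) is the variable it splits on, and the forest has T trees with node sets \tau_m.

```latex
\Delta i(t) = \frac{N_t}{N}\left( i(t) - \frac{N_{t_L}}{N_t}\, i(t_L) - \frac{N_{t_R}}{N_t}\, i(t_R) \right),
\qquad
\mathrm{MDI}(X_j) = \frac{1}{T} \sum_{m=1}^{T} \; \sum_{t \in \tau_m \,:\, v(t) = j} \Delta i(t)
```

Individual libraries may additionally rescale the scores, for example so that they sum to 1 across variables.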

Permutation Importance

Model-agnostic method evaluating the importance of a variable by measuring the degradation in model performance when the values of this variable are randomly permuted.
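
The procedure can be sketched in plain Python. This is a minimal illustration, not a library implementation: the helper names, the accuracy metric, and the toy model are all assumptions made for the example, and any callable that maps a row to a prediction would work in place of `toy_model`.

```python
import random

def accuracy(model, X, y):
    """Fraction of rows where the model's prediction equals the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy after shuffling each column, one column at a time."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only column j permuted.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

# Hypothetical toy model: predicts 1 when feature 0 is positive, ignores feature 1.
toy_model = lambda row: int(row[0] > 0)
X = [[-2, 5], [-1, 3], [1, 4], [2, 1], [-3, 2], [3, 6]]
y = [0, 0, 1, 1, 0, 1]
scores = permutation_importance(toy_model, X, y)
```

Because the toy model never reads feature 1, shuffling that column cannot change any prediction, so its score is exactly zero, while feature 0 receives a positive score.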

Mean Decrease Accuracy

Indicator of a variable's importance based on the average decrease in model accuracy when this variable is permuted in the out-of-bag data.

Impurity Measure

Mathematical function quantifying the degree of class heterogeneity in a node, used to optimize splits in decision trees.

Information Gain

Splitting criterion measuring the reduction in entropy obtained by partitioning a node according to a specific feature, favoring splits that maximize the resulting homogeneity.
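
A small sketch of the computation, assuming class labels are given as plain Python lists and entropy is measured in bits (base-2 logarithm). The function names are chosen for this example only.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of the class distribution in `labels`."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted

# A split that separates the two classes perfectly recovers the full 1 bit.
perfect = information_gain([0, 0, 1, 1], [[0, 0], [1, 1]])
# A split that leaves both children as mixed as the parent gains nothing.
useless = information_gain([0, 0, 1, 1], [[0, 1], [0, 1]])
```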

Gini Index

Impurity measure giving the probability that a randomly chosen observation in a node would be misclassified if it were labeled at random according to the node's class distribution; it quantifies class heterogeneity in a decision tree node.
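
As a worked illustration, assume a node with K classes whose proportions are p_1, ..., p_K (symbols introduced here for the example). The index is then usually written as:

```latex
G = \sum_{k=1}^{K} p_k \,(1 - p_k) = 1 - \sum_{k=1}^{K} p_k^{2}
```

For a two-class node split evenly, p_1 = p_2 = 0.5, so G = 1 - (0.25 + 0.25) = 0.5, while a pure node gives G = 0.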

Out-of-Bag Error

Nearly unbiased estimate of generalization error, obtained by predicting each observation using only the trees whose bootstrap sample did not include it; it serves as a built-in alternative to cross-validation in Random Forest.
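
The sketch below illustrates only where the out-of-bag observations come from, not a full error computation. It assumes the usual bootstrap scheme of drawing as many indices as there are samples, with replacement; the function name is hypothetical.

```python
import random

def bootstrap_split(n_samples, rng):
    """Draw a bootstrap sample of indices and return it with the left-out indices."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    out_of_bag = sorted(set(range(n_samples)) - set(in_bag))
    return in_bag, out_of_bag

rng = random.Random(0)
in_bag, out_of_bag = bootstrap_split(1000, rng)
# In expectation roughly 1/e (about 36.8%) of the observations are left out.
oob_fraction = len(out_of_bag) / 1000
```

Each tree would be trained on its `in_bag` rows, and each observation would later be scored only by the trees for which it appears in `out_of_bag`.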

Feature Selection

Process of identifying and keeping the most relevant variables based on their importance scores, eliminating redundant or non-informative features.

Variable Importance Plot

Visualization ranking predictive variables in decreasing order of importance score, which makes the model's most influential factors easier to identify.

Partial Dependence Plot

Graphical representation showing the marginal effect of one or two variables on the model's prediction, obtained by averaging predictions over the observed values of all other variables.
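
The averaging step can be sketched for a single variable as follows. This is an illustrative brute-force version under simple assumptions: rows are Python lists, the model is any callable returning a number, and `linear_model` is a hypothetical stand-in for a trained model.

```python
def partial_dependence(model, X, feature, grid):
    """Average prediction when `feature` is forced to each value in `grid`."""
    curve = []
    for value in grid:
        preds = []
        for row in X:
            modified = list(row)       # copy so the original data is untouched
            modified[feature] = value  # overwrite only the feature of interest
            preds.append(model(modified))
        curve.append(sum(preds) / len(preds))
    return curve

# Hypothetical model used only for illustration: f(x) = 2 * x0 + x1.
linear_model = lambda row: 2 * row[0] + row[1]
X = [[0, 1], [5, 3], [9, 5]]
curve = partial_dependence(linear_model, X, feature=0, grid=[0, 1, 2])
# curve is [3.0, 5.0, 7.0]: slope 2 in x0, offset by the mean of x1 (3).
```

Plotting `grid` against `curve` gives the partial dependence plot for that variable.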

Node Impurity

Degree of heterogeneity of observations in a tree node, serving as the basis for calculating feature importance through their contribution to reducing this impurity.

Split Criterion

Rule determining the optimal division of a node based on a feature and a threshold, directly impacting the distribution of importance among variables.
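
A minimal sketch of such a rule for one numeric feature, assuming Gini impurity as the criterion and midpoints between consecutive distinct values as candidate thresholds. Both choices are assumptions for the example; real implementations differ in detail.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of the class distribution in `labels`."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best split `value <= threshold`."""
    n = len(values)
    best = (None, float("inf"))
    candidates = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(candidates, candidates[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = len(left) / n * gini(left) + len(right) / n * gini(right)
        if score < best[1]:
            best = (threshold, score)
    return best

threshold, score = best_threshold([1, 2, 3, 4], [0, 0, 1, 1])
# threshold is 2.5 and score is 0.0: both children are pure.
```

A tree builder would repeat this search over every candidate feature and keep the feature and threshold with the lowest weighted impurity, which is how the criterion shapes the importance each variable ends up with.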

AI Glossary