AI Glossary
The Complete Dictionary of Artificial Intelligence
Decision Tree
Supervised predictive model that uses a tree-like structure to model decisions and their possible consequences through a series of tests on data features.
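A minimal sketch of fitting and evaluating such a model, assuming scikit-learn and its bundled Iris dataset (names and parameter values here are illustrative only):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Fit a small tree on the Iris data and score it on held-out samples
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = DecisionTreeClassifier(max_depth=3, random_state=0)  # depth capped for readability
model.fit(X_train, y_train)
print(model.score(X_test, y_test))  # accuracy on unseen data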
Root Node
Starting point of a decision tree that represents the complete set of training data and contains the first split based on the most discriminative feature.
Internal Node
Intermediate node in a decision tree that represents a test on a specific feature and splits the data into more homogeneous subsets.
Leaf
Terminal node of a decision tree that represents a final decision or class prediction, with no further possible splitting.
Splitting Criterion
Quantitative method used to evaluate the quality of a split in a decision tree, aiming to maximize the homogeneity of the resulting subsets.
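In scikit-learn, for example, the splitting criterion is chosen through a constructor argument; a sketch with the two standard classification criteria:

from sklearn.tree import DecisionTreeClassifier

# Same model family, two different measures of split quality
tree_gini = DecisionTreeClassifier(criterion="gini")
tree_entropy = DecisionTreeClassifier(criterion="entropy")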
Entropy
Mathematical measure of disorder or uncertainty in a dataset, used to quantify the impurity of a node in decision trees.
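For a node whose classes occur with proportions p_1, ..., p_k, the entropy is H = -(p_1 log2 p_1 + ... + p_k log2 p_k); a small sketch with NumPy:

import numpy as np

def entropy(labels):
    # Shannon entropy H = -sum(p_i * log2(p_i)) over the class proportions
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

print(entropy(np.array([0, 0, 1, 1])))  # 1.0 bit: maximally impure two-class node
print(entropy(np.array([0, 0, 0, 0])))  # a pure node has zero entropy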
Information Gain
Metric that measures the entropy reduction obtained by splitting a node according to a specific feature, used to select the best split.
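A sketch of the computation, repeating the entropy helper from the previous entry so it runs standalone; a perfect split of a balanced binary node gains the full 1 bit:

import numpy as np

def entropy(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(parent_labels, child_label_groups):
    # IG = H(parent) - weighted average of H(child) over the split's subsets
    n = len(parent_labels)
    weighted = sum(len(c) / n * entropy(c) for c in child_label_groups)
    return entropy(parent_labels) - weighted

parent = np.array([0, 0, 1, 1])
children = [np.array([0, 0]), np.array([1, 1])]  # a perfect two-way split
print(information_gain(parent, children))        # 1.0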
Gini Index
Impurity measure ranging from 0 (pure node) up to 1 - 1/k for k classes, giving the probability that a randomly chosen element would be misclassified if labeled according to the node's class distribution; an alternative to entropy in decision trees.
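A corresponding sketch for the Gini impurity, G = 1 - (p_1^2 + ... + p_k^2):

import numpy as np

def gini(labels):
    # Gini impurity G = 1 - sum(p_i ** 2) over the class proportions
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

print(gini(np.array([0, 0, 1, 1])))  # 0.5: maximum impurity for two classes
print(gini(np.array([0, 0, 0, 0])))  # 0.0: pure node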
Pruning
Technique for reducing the complexity of a decision tree by removing branches that provide little predictive power to prevent overfitting.
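One widely used form is cost-complexity pruning; in scikit-learn the candidate penalty values can be read off the training data, and a stronger penalty yields a smaller tree (the alpha value below is arbitrary, chosen only for illustration):

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Alphas at which successive subtrees would be pruned away
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X, y)
print(path.ccp_alphas)

# Refit with a non-zero penalty: larger ccp_alpha -> fewer leaves
pruned = DecisionTreeClassifier(ccp_alpha=0.02, random_state=0).fit(X, y)
print(pruned.get_n_leaves())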
Overfitting
Phenomenon in which a model fits the details and noise of the training data so closely that its ability to generalize to new data is degraded.
Tree Depth
Maximum number of splits along the path from the root node to a leaf; a key hyperparameter controlling model complexity and the bias-variance trade-off: shallow trees risk underfitting, very deep trees tend to overfit.
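The limit is typically exposed as a hyperparameter (max_depth in scikit-learn, for instance); the sketch below contrasts an unconstrained tree with a shallow one on a held-out split to show how depth trades training fit against generalization:

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for depth in (None, 3):  # None lets the tree grow until its leaves are pure
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X_train, y_train)
    print(depth, tree.get_depth(),
          tree.score(X_train, y_train),  # training accuracy
          tree.score(X_test, y_test))    # held-out accuracy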
CART
Classification and Regression Trees algorithm that builds binary trees, using the Gini index as the splitting criterion for classification and variance reduction for regression.
ID3
Pioneering decision tree algorithm that uses information gain as the splitting criterion; it is limited to categorical features and produces multiway splits, with one branch per attribute value.
C4.5
Improvement on the ID3 algorithm that uses the information gain ratio to avoid bias towards features with many values, and adds support for continuous attributes and missing values.
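The gain ratio divides the information gain by the "split information", the entropy of the subset sizes themselves, which penalizes splits that scatter the data over many small branches; a standalone sketch:

import numpy as np

def entropy(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def gain_ratio(parent_labels, child_label_groups):
    # Gain ratio = information gain / split information
    n = len(parent_labels)
    weights = np.array([len(c) / n for c in child_label_groups])
    gain = entropy(parent_labels) - sum(
        w * entropy(c) for w, c in zip(weights, child_label_groups)
    )
    split_info = -np.sum(weights * np.log2(weights))  # entropy of the subset sizes
    return gain / split_info

parent = np.array([0, 0, 1, 1])
children = [np.array([0, 0]), np.array([1, 1])]
print(gain_ratio(parent, children))  # 1.0 for this balanced, perfect split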
Target Variable
Variable to be predicted in a supervised learning problem; in a decision tree, its predicted values are stored in the terminal leaves.
Decision Rule
Logical set of IF-THEN conditions extracted from a path in the decision tree, allowing for the interpretation and explanation of the model's predictions.
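With scikit-learn, for example, the rule view of a fitted tree can be printed directly; a sketch:

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(iris.data, iris.target)

# Each root-to-leaf path reads as a chain of IF-THEN conditions
print(export_text(tree, feature_names=list(iris.feature_names)))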
Variable Importance
Quantitative measure of each predictive feature's contribution to improving the purity of splits throughout the tree.
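In scikit-learn these impurity-based importances are exposed as an attribute of the fitted tree and sum to 1; a sketch:

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
tree = DecisionTreeClassifier(random_state=0).fit(iris.data, iris.target)

# Fraction of the total impurity reduction attributed to each feature
for name, importance in zip(iris.feature_names, tree.feature_importances_):
    print(f"{name}: {importance:.3f}")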
Complexity Cost
Pruning parameter that penalizes tree size, balancing data fit and model simplicity to optimize generalization.
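In cost-complexity pruning this trade-off is usually written R_alpha(T) = R(T) + alpha * |leaves(T)|: the tree's error plus a penalty proportional to its number of leaves. A toy sketch (the error and leaf counts are made up for illustration) of how a larger alpha can favor the smaller tree:

def cost_complexity(error, n_leaves, alpha):
    # Penalized objective R_alpha(T) = R(T) + alpha * |leaves(T)|
    return error + alpha * n_leaves

# The big tree fits better, but the leaf penalty makes the small tree preferable
print(round(cost_complexity(error=0.10, n_leaves=20, alpha=0.01), 2))  # 0.3
print(round(cost_complexity(error=0.15, n_leaves=5, alpha=0.01), 2))   # 0.2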