AI Glossary
The Complete Dictionary of Artificial Intelligence
True Positive (TP)
A correct result where the model predicts the positive class for an observation that is actually positive, indicating a successful classification of the class of interest. The number of true positives is crucial for evaluating the model's ability to correctly identify relevant cases.
False Positive (FP)
A classification error where the model incorrectly predicts an observation as positive when it is actually negative, corresponding to a false alarm. False positives are particularly costly in fields like medical diagnosis or fraud detection.
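True positives and false positives are two cells of the binary confusion matrix. A minimal pure-Python sketch of how all four counts can be tallied from label lists (the function name and example data are illustrative):

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = positive, 0 = negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

# illustrative labels: 4 actual positives, 4 actual negatives
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]
confusion_counts(y_true, y_pred)  # (3, 1, 3, 1)
```

Most downstream metrics in this glossary (precision, recall, specificity) are simple ratios of these four counts.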
Precision
A metric calculated as the ratio of true positives to the sum of true and false positives, measuring the proportion of correct positive predictions among all positive predictions. It is particularly important when the cost of false positives is high.
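The formula above translates directly to code. A small sketch, assuming TP and FP counts are already available and guarding against division by zero when there are no positive predictions:

```python
def precision(tp, fp):
    """Precision = TP / (TP + FP); 0.0 by convention when nothing was predicted positive."""
    return tp / (tp + fp) if (tp + fp) > 0 else 0.0

precision(3, 1)  # 0.75: three of four positive predictions were correct
```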
Recall
Also called sensitivity, it measures the ratio of true positives to the sum of true positives and false negatives, evaluating the model's ability to identify all actual positive observations. Recall is crucial when false negatives have serious consequences.
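A sketch of the same ratio computed directly from label lists rather than pre-tallied counts (the example data is illustrative):

```python
def recall(y_true, y_pred):
    """Recall = TP / (TP + FN), i.e. the share of actual positives that were found."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) > 0 else 0.0

recall([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])  # 2 of 3 actual positives found
```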
ROC Curve
A graph representing the true positive rate as a function of the false positive rate for different classification thresholds, illustrating the trade-off between sensitivity and specificity. The area under this curve (AUC) quantifies the overall performance of the classifier.
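The curve can be traced numerically by sweeping the decision threshold over the model's scores and integrating the result with the trapezoidal rule. A minimal pure-Python sketch (the helper names and example scores are illustrative, and the data must contain at least one positive and one negative):

```python
def roc_points(y_true, scores):
    """(FPR, TPR) pairs obtained by sweeping the threshold over every distinct score."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if t == 1 and s >= thr)
        fp = sum(1 for t, s in zip(y_true, scores) if t == 0 and s >= thr)
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve via the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

auc(roc_points([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

An AUC of 1.0 indicates a perfect ranking of positives above negatives, while 0.5 corresponds to random guessing.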
Logistic Regression
A generalized linear model that applies the sigmoid function to a linear combination of the input features, mapping it to a probability between 0 and 1 for binary classification. This interpretable model is often used as a baseline for dichotomous classification problems.
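The prediction step of a fitted logistic regression reduces to a dot product followed by the sigmoid. A sketch with hand-picked (not trained) weights for illustration:

```python
import math

def sigmoid(z):
    """Squash any real number into (0, 1)."""
    return 1 / (1 + math.exp(-z))

def predict_proba(weights, bias, x):
    """P(y = 1 | x) = sigmoid(w . x + b)."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(z)

sigmoid(0)                                   # 0.5: the natural decision boundary
predict_proba([1.0, -2.0], 0.5, [2.0, 1.0])  # sigmoid(0.5), a bit above 0.5
```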
Decision Threshold
A cutoff value (typically 0.5) used to convert output probabilities into binary predictions, above which an observation is classified as positive. Adjusting this threshold allows for optimizing the trade-off between precision and recall.
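Thresholding is a one-line operation, and raising the cutoff shows the precision/recall trade-off in action (the example probabilities are illustrative):

```python
def classify(probs, threshold=0.5):
    """Convert output probabilities to 0/1 labels using a decision threshold."""
    return [1 if p >= threshold else 0 for p in probs]

probs = [0.2, 0.55, 0.7, 0.45]
classify(probs)       # [0, 1, 1, 0] with the default 0.5 cutoff
classify(probs, 0.6)  # [0, 0, 1, 0]: a stricter cutoff yields fewer positives
```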
Class Imbalance
A situation where one class is significantly more represented than the other in the training dataset, potentially biasing the model toward the majority class. This issue requires specific techniques such as oversampling or class weighting.
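One common remedy, class weighting, assigns each class a weight inversely proportional to its frequency so that errors on the minority class cost more during training. A sketch of the usual "balanced" heuristic, n_samples / (n_classes * class_count):

```python
from collections import Counter

def balanced_weights(labels):
    """Inverse-frequency class weights: rare classes get proportionally larger weights."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * cnt) for cls, cnt in counts.items()}

balanced_weights([0] * 90 + [1] * 10)  # minority class 1 weighted 5.0, majority ~0.56
```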
SMOTE
Synthetic Minority Over-sampling Technique: an oversampling method that generates new examples of the minority class by interpolating between existing instances, balancing the class distribution without exact duplication. SMOTE is particularly effective for improving performance on imbalanced datasets.
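The core interpolation step can be sketched in a few lines. This is a simplified illustration, not a full SMOTE implementation (a real one would generate many points and vectorize the neighbour search):

```python
import random

def smote_sample(minority, k=2, rng=None):
    """Create one synthetic point between a random minority sample
    and one of its k nearest minority neighbours (simplified SMOTE sketch)."""
    rng = rng or random.Random(0)
    i = rng.randrange(len(minority))
    base = minority[i]
    # sort the other minority points by squared Euclidean distance to the base
    others = sorted((p for j, p in enumerate(minority) if j != i),
                    key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
    neighbour = rng.choice(others[:k])
    gap = rng.random()  # interpolation factor in [0, 1)
    return tuple(a + gap * (b - a) for a, b in zip(base, neighbour))

minority = [(1.0, 1.0), (2.0, 2.0), (1.5, 1.0), (1.2, 1.8)]
smote_sample(minority)  # a new point on a segment between two minority samples
```

Because the synthetic point lies on the segment between two real minority samples, it stays inside the region the minority class already occupies.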
Binary Decision Tree
A classification algorithm that uses a hierarchical structure of binary decisions to partition the feature space into increasingly pure regions, with each leaf representing a predicted class. Decision trees offer high interpretability but are prone to overfitting.
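Each internal node is chosen to maximize the purity of its children, typically measured by Gini impurity. A sketch of a single-feature split search under that criterion (helper names and example data are illustrative):

```python
def gini(labels):
    """Gini impurity of a binary label list: 0 is pure, 0.5 is a 50/50 mix."""
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = labels.count(1) / n
    return 1 - p1 ** 2 - (1 - p1) ** 2

def best_split(xs, ys):
    """Exhaustively pick the threshold on one feature that minimises
    the weighted Gini impurity of the two resulting child nodes."""
    best = (None, float("inf"))
    for thr in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= thr]
        right = [y for x, y in zip(xs, ys) if x > thr]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best[1]:
            best = (thr, score)
    return best

best_split([1, 2, 3, 4], [0, 0, 1, 1])  # threshold 2 separates the classes perfectly
```

A full tree repeats this search recursively on each child node until the leaves are pure or a stopping criterion (e.g. maximum depth) is met.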
Specificity
A metric calculated as the ratio of true negatives to the sum of true negatives and false positives, evaluating the model's ability to correctly identify negative observations. Specificity is complementary to recall and crucial in screening tests.
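Like precision and recall, specificity is a direct ratio of confusion-matrix counts; a small sketch with a division-by-zero guard:

```python
def specificity(tn, fp):
    """Specificity = TN / (TN + FP); 0.0 by convention when there are no actual negatives."""
    return tn / (tn + fp) if (tn + fp) > 0 else 0.0

specificity(3, 1)  # 0.75: three of four actual negatives correctly rejected
```

Note that the false positive rate plotted on the ROC curve is simply 1 minus the specificity.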