AI Glossary
The complete dictionary of Artificial Intelligence
K-Fold Cross-Validation
Model evaluation technique that divides the dataset into K equal partitions, where each partition serves in turn as the test set while the remaining K-1 form the training set. This method provides a more robust estimate of model performance by reducing the variance of the evaluation.
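A minimal sketch of K-Fold evaluation with scikit-learn, using its built-in iris dataset and a logistic regression purely as illustrative stand-ins:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

X, y = load_iris(return_X_y=True)
cv = KFold(n_splits=5, shuffle=True, random_state=0)               # K = 5 partitions
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=cv)
print(scores.mean(), scores.std())                                 # average score and its spread across folds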
Stratified K-Fold Cross-Validation
Variant of K-Fold that maintains the class distribution in each partition, essential for imbalanced datasets. This approach ensures that each fold faithfully represents the overall class distribution of the original dataset.
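A minimal sketch with scikit-learn's StratifiedKFold, assuming an imbalanced binary target whose 80/20 class ratio should be preserved in every fold; the toy arrays are illustrative:

import numpy as np
from sklearn.model_selection import StratifiedKFold

X = np.arange(100).reshape(50, 2)                # toy features
y = np.array([0] * 40 + [1] * 10)                # imbalanced labels (80% / 20%)
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, test_idx in skf.split(X, y):
    print(np.bincount(y[test_idx]))              # each test fold keeps roughly the 80/20 ratio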
Holdout Method
Simple evaluation method dividing the dataset into two distinct sets: training and test, typically with ratios of 70/30 or 80/20. Although quick to implement, this method can produce unstable performance estimates that depend heavily on how the data happens to be partitioned.
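A minimal holdout split with scikit-learn's train_test_split; the 80/20 ratio and the stratification on y are illustrative choices:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)   # 80% training, 20% test
print(len(X_train), len(X_test))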
Repeated Cross-Validation
Technique repeating the K-Fold process multiple times with different random partitions to reduce performance estimation variance. This approach combines the advantages of K-Fold with greater statistical robustness at the cost of increased computational expense.
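A minimal sketch using scikit-learn's RepeatedKFold; 5 folds repeated 10 times is an illustrative budget, not a prescription:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedKFold, cross_val_score

X, y = load_iris(return_X_y=True)
rcv = RepeatedKFold(n_splits=5, n_repeats=10, random_state=0)       # 50 fits in total
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=rcv)
print(scores.mean(), scores.std())                                  # the estimate's variance shrinks with repetition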
Bootstrap Validation
Evaluation method using sampling with replacement to create multiple training and test sets from the original data. Bootstrap allows estimating the variance of model performance and is particularly useful with small datasets.
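A minimal sketch of one bootstrap round in NumPy, assuming the out-of-bag samples (those not drawn) serve as the test set; repeat the round many times to estimate performance variance:

import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)
boot = rng.integers(0, len(X), size=len(X))      # sample indices with replacement
oob = np.setdiff1d(np.arange(len(X)), boot)      # out-of-bag indices = test set
model = LogisticRegression(max_iter=1000).fit(X[boot], y[boot])
print(model.score(X[oob], y[oob]))               # one bootstrap estimate of performance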
Grid Search with Cross-Validation
Systematic optimization technique exhaustively testing all specified hyperparameter combinations using cross-validation to evaluate each configuration. This method ensures finding the best combination within the defined grid but can be very computationally expensive.
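A minimal GridSearchCV sketch with scikit-learn; the SVM estimator and the small parameter grid are illustrative assumptions:

from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
param_grid = {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}   # 9 combinations, each cross-validated
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)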
Randomized Search with Cross-Validation
Alternative to Grid Search that randomly samples a fixed number of hyperparameter combinations rather than exhaustively exploring all possibilities. This approach often finds good hyperparameters with far fewer evaluations than Grid Search.
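A minimal RandomizedSearchCV sketch; the log-uniform sampling ranges and the budget of 20 evaluations are assumptions for illustration:

from scipy.stats import loguniform
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
param_dist = {"C": loguniform(1e-2, 1e2), "gamma": loguniform(1e-3, 1e1)}
search = RandomizedSearchCV(SVC(), param_dist, n_iter=20, cv=5, random_state=0)
search.fit(X, y)
print(search.best_params_, search.best_score_)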
Learning Curve
Graph showing the evolution of model performance as a function of training set size, used to diagnose overfitting or underfitting. Learning curves help determine whether additional data could improve model performance.
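A minimal sketch computing the data behind a learning curve with scikit-learn (plotting is left out); the estimator and the five training-set sizes are illustrative:

import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

X, y = load_iris(return_X_y=True)
sizes, train_scores, val_scores = learning_curve(
    LogisticRegression(max_iter=1000), X, y, cv=5,
    train_sizes=np.linspace(0.1, 1.0, 5))
print(sizes)                       # training set sizes actually used
print(val_scores.mean(axis=1))     # validation score at each size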
Validation Curve
Diagnostic tool visualizing the impact of a single hyperparameter on training and validation performance. Validation curves help identify optimal hyperparameter values and detect bias-variance issues.
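A minimal validation_curve sketch from scikit-learn; sweeping the gamma hyperparameter of an SVM over a log-spaced range is an illustrative choice:

import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import validation_curve
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
gammas = np.logspace(-3, 1, 5)
train_scores, val_scores = validation_curve(
    SVC(), X, y, param_name="gamma", param_range=gammas, cv=5)
print(val_scores.mean(axis=1))     # validation score for each gamma value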
Cross-Entropy
Loss function measuring the divergence between two probability distributions, widely used in classification problems. Cross-entropy penalizes incorrect predictions more heavily when they are confident, making it well suited as a training loss.
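A minimal NumPy sketch of categorical cross-entropy, assuming one-hot targets and predicted class probabilities:

import numpy as np

def cross_entropy(y_true, y_prob, eps=1e-12):
    # y_true: one-hot targets, y_prob: predicted probabilities, both (n_samples, n_classes)
    y_prob = np.clip(y_prob, eps, 1.0)             # avoid log(0)
    return -np.mean(np.sum(y_true * np.log(y_prob), axis=1))

y_true = np.array([[1, 0], [0, 1]])
y_prob = np.array([[0.9, 0.1], [0.2, 0.8]])        # confident wrong predictions cost far more
print(cross_entropy(y_true, y_prob))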
Mean Squared Error
Evaluation metric calculating the average of squared differences between predicted and actual values, particularly sensitive to large errors. MSE is commonly used for regression problems and penalizes significant errors more than MAE.
Mean Absolute Error
Regression metric measuring the average of absolute values of errors between predictions and actual values, offering direct interpretation in target variable units. Unlike MSE, MAE is less sensitive to outliers and weights all errors linearly.
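A minimal NumPy sketch of both regression error metrics defined above, MSE and MAE, on a toy example:

import numpy as np

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5,  0.0, 2.0, 8.0])
mse = np.mean((y_true - y_pred) ** 2)      # squared errors: large mistakes dominate
mae = np.mean(np.abs(y_true - y_pred))     # absolute errors: in the target's own units
print(mse, mae)                            # 0.375 and 0.5 for this toy example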
R² Score
Coefficient of determination measuring the proportion of target variable variance explained by the model, ranging from -∞ to 1. An R² of 1 indicates perfect prediction, while negative values mean the model performs worse than always predicting the mean of the target.
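A minimal NumPy sketch of the coefficient of determination as defined above:

import numpy as np

def r2_score(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)              # residual sum of squares
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)     # total variance around the mean
    return 1.0 - ss_res / ss_tot

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5,  0.0, 2.0, 8.0])
print(r2_score(y_true, y_pred))    # close to 1: most of the variance is explained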
F1-Score
Classification metric calculating the harmonic mean of precision and recall, particularly useful for imbalanced datasets. The F1-Score balances the model's ability to avoid false positives and false negatives in a single measure.
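A minimal sketch computing F1 from confusion-matrix counts of true positives, false positives and false negatives; the counts are made up for illustration:

def f1_score(tp, fp, fn):
    precision = tp / (tp + fp)                    # fraction of predicted positives that are correct
    recall = tp / (tp + fn)                       # fraction of actual positives that are found
    return 2 * precision * recall / (precision + recall)

print(f1_score(tp=80, fp=20, fn=40))              # precision 0.8, recall ~0.67, F1 ~0.73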
Precision-Recall Curve
Graph illustrating the trade-off between precision and recall for different classification thresholds, essential for evaluating models on imbalanced data. The area under this curve (AUC-PR) provides an aggregated performance measure independent of threshold.
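A minimal sketch computing the precision-recall curve and its area with scikit-learn, assuming binary labels and continuous scores; the toy arrays are illustrative:

import numpy as np
from sklearn.metrics import auc, precision_recall_curve

y_true = np.array([0, 0, 1, 1, 0, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7])    # predicted probabilities for class 1
precision, recall, thresholds = precision_recall_curve(y_true, y_score)
print(auc(recall, precision))                           # AUC-PR, a threshold-independent summary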
ROC Curve
Curve representing the true positive rate against the false positive rate at various decision thresholds, visualizing the model's discrimination capability. The ROC curve and its area (AUC-ROC) are standards for evaluating overall binary classifier performance.
AUC Score
Area under the ROC curve measuring the probability that a classifier assigns a higher score to a randomly chosen positive instance than to a randomly chosen negative one. AUC provides a threshold-independent performance measure, particularly useful for comparing different models.
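A minimal sketch covering both the ROC curve and its area with scikit-learn, under the same binary-label and score assumptions as the example above:

import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

y_true = np.array([0, 0, 1, 1, 0, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7])
fpr, tpr, thresholds = roc_curve(y_true, y_score)      # points of the ROC curve
print(roc_auc_score(y_true, y_score))                  # AUC-ROC: P(score(positive) > score(negative))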
Group K-Fold Cross-Validation
A variant of K-Fold that ensures the same group never appears in both the training set and the test set of the same split. This approach is crucial when data has a group structure (patients, users) where observations from the same group are correlated.
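A minimal GroupKFold sketch with scikit-learn; the groups array (for example, one patient ID per row) is an assumed input:

import numpy as np
from sklearn.model_selection import GroupKFold

X = np.arange(16).reshape(8, 2)
y = np.array([0, 1, 0, 1, 0, 1, 0, 1])
groups = np.array([1, 1, 2, 2, 3, 3, 4, 4])      # e.g. patient IDs: rows 0-1 come from one patient
gkf = GroupKFold(n_splits=4)
for train_idx, test_idx in gkf.split(X, y, groups=groups):
    print(np.unique(groups[test_idx]))           # each group appears in exactly one test fold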