AI Glossary
The complete dictionary of Artificial Intelligence
Linear Regression
Statistical model that establishes a linear relationship between a dependent variable and one or more independent variables by minimizing the sum of squared residuals. This model is considered a white box because the coefficients can be directly interpreted as the impact of each variable on the prediction.
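As a minimal sketch of the least-squares idea above, the following fits a single-predictor model in closed form. The function name `fit_simple_ols` and the toy data are illustrative assumptions, not part of the glossary.

```python
def fit_simple_ols(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form solution for one predictor: slope = cov(x, y) / var(x).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Toy data generated exactly from y = 1 + 2x (hypothetical example).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
intercept, slope = fit_simple_ols(xs, ys)
print(f"y = {intercept:.2f} + {slope:.2f} * x")
```

The slope reads directly as the change in the prediction per unit change in x, which is the sense in which the coefficients are interpretable.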
K-Nearest Neighbors (KNN)
Supervised learning algorithm that classifies a new sample by the majority class among its k nearest neighbors in the feature space. Individual predictions are interpretable because they can be explained by explicitly showing the neighbors used for the decision, although the model offers no compact global summary of what it has learned.
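The majority vote can be sketched as follows, assuming Euclidean distance and a small in-memory training set. The helper `knn_predict` and the data are hypothetical; the function returns the neighbors alongside the label so the prediction can be explained.

```python
from collections import Counter
import math


def knn_predict(train, query, k=3):
    """train: list of (features, label). Returns (predicted_label, neighbors)."""
    # Sort training samples by Euclidean distance to the query point.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    neighbors = by_distance[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0], neighbors


train = [
    ((1.0, 1.0), "a"),
    ((1.2, 0.8), "a"),
    ((5.0, 5.0), "b"),
    ((5.5, 4.5), "b"),
    ((0.9, 1.1), "a"),
]
label, neighbors = knn_predict(train, (1.0, 0.9), k=3)
print(label, neighbors)
```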
Association Rules
Method for discovering relationships between variables in large databases, typically represented in IF-THEN form with support and confidence measures. These rules are inherently interpretable because they directly express understandable logical relationships between attributes.
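Support and confidence for a single IF-THEN rule can be computed directly from transaction data. This is a sketch on a hypothetical four-transaction basket dataset; it does not implement a rule-mining algorithm such as Apriori.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule IF antecedent THEN consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
rule_support = support(transactions, {"bread", "milk"})
rule_confidence = confidence(transactions, {"bread"}, {"milk"})
print(f"IF bread THEN milk: support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```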
Generalized Linear Model (GLM)
Extension of linear regression that allows for response distributions other than normal and nonlinear link functions, while maintaining an additive structure on the scale of the link. GLMs remain interpretable because coefficients can be transformed to reveal the effect of each predictor; for example, exponentiated coefficients are odds ratios under a logit link and rate ratios under a log link.
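The role of the link function can be illustrated with a small sketch: the linear predictor is always additive, and the inverse link maps it to the mean of the response. The names `INVERSE_LINKS` and `glm_mean` are assumptions made for this example, and the coefficients are supplied by hand rather than estimated.

```python
import math

# Inverse link functions mapping the linear predictor eta to the response mean.
INVERSE_LINKS = {
    "identity": lambda eta: eta,                        # ordinary linear regression
    "log": lambda eta: math.exp(eta),                   # e.g. Poisson regression
    "logit": lambda eta: 1.0 / (1.0 + math.exp(-eta)),  # e.g. logistic regression
}


def glm_mean(coefs, intercept, x, link):
    """Mean response for feature vector x under the given link."""
    eta = intercept + sum(b * xi for b, xi in zip(coefs, x))  # additive structure
    return INVERSE_LINKS[link](eta)


print(glm_mean([2.0], 1.0, [3.0], "identity"))
print(glm_mean([0.0], 0.0, [1.0], "logit"))
```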
Generalized Additive Model (GAM)
Extension of GLMs where the prediction is a sum of smooth functions of individual variables rather than linear terms. GAMs offer high interpretability because they allow visualization of the separate effect of each variable on the prediction while capturing nonlinear relationships.
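The additive structure can be sketched with hand-written shape functions standing in for the smooths that a fitting routine would estimate. Everything here (feature names, shape functions, intercept) is hypothetical; the point is only that each feature's contribution can be read off separately.

```python
import math

# Hypothetical shape functions; a real GAM would estimate these from data.
SHAPE_FUNCTIONS = {
    "temperature": lambda t: -0.1 * (t - 20.0) ** 2,
    "hour": lambda h: 2.0 * math.sin(math.pi * h / 12.0),
}


def gam_predict(intercept, sample):
    """Return (prediction, per-feature contributions) for an identity link."""
    contributions = {name: f(sample[name]) for name, f in SHAPE_FUNCTIONS.items()}
    return intercept + sum(contributions.values()), contributions


total, parts = gam_predict(10.0, {"temperature": 20.0, "hour": 6.0})
print(total, parts)
```

Plotting each shape function over its feature's range gives the per-variable effect curves mentioned above.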
Linear Discriminant Analysis (LDA)
Classification method that seeks to find a linear combination of features that best separates two or more classes by maximizing the ratio of between-class variance to within-class variance. Interpretability comes from the eigenvectors that indicate the most discriminative directions in the feature space.
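For two classes, the eigenvector problem reduces to a closed form: the discriminant direction is proportional to the inverse within-class scatter matrix applied to the difference of the class means. The sketch below assumes two-dimensional features and a non-singular scatter matrix; the function names and data are illustrative.

```python
def mean_vec(points):
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]


def scatter(points, m):
    """2x2 scatter matrix of `points` around mean `m`."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - m[0], p[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_direction(class0, class1):
    """Direction w = Sw^-1 (m1 - m0) for two classes in 2-D."""
    m0, m1 = mean_vec(class0), mean_vec(class1)
    s0, s1 = scatter(class0, m0), scatter(class1, m1)
    sw = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]  # assumed non-zero
    inv = [[sw[1][1] / det, -sw[0][1] / det], [-sw[1][0] / det, sw[0][0] / det]]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]


class0 = [(1.0, 2.0), (2.0, 1.0), (2.0, 3.0), (3.0, 2.0)]
class1 = [(6.0, 2.0), (7.0, 1.0), (7.0, 3.0), (8.0, 2.0)]
w = fisher_direction(class0, class1)
print(w)
```

Here the classes differ only along the first axis, so the direction has a zero second component.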
CART Trees
Decision tree construction algorithm that uses the Gini index for classification and mean squared error for regression, with binary splits at each node. The binary structure of CART trees facilitates interpretation of decision paths and extracted rules.
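A sketch of the classification criterion: Gini impurity, and an exhaustive search for the best binary threshold on one numeric feature. It covers a single split only, not recursive tree growth or pruning, and the names and data are assumptions for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_binary_split(values, labels):
    """Return (threshold, weighted_gini) for the best split `value <= threshold`."""
    best = (None, float("inf"))
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [y for v, y in zip(values, labels) if v <= threshold]
        right = [y for v, y in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best


values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = ["no", "no", "no", "yes", "yes", "yes"]
threshold, impurity = best_binary_split(values, labels)
print(threshold, impurity)
```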
ID3 Algorithm
Historical decision tree construction algorithm that uses information gain based on entropy to select splitting attributes. ID3 works on categorical attributes and creates one branch per attribute value, producing highly interpretable trees where each path represents a clear decision rule.
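Entropy and information gain for a categorical attribute can be sketched as below. The rows, attribute names, and helper functions are hypothetical, and only the attribute-selection step is shown, not the full recursive algorithm.

```python
from collections import Counter, defaultdict
import math


def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting on `attribute`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[target] for row in rows]) - remainder


rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rain", "windy": "no", "play": "yes"},
    {"outlook": "rain", "windy": "yes", "play": "yes"},
]
gain_outlook = information_gain(rows, "outlook", "play")
gain_windy = information_gain(rows, "windy", "play")
print(gain_outlook, gain_windy)
```

In this toy data `outlook` separates the classes perfectly while `windy` carries no information, so the algorithm would split on `outlook` first.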
C4.5 Algorithm
Improvement of the ID3 algorithm that uses the information gain ratio to avoid bias towards attributes with many values, and handles continuous attributes and missing values. C4.5 also prunes the tree after construction, and the resulting trees remain readable as explicit decision rules.
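The gain ratio divides information gain by the entropy of the split itself, which penalizes attributes with many distinct values. A minimal sketch on hypothetical data, where an identifier column and a genuine attribute have the same raw gain:

```python
from collections import Counter, defaultdict
import math


def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(rows, attribute, target):
    """Information gain divided by split information (entropy of the attribute)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    gain = entropy([row[target] for row in rows]) - remainder
    split_info = entropy([row[attribute] for row in rows])
    return gain / split_info if split_info > 0 else 0.0


rows = [
    {"id": 1, "outlook": "sunny", "play": "no"},
    {"id": 2, "outlook": "sunny", "play": "no"},
    {"id": 3, "outlook": "rain", "play": "yes"},
    {"id": 4, "outlook": "rain", "play": "yes"},
]
ratio_id = gain_ratio(rows, "id", "play")
ratio_outlook = gain_ratio(rows, "outlook", "play")
print(ratio_id, ratio_outlook)
```

Both attributes reach the maximum information gain, but the unique `id` column gets a lower gain ratio because its split information is larger.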
CHAID Algorithm
Decision tree construction algorithm that selects splits with chi-square tests when the target variable is categorical and F-tests when it is continuous, producing multi-way splits rather than binary ones. CHAID produces particularly interpretable trees for survey and marketing data.
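The statistic behind the chi-square test can be sketched for a contingency table of predictor category by target class. This computes only the Pearson statistic; the conversion to a p-value and the category-merging steps of CHAID are omitted, and the function name is an assumption.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


independent = chi_square_statistic([[10, 10], [10, 10]])
associated = chi_square_statistic([[20, 0], [0, 20]])
print(independent, associated)
```

A larger statistic indicates a stronger association between the predictor and the target, which makes the predictor a better candidate for the split.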
Decision List
Classification structure represented as an ordered sequence of IF-THEN rules, where each rule is tested sequentially until one is satisfied. Decision lists are often easier to read than trees because they present a linear decision flow rather than a branching structure, although each rule applies only when all earlier rules have failed.
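The sequential evaluation can be sketched as a loop over ordered rules with a default at the end. The rules, thresholds, and labels below are invented for illustration; the function also returns the rule that fired so the prediction can be explained.

```python
def decision_list_predict(rules, default, sample):
    """Return (label, description) of the first rule whose condition holds."""
    for description, condition, label in rules:
        if condition(sample):
            return label, description
    return default, "default rule"


# Hypothetical triage rules, tested in order.
rules = [
    ("IF temperature > 39 THEN urgent",
     lambda s: s["temperature"] > 39, "urgent"),
    ("IF cough AND temperature > 37.5 THEN visit",
     lambda s: s["cough"] and s["temperature"] > 37.5, "visit"),
]

print(decision_list_predict(rules, "home", {"temperature": 40.0, "cough": True}))
print(decision_list_predict(rules, "home", {"temperature": 38.0, "cough": True}))
print(decision_list_predict(rules, "home", {"temperature": 36.5, "cough": False}))
```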
Rule-based Model
Classification or regression system that uses a set of logical rules to make predictions, often organized as a covering set or decision list. These models are among the most interpretable because each prediction can be explained by one or more explicit rules understandable by non-experts.
Simple Perceptron
Binary linear classification algorithm that learns a separating hyperplane by iteratively adjusting weights based on classification errors. Because it is a single linear unit, the perceptron is interpretable: the weights can be examined to understand the importance and direction of each feature's influence.
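The error-driven update can be sketched as follows, assuming labels in {-1, +1} and linearly separable data (here a logical AND). The function names, learning rate, and epoch limit are illustrative choices.

```python
def perceptron_classify(weights, bias, x):
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation > 0 else -1


def train_perceptron(samples, labels, epochs=20, lr=1.0):
    """Classic perceptron rule: update weights only on misclassified samples."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for x, y in zip(samples, labels):
            if perceptron_classify(weights, bias, x) != y:
                # Move the hyperplane toward the correct side for this sample.
                weights = [w + lr * y * xi for w, xi in zip(weights, x)]
                bias += lr * y
                errors += 1
        if errors == 0:  # converged: every training sample is classified correctly
            break
    return weights, bias


samples = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
labels = [-1, -1, -1, 1]
weights, bias = train_perceptron(samples, labels)
print(weights, bias)
```

The sign of each learned weight shows the direction in which the corresponding feature pushes the decision.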
Poisson Regression
Regression model for count data that assumes the response variable follows a Poisson distribution, with a log link for the mean function. The model's exponentiated coefficients can be interpreted directly as multipliers of the expected event rate (rate ratios) for a one-unit increase in the corresponding predictor.
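The multiplicative reading of the coefficients follows from the log link, as this small sketch shows. The coefficient value is hypothetical (chosen as ln 2 so that the rate doubles) and is not estimated from data.

```python
import math


def expected_count(intercept, coefs, x):
    """Mean of the Poisson response under a log link: exp(linear predictor)."""
    return math.exp(intercept + sum(b * xi for b, xi in zip(coefs, x)))


def rate_ratio(coef, delta=1.0):
    """Multiplier of the expected count when the predictor increases by `delta`."""
    return math.exp(coef * delta)


coef = math.log(2.0)  # hypothetical fitted coefficient
baseline = expected_count(0.5, [coef], [0.0])
shifted = expected_count(0.5, [coef], [1.0])
print(shifted / baseline, rate_ratio(coef))
```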
Stochastic Gradient Boosting (SGB)
Ensemble method that combines simple models (often shallow trees) sequentially, fitting each new model to the residual errors of the current ensemble on a random subsample of the training data. Although powerful, SGB with shallow trees retains only some interpretability: the contribution of each individual tree can be inspected, but the sum of many trees is harder to read than a single tree.
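A minimal sketch of the idea for one numeric feature and squared-error loss, using depth-one trees (stumps) fitted to residuals on a random subsample drawn without replacement at each round. All names, hyperparameters, and data are assumptions for illustration, not a reference implementation.

```python
import random


def fit_stump(xs, residuals):
    """Least-squares regression stump: returns (threshold, left_value, right_value)."""
    best = None
    distinct = sorted(set(xs))
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # all sampled x values identical: fall back to a constant
        m = sum(residuals) / len(residuals)
        return (float("inf"), m, m)
    return best[1:]


def stump_predict(stump, x):
    t, left_value, right_value = stump
    return left_value if x <= t else right_value


def fit_sgb(xs, ys, n_trees=50, learning_rate=0.1, subsample=0.5, seed=0):
    rng = random.Random(seed)
    base = sum(ys) / len(ys)  # initial constant prediction
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_trees):
        # Stochastic step: each stump sees only a random subsample.
        idx = rng.sample(range(len(xs)), max(2, int(subsample * len(xs))))
        # For squared loss, the negative gradient is the residual.
        residuals = [ys[i] - preds[i] for i in idx]
        stump = fit_stump([xs[i] for i in idx], residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, x) for p, x in zip(preds, xs)]
    return base, learning_rate, stumps


def sgb_predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, x) for s in stumps)


xs = [float(i) for i in range(10)]
ys = [0.0] * 5 + [10.0] * 5
model = fit_sgb(xs, ys)
print(sgb_predict(model, 0.0), sgb_predict(model, 9.0))
```

Each stump in `model` can be printed on its own, which is the per-tree view of the ensemble's contributions.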