AI Glossary
The complete dictionary of Artificial Intelligence
Gaussian Mixture Model
Probabilistic approach modeling a dataset as a weighted sum of multiple Gaussian distributions to identify latent structures.
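A minimal sketch of sampling from such a mixture (the component weights, means, and standard deviations below are made up for illustration): pick a component according to its weight, then draw from that component's Gaussian.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-D mixture: two components with assumed parameters.
weights = np.array([0.3, 0.7])
means = np.array([-2.0, 3.0])
stds = np.array([0.5, 1.0])

def sample_gmm(n):
    """Draw n samples: pick a component by weight, then sample its Gaussian."""
    comps = rng.choice(len(weights), size=n, p=weights)
    return rng.normal(means[comps], stds[comps])

x = sample_gmm(50_000)
# The sample mean should approach the mixture mean sum(pi_k * mu_k) = 1.5.
```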
EM Algorithm
Iterative parameter estimation method maximizing likelihood in models with latent variables, alternating between E-step (expectation) and M-step (maximization).
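The two alternating steps can be sketched for a one-dimensional, two-component mixture (the synthetic data and the starting guesses are assumptions, not a tuned implementation):

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic data from two hypothetical clusters.
data = np.concatenate([rng.normal(-3, 1, 200), rng.normal(4, 1.5, 300)])

def normal_pdf(x, mu, var):
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

# Initial guesses (arbitrary).
pi = np.array([0.5, 0.5])
mu = np.array([-1.0, 1.0])
var = np.array([1.0, 1.0])

log_liks = []
for _ in range(50):
    # E-step: responsibilities gamma[n, k] for each point and component.
    dens = pi * normal_pdf(data[:, None], mu, var)        # shape (N, 2)
    gamma = dens / dens.sum(axis=1, keepdims=True)
    # M-step: re-estimate weights, means, variances from responsibilities.
    Nk = gamma.sum(axis=0)
    pi = Nk / len(data)
    mu = (gamma * data[:, None]).sum(axis=0) / Nk
    var = (gamma * (data[:, None] - mu) ** 2).sum(axis=0) / Nk
    log_liks.append(np.log(dens.sum(axis=1)).sum())
```

EM guarantees the recorded log-likelihood sequence never decreases, which is a useful sanity check on any implementation.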
Multivariate Gaussian Distribution
Generalization of the normal distribution to multiple dimensions, characterized by a mean vector and covariance matrix defining the probability ellipsoid.
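Its density can be evaluated directly from the mean vector and covariance matrix; a small sketch with an arbitrary 2-D example (mean and covariance below are made up):

```python
import numpy as np

def mvn_pdf(x, mean, cov):
    """Density of a d-dimensional Gaussian at point x."""
    d = len(mean)
    diff = x - mean
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * diff @ np.linalg.solve(cov, diff)) / norm

mean = np.array([0.0, 0.0])
cov = np.array([[2.0, 0.6], [0.6, 1.0]])

# At the mean, the exponent vanishes: the density equals 1 / (2*pi*sqrt(det)).
p = mvn_pdf(mean, mean, cov)
```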
Log-Likelihood
Logarithm of the likelihood function used to avoid numerical underflows and simplify maximization calculations in GMM training.
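Why the logarithm matters can be seen in a few lines (the per-point log densities below are artificial, chosen to force underflow); the second half shows the log-sum-exp trick used when summing over components:

```python
import numpy as np

# Per-point log densities (hypothetical): multiplying the raw likelihoods
# underflows to zero, while summing the logs stays exact.
log_p = np.full(1000, -800.0)

naive = np.prod(np.exp(log_p))   # exp(-800) underflows to 0.0, so prod is 0.0
stable = log_p.sum()             # exact in log space: -800000.0

# Per-point mixture log density via log-sum-exp:
# log sum_k exp(a_k) = m + log sum_k exp(a_k - m), with m = max(a_k).
a = np.array([-1000.0, -1001.0])  # hypothetical log(pi_k * N(x | k)) values
m = a.max()
log_px = m + np.log(np.exp(a - m).sum())
```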
Akaike Information Criterion
Evaluation metric penalizing model complexity to balance data fit and parsimony in selecting the optimal number of components.
Bayesian Information Criterion
Model selection criterion stricter than AIC, applying stronger penalization on the number of parameters to favor simpler models.
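Both criteria follow from the fitted log-likelihood and a free-parameter count; the fit values below are hypothetical, only the formulas and the GMM parameter count are standard:

```python
import numpy as np

def gmm_num_params(k, d):
    """Free parameters of a k-component GMM with full covariances in d dims:
    (k-1) weights + k*d means + k*d*(d+1)/2 covariance entries."""
    return (k - 1) + k * d + k * d * (d + 1) // 2

def aic(log_lik, n_params):
    return 2 * n_params - 2 * log_lik

def bic(log_lik, n_params, n):
    return n_params * np.log(n) - 2 * log_lik

# Hypothetical fit: log-likelihood -1200.0, k=3 components, d=2, n=500 points.
p = gmm_num_params(3, 2)
a_score = aic(-1200.0, p)
b_score = bic(-1200.0, p, 500)
```

Because ln(n) exceeds 2 as soon as n > 7, BIC's penalty is harsher than AIC's on any realistic sample size, which is why it favors simpler models.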
Probabilistic Clustering
Partitioning approach assigning membership probabilities to each cluster rather than binary assignments, enabling soft classification of data.
Covariance Degeneracy
Numerical problem where the covariance matrix becomes singular, requiring regularization techniques or constraints on covariance structure.
Mixture Weights
Parameters πk representing the expected proportion of data in each Gaussian component, constrained to be positive and sum to one.
K-means++ Initialization
Initialization strategy for the EM algorithm that seeds component means with K-means++, spreading initial centers apart to reduce the risk of converging to poor local optima of the likelihood.
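The seeding rule itself is short; a sketch on synthetic data (data and k are arbitrary):

```python
import numpy as np

def kmeanspp_init(X, k, rng):
    """Pick k seeds: the first uniformly at random, each next one with
    probability proportional to its squared distance to the nearest
    already-chosen seed."""
    centers = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[rng.choice(len(X), p=d2 / d2.sum())])
    return np.array(centers)

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))
centers = kmeanspp_init(X, 4, rng)
```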
Diagonal Regularization
Technique adding a small positive value to the diagonal of covariance matrices to ensure their invertibility and numerical stability.
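A minimal illustration with an intentionally singular covariance; the epsilon value is an assumed choice (scikit-learn's GaussianMixture exposes a similar knob as reg_covar):

```python
import numpy as np

# Degenerate case: all points lie on a line, so the covariance is singular.
X = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
cov = np.cov(X.T, bias=True)     # det(cov) is 0; inversion would fail

eps = 1e-6                        # small assumed regularization constant
cov_reg = cov + eps * np.eye(2)   # now positive definite

# Cholesky succeeds only on positive-definite matrices, so this is the check.
L = np.linalg.cholesky(cov_reg)
```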
Algorithm Convergence
Stopping criterion based on the relative change in log-likelihood between successive iterations or on a predefined maximum number of iterations.
Optimal Number of Components
Choice of the number of components K balancing model complexity against goodness-of-fit to the data, typically via cross-validation or information criteria.
Mixture Density
Probability density function obtained as the weighted sum of the individual densities of the model's Gaussian components.
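A sketch with made-up parameters, checking numerically that the weighted sum still integrates to one:

```python
import numpy as np

def gauss(x, mu, var):
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

# Hypothetical two-component model.
weights = np.array([0.4, 0.6])
mus = np.array([-1.0, 2.0])
vars_ = np.array([0.5, 1.0])

def mixture_pdf(x):
    """p(x) = sum_k pi_k * N(x | mu_k, var_k)."""
    return sum(w * gauss(x, m, v) for w, m, v in zip(weights, mus, vars_))

# Riemann sum over a wide grid: since the weights sum to one, so does p(x).
grid = np.linspace(-10, 12, 20001)
dx = grid[1] - grid[0]
total = (mixture_pdf(grid) * dx).sum()
```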
Responsibilities
Values γ(z_nk) representing the probability that observation n belongs to component k, calculated during the E-step of the EM algorithm.
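The E-step formula fits in a few lines (the mixture parameters and evaluation points are illustrative):

```python
import numpy as np

def gauss(x, mu, var):
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

pi = np.array([0.5, 0.5])
mu = np.array([-2.0, 2.0])
var = np.array([1.0, 1.0])
x = np.array([-2.0, 0.0, 2.0])

# gamma[n, k] = pi_k N(x_n | mu_k, var_k) / sum_j pi_j N(x_n | mu_j, var_j)
num = pi * gauss(x[:, None], mu, var)
gamma = num / num.sum(axis=1, keepdims=True)
```

Each row of gamma sums to one; the midpoint between the two means gets a perfectly ambiguous 50/50 split, while points at a component mean are assigned to it almost entirely.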
Method of Moments
Alternative estimation technique to EM using the empirical moments of the data to initialize the mixture model parameters.
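For a single Gaussian the idea reduces to matching the first two empirical moments; mixture variants match higher-order moments to separate the components, but this simplified sketch conveys the principle of seeding parameters from moments:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(5.0, 2.0, 100_000)  # synthetic data with known mean/std

# First raw moment gives the mean; first two together give the variance.
m1 = x.mean()
m2 = (x ** 2).mean()
mu_hat = m1
var_hat = m2 - m1 ** 2
```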