AI Glossary
The complete dictionary of Artificial Intelligence
Gaussian Mixture Model
Probabilistic approach modeling a dataset as a weighted sum of multiple Gaussian distributions to identify latent structures.
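A minimal sketch of sampling from such a mixture (the component weights, means, and standard deviations below are made up for illustration): pick a component according to its weight, then draw from that component's Gaussian.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-D mixture: two components with assumed parameters.
weights = np.array([0.3, 0.7])
means = np.array([-2.0, 3.0])
stds = np.array([0.5, 1.0])

def sample_gmm(n):
    """Draw n samples: pick a component by weight, then sample its Gaussian."""
    comps = rng.choice(len(weights), size=n, p=weights)
    return rng.normal(means[comps], stds[comps])

x = sample_gmm(50_000)
# The sample mean should approach the mixture mean sum(pi_k * mu_k) = 1.5.
```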
EM Algorithm
Iterative parameter estimation method maximizing likelihood in models with latent variables, alternating between E-step (expectation) and M-step (maximization).
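The two alternating steps can be sketched for a one-dimensional, two-component mixture (the synthetic data and the starting guesses are assumptions, not a tuned implementation):

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic data from two hypothetical clusters.
data = np.concatenate([rng.normal(-3, 1, 200), rng.normal(4, 1.5, 300)])

def normal_pdf(x, mu, var):
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

# Initial guesses (arbitrary).
pi = np.array([0.5, 0.5])
mu = np.array([-1.0, 1.0])
var = np.array([1.0, 1.0])

log_liks = []
for _ in range(50):
    # E-step: responsibilities gamma[n, k] for each point and component.
    dens = pi * normal_pdf(data[:, None], mu, var)        # shape (N, 2)
    gamma = dens / dens.sum(axis=1, keepdims=True)
    # M-step: re-estimate weights, means, variances from responsibilities.
    Nk = gamma.sum(axis=0)
    pi = Nk / len(data)
    mu = (gamma * data[:, None]).sum(axis=0) / Nk
    var = (gamma * (data[:, None] - mu) ** 2).sum(axis=0) / Nk
    log_liks.append(np.log(dens.sum(axis=1)).sum())
```

EM guarantees the recorded log-likelihood sequence never decreases, which is a useful sanity check on any implementation.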
Multivariate Gaussian Distribution
Generalization of the normal distribution to multiple dimensions, characterized by a mean vector and covariance matrix defining the probability ellipsoid.
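Its density can be evaluated directly from the mean vector and covariance matrix; a small sketch with an arbitrary 2-D example (mean and covariance below are made up):

```python
import numpy as np

def mvn_pdf(x, mean, cov):
    """Density of a d-dimensional Gaussian at point x."""
    d = len(mean)
    diff = x - mean
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * diff @ np.linalg.solve(cov, diff)) / norm

mean = np.array([0.0, 0.0])
cov = np.array([[2.0, 0.6], [0.6, 1.0]])

# At the mean, the exponent vanishes: the density equals 1 / (2*pi*sqrt(det)).
p = mvn_pdf(mean, mean, cov)
```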
Log-Likelihood
Logarithm of the likelihood function used to avoid numerical underflows and simplify maximization calculations in GMM training.
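Why the logarithm matters can be seen in a few lines (the per-point log densities below are artificial, chosen to force underflow); the second half shows the log-sum-exp trick used when summing over components:

```python
import numpy as np

# Per-point log densities (hypothetical): multiplying the raw likelihoods
# underflows to zero, while summing the logs stays exact.
log_p = np.full(1000, -800.0)

naive = np.prod(np.exp(log_p))   # exp(-800) underflows to 0.0, so prod is 0.0
stable = log_p.sum()             # exact in log space: -800000.0

# Per-point mixture log density via log-sum-exp:
# log sum_k exp(a_k) = m + log sum_k exp(a_k - m), with m = max(a_k).
a = np.array([-1000.0, -1001.0])  # hypothetical log(pi_k * N(x | k)) values
m = a.max()
log_px = m + np.log(np.exp(a - m).sum())
```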
Akaike Information Criterion
Evaluation metric penalizing model complexity to balance data fit and parsimony in selecting the optimal number of components.
Bayesian Information Criterion
Model selection criterion stricter than AIC, applying stronger penalization on the number of parameters to favor simpler models.
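Both criteria follow from the fitted log-likelihood and a free-parameter count; the fit values below are hypothetical, only the formulas and the GMM parameter count are standard:

```python
import numpy as np

def gmm_num_params(k, d):
    """Free parameters of a k-component GMM with full covariances in d dims:
    (k-1) weights + k*d means + k*d*(d+1)/2 covariance entries."""
    return (k - 1) + k * d + k * d * (d + 1) // 2

def aic(log_lik, n_params):
    return 2 * n_params - 2 * log_lik

def bic(log_lik, n_params, n):
    return n_params * np.log(n) - 2 * log_lik

# Hypothetical fit: log-likelihood -1200.0, k=3 components, d=2, n=500 points.
p = gmm_num_params(3, 2)
a_score = aic(-1200.0, p)
b_score = bic(-1200.0, p, 500)
```

Because ln(n) exceeds 2 as soon as n > 7, BIC's penalty is harsher than AIC's on any realistic sample size, which is why it favors simpler models.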
Probabilistic Clustering
Partitioning approach assigning membership probabilities to each cluster rather than binary assignments, enabling soft classification of data.
Covariance Degeneracy
Numerical problem where the covariance matrix becomes singular, requiring regularization techniques or constraints on covariance structure.
Mixture Weights
Parameters πk representing the expected proportion of data in each Gaussian component, constrained to be positive and sum to one.
K-means++ Initialization
Initialization strategy for the EM algorithm that seeds component means with K-means++, spreading initial centers apart to reduce the risk of converging to poor local optima of the likelihood.
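The seeding rule itself is short; a sketch on synthetic data (data and k are arbitrary):

```python
import numpy as np

def kmeanspp_init(X, k, rng):
    """Pick k seeds: the first uniformly at random, each next one with
    probability proportional to its squared distance to the nearest
    already-chosen seed."""
    centers = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[rng.choice(len(X), p=d2 / d2.sum())])
    return np.array(centers)

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))
centers = kmeanspp_init(X, 4, rng)
```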
Diagonal Regularization
Technique adding a small positive value to the diagonal of covariance matrices to ensure their invertibility and numerical stability.
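A minimal illustration with an intentionally singular covariance; the epsilon value is an assumed choice (scikit-learn's GaussianMixture exposes a similar knob as reg_covar):

```python
import numpy as np

# Degenerate case: all points lie on a line, so the covariance is singular.
X = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
cov = np.cov(X.T, bias=True)     # det(cov) is 0; inversion would fail

eps = 1e-6                        # small assumed regularization constant
cov_reg = cov + eps * np.eye(2)   # now positive definite

# Cholesky succeeds only on positive-definite matrices, so this is the check.
L = np.linalg.cholesky(cov_reg)
```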
Algorithm Convergence
Stopping criterion based on the relative change in log-likelihood between successive iterations or on a predefined maximum number of iterations.
Optimal Number of Components
Choice of the number of components K balancing model complexity against goodness-of-fit to the data, typically via cross-validation or information criteria.
Mixture Density
Probability density function obtained as the weighted sum of the individual densities of the model's Gaussian components.
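A sketch with made-up parameters, checking numerically that the weighted sum still integrates to one:

```python
import numpy as np

def gauss(x, mu, var):
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

# Hypothetical two-component model.
weights = np.array([0.4, 0.6])
mus = np.array([-1.0, 2.0])
vars_ = np.array([0.5, 1.0])

def mixture_pdf(x):
    """p(x) = sum_k pi_k * N(x | mu_k, var_k)."""
    return sum(w * gauss(x, m, v) for w, m, v in zip(weights, mus, vars_))

# Riemann sum over a wide grid: since the weights sum to one, so does p(x).
grid = np.linspace(-10, 12, 20001)
dx = grid[1] - grid[0]
total = (mixture_pdf(grid) * dx).sum()
```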
Responsibilities
Values γ(z_nk) representing the probability that observation n belongs to component k, calculated during the E-step of the EM algorithm.
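The E-step formula fits in a few lines (the mixture parameters and evaluation points are illustrative):

```python
import numpy as np

def gauss(x, mu, var):
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

pi = np.array([0.5, 0.5])
mu = np.array([-2.0, 2.0])
var = np.array([1.0, 1.0])
x = np.array([-2.0, 0.0, 2.0])

# gamma[n, k] = pi_k N(x_n | mu_k, var_k) / sum_j pi_j N(x_n | mu_j, var_j)
num = pi * gauss(x[:, None], mu, var)
gamma = num / num.sum(axis=1, keepdims=True)
```

Each row of gamma sums to one; the midpoint between the two means gets a perfectly ambiguous 50/50 split, while points at a component mean are assigned to it almost entirely.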
Method of Moments
Alternative estimation technique to EM using the empirical moments of the data to initialize the mixture model parameters.
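For a single Gaussian the idea reduces to matching the first two empirical moments; mixture variants match higher-order moments to separate the components, but this simplified sketch conveys the principle of seeding parameters from moments:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(5.0, 2.0, 100_000)  # synthetic data with known mean/std

# First raw moment gives the mean; first two together give the variance.
m1 = x.mean()
m2 = (x ** 2).mean()
mu_hat = m1
var_hat = m2 - m1 ** 2
```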