AI Glossary
The complete dictionary of Artificial Intelligence
Mini-batch Gradient Descent
Variant of SGD that computes the gradient at each iteration on a small subset of the data (a mini-batch), offering a compromise between pure single-example SGD and full-batch gradient descent.
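As an illustration, here is a minimal sketch of mini-batch gradient descent, assuming a one-dimensional linear model and a mean squared error loss. The function name `minibatch_gd`, the toy data, and all hyperparameter values are illustrative assumptions, not part of the glossary entry.

```python
import random

def minibatch_gd(xs, ys, lr=0.05, batch_size=4, epochs=200, seed=0):
    """Fit y = w*x + b by mini-batch gradient descent on mean squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # draw new random batches each epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Gradient of the squared error, averaged over this batch only
            grad_w = sum(2 * (w * xs[i] + b - ys[i]) * xs[i] for i in batch) / len(batch)
            grad_b = sum(2 * (w * xs[i] + b - ys[i]) for i in batch) / len(batch)
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b

# Toy data (assumed): 20 noiseless points on the line y = 3x + 1
xs = [i / 10 for i in range(20)]
ys = [3.0 * x + 1.0 for x in xs]
w, b = minibatch_gd(xs, ys)
```

Setting `batch_size=1` would give pure SGD, and `batch_size=len(xs)` would give full-batch gradient descent, so the same loop covers both extremes.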
Momentum
SGD acceleration technique that adds a fraction of the previous update vector to the current update, damping oscillations and accelerating convergence along directions where the gradient is consistent.
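The update rule can be sketched as follows, assuming a simple elongated quadratic bowl as the loss. The loss function, the name `momentum_descent`, and the values of `lr` and `beta` are illustrative assumptions.

```python
import math

def grad(p):
    # Gradient of the assumed loss f(x, y) = 0.5 * (x**2 + 50 * y**2)
    x, y = p
    return (x, 50.0 * y)

def momentum_descent(p, lr=0.02, beta=0.9, steps=200):
    v = (0.0, 0.0)  # previous update vector, initially zero
    for _ in range(steps):
        g = grad(p)
        # New update = fraction beta of the previous update + plain gradient step
        v = tuple(beta * vi - lr * gi for vi, gi in zip(v, g))
        p = tuple(pi + vi for pi, vi in zip(p, v))
    return p

start = (10.0, 1.0)
with_momentum = momentum_descent(start)          # beta = 0.9
without_momentum = momentum_descent(start, beta=0.0)  # plain gradient descent

# Distance from the minimum at the origin after the same number of steps
dist_momentum = math.hypot(*with_momentum)
dist_plain = math.hypot(*without_momentum)
```

On this particular bowl, the momentum run should end closer to the minimum than the plain run, because the slow `x` direction accumulates velocity over successive steps.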
Learning Rate Schedule
Strategy that dynamically adjusts the learning rate during training to improve convergence, including step decay, exponential decay, and cosine annealing approaches.
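The three schedules named above can be sketched as plain functions of the epoch number. The function names, parameter names, and default values here are illustrative assumptions; deep learning libraries expose their own scheduler APIs with different signatures.

```python
import math

def step_decay(lr0, epoch, drop=0.5, every=10):
    # Multiply the rate by `drop` once every `every` epochs
    return lr0 * drop ** (epoch // every)

def exponential_decay(lr0, epoch, k=0.05):
    # Smooth decay: lr0 * exp(-k * epoch)
    return lr0 * math.exp(-k * epoch)

def cosine_annealing(lr0, epoch, total_epochs, lr_min=0.0):
    # Follow half a cosine wave from lr0 down to lr_min over total_epochs
    cosine = math.cos(math.pi * epoch / total_epochs)
    return lr_min + 0.5 * (lr0 - lr_min) * (1 + cosine)

# Learning rate at a few epochs under each schedule, starting from 0.1
schedule = [
    (e, step_decay(0.1, e), exponential_decay(0.1, e), cosine_annealing(0.1, e, 100))
    for e in (0, 10, 50, 100)
]
```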
Exploding Gradient Problem
Problem where gradients become excessively large during training, causing unstable parameter updates and divergence of the learning algorithm.
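One way to see how gradients can blow up is a toy model, assumed here purely for illustration: a chain of scalar linear layers `h_{t+1} = w * h_t`. By the chain rule, the gradient of the last activation with respect to the first is `w` multiplied by itself once per layer, so it grows exponentially with depth whenever `|w| > 1`. The function name is hypothetical.

```python
def gradient_through_depth(w, depth):
    """Derivative of h_depth with respect to h_0 for the chain h_{t+1} = w * h_t."""
    grad = 1.0
    for _ in range(depth):
        grad *= w  # chain rule: each layer multiplies the gradient by w
    return grad

stable = gradient_through_depth(1.0, 50)     # stays at 1.0
exploding = gradient_through_depth(1.5, 50)  # 1.5 ** 50, hundreds of millions
```

Real networks use weight matrices and nonlinearities rather than a single scalar, but the same repeated multiplication is what drives the instability.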
Local Minima
Points in the parameter space where the loss function is lower than at every nearby point, though not necessarily lower than everywhere else in the domain.
Global Optima
Points in the parameter space where the loss function attains its lowest value over the entire domain, representing the optimal solutions of the optimization problem.
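The difference between a local and a global minimum can be illustrated with a one-dimensional loss that has two valleys. The function `f(x) = x**4 - 3*x**2 + x`, the step size, and the starting points are assumptions chosen for this sketch.

```python
def f(x):
    # Assumed toy loss with two valleys of different depth
    return x**4 - 3 * x**2 + x

def df(x):
    # Derivative of f
    return 4 * x**3 - 6 * x + 1

def descend(x, lr=0.01, steps=500):
    for _ in range(steps):
        x -= lr * df(x)  # plain gradient descent step
    return x

x_left = descend(-2.0)   # settles in the deeper valley near x = -1.30
x_right = descend(2.0)   # settles in the shallower valley near x = 1.13
```

Both runs stop where the derivative vanishes, but `f(x_left)` is lower than `f(x_right)`: the right-hand run is trapped in a local minimum, while the left-hand run reaches the global one.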
Nesterov Accelerated Gradient
Improved variant of momentum that computes the gradient at the anticipated (look-ahead) position rather than at the current position, with provably faster convergence on smooth convex problems.
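A minimal sketch of the look-ahead step, using one common formulation of Nesterov momentum and assuming a simple quadratic bowl as the loss. The loss, the name `nesterov_descent`, and the values of `lr` and `beta` are illustrative assumptions.

```python
import math

def grad(p):
    # Gradient of the assumed loss f(x, y) = 0.5 * (x**2 + 50 * y**2)
    x, y = p
    return (x, 50.0 * y)

def nesterov_descent(p, lr=0.02, beta=0.9, steps=200):
    v = (0.0, 0.0)  # previous update vector, initially zero
    for _ in range(steps):
        # Anticipated position: where the momentum alone would carry us
        lookahead = tuple(pi + beta * vi for pi, vi in zip(p, v))
        g = grad(lookahead)  # gradient evaluated there, not at p
        v = tuple(beta * vi - lr * gi for vi, gi in zip(v, g))
        p = tuple(pi + vi for pi, vi in zip(p, v))
    return p

final = nesterov_descent((10.0, 1.0))
distance = math.hypot(*final)  # distance from the minimum at the origin
```

The only change from classical momentum is the point at which `grad` is evaluated; everything else in the loop is the same.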