Stochastic Gradient Descent (SGD)
Nesterov Accelerated Gradient
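Variant of momentum that computes the gradient at an anticipated (look-ahead) position, i.e. the current parameters plus the pending momentum step, rather than at the current position. For smooth convex objectives with exact gradients, this improves the worst-case convergence rate from O(1/k) to O(1/k^2) after k iterations; with noisy stochastic gradients the guarantee does not carry over in general.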
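As a rough illustration of the look-ahead idea, here is a minimal sketch of one common formulation of the update. The function names (`nesterov_step`, `quadratic_grad`), the default learning rate and momentum, and the toy quadratic objective are all assumptions made for this example, not part of the original entry.

```python
def nesterov_step(params, velocity, grad_fn, lr=0.1, momentum=0.9):
    """One Nesterov update (illustrative sketch; names and defaults are assumed)."""
    # Look ahead: where the momentum term alone would carry the parameters.
    lookahead = [p + momentum * v for p, v in zip(params, velocity)]
    # Evaluate the gradient at the anticipated position, not at params.
    grads = grad_fn(lookahead)
    # Update the velocity, then move the parameters along it.
    new_velocity = [momentum * v - lr * g for v, g in zip(velocity, grads)]
    new_params = [p + v for p, v in zip(params, new_velocity)]
    return new_params, new_velocity


def quadratic_grad(x):
    # Gradient of the toy objective f(x) = 0.5 * sum(x_i ** 2).
    return list(x)


params, velocity = [5.0, -3.0], [0.0, 0.0]
for _ in range(200):
    params, velocity = nesterov_step(params, velocity, quadratic_grad)
```

Classical momentum would differ only in calling `grad_fn(params)` instead of `grad_fn(lookahead)`; in a stochastic setting, `grad_fn` would return a mini-batch gradient estimate.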