Stochastic Gradient Descent (SGD)
Learning Rate Schedule
A strategy that adjusts the learning rate over the course of training to improve convergence. Common schedules include step decay, exponential decay, and cosine annealing.
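The three schedules named above can be sketched as plain functions of the epoch index. This is a minimal illustration, not any library's API: the function names, the parameters (`drop`, `every`, `k`, `lr_min`, `total_epochs`), and their default values are all assumptions chosen for the example.

```python
import math


def step_decay(lr0, epoch, drop=0.5, every=10):
    """Multiply the rate by `drop` once every `every` epochs (assumed defaults)."""
    return lr0 * drop ** (epoch // every)


def exponential_decay(lr0, epoch, k=0.05):
    """Shrink the rate smoothly as lr0 * exp(-k * epoch); k is an assumed decay constant."""
    return lr0 * math.exp(-k * epoch)


def cosine_annealing(lr0, epoch, total_epochs, lr_min=0.0):
    """Follow half a cosine wave from lr0 at epoch 0 down to lr_min at total_epochs."""
    cosine = math.cos(math.pi * epoch / total_epochs)
    return lr_min + 0.5 * (lr0 - lr_min) * (1 + cosine)


if __name__ == "__main__":
    # Print each schedule at a few epochs to compare their shapes.
    for epoch in (0, 10, 50, 100):
        print(
            epoch,
            step_decay(0.1, epoch),
            exponential_decay(0.1, epoch),
            cosine_annealing(0.1, epoch, total_epochs=100),
        )
```

Step decay changes the rate in discrete jumps, exponential decay shrinks it continuously, and cosine annealing decreases it slowly at first, faster in the middle, and slowly again near the end of training.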