Deep Optimization
LARS Optimizer (Layer-wise Adaptive Rate Scaling)
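Optimization method that scales the learning rate of each layer by the ratio of the norm of that layer's weights to the norm of its gradients (the "trust ratio"). Because every layer's step size stays proportional to the size of its weights, it is particularly suitable for training with large batches, where a single global learning rate tends to be too large for some layers and too small for others.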
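The layer-wise scaling described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `lars_step`, its parameter names, and the default values (`trust_coef=0.001`, `momentum=0.9`) are chosen here for illustration, and a layer is represented as a flat list of floats.

```python
import math


def l2_norm(values):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(v * v for v in values))


def lars_step(weights, grads, velocity, base_lr=0.1, trust_coef=0.001,
              weight_decay=0.0, momentum=0.9, eps=1e-12):
    """One LARS-style update for a single layer (illustrative sketch).

    All names and defaults are assumptions made for this example.
    Returns (new_weights, new_velocity, local_lr).
    """
    w_norm = l2_norm(weights)
    g_norm = l2_norm(grads)

    # Trust ratio: norm of the weights over norm of the (decayed) gradient.
    # Fall back to 1.0 when either norm is zero, so the step stays defined.
    if w_norm > 0.0 and g_norm > 0.0:
        local_lr = trust_coef * w_norm / (g_norm + weight_decay * w_norm + eps)
    else:
        local_lr = 1.0

    # Momentum update using the layer-wise learning rate.
    new_velocity = [
        momentum * v + base_lr * local_lr * (g + weight_decay * w)
        for w, g, v in zip(weights, grads, velocity)
    ]
    new_weights = [w - v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity, local_lr


if __name__ == "__main__":
    w = [3.0, 4.0]            # weight norm is 5
    small_g = [0.6, 0.8]      # gradient norm is 1
    large_g = [60.0, 80.0]    # same direction, norm is 100
    v0 = [0.0, 0.0]

    w_small, _, lr_small = lars_step(w, small_g, v0)
    w_large, _, lr_large = lars_step(w, large_g, v0)
    print("local lr (small grad):", lr_small)
    print("local lr (large grad):", lr_large)
    print("updated weights:", w_small, w_large)
```

With weight decay set to zero, the two calls in the usage example move the weights by essentially the same amount even though the gradients differ by a factor of 100: the size of the step depends on the norm of the weights, not on the magnitude of the gradient. In a full network the same function would be applied to each layer separately, each with its own velocity buffer.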