Scaling Laws
Sharpness-Aware Minimization
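Optimization technique that seeks flat minima in the loss landscape by minimizing the worst-case loss within a small neighborhood of the current weights, rather than the loss at a single point. Particularly important for the stability of large models.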
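A minimal sketch of one update step may make the idea concrete. It assumes the commonly used two-step formulation: first move the weights a distance rho along the normalized gradient (an approximation of the worst-case point nearby), then apply the gradient computed there to the original weights. The names sam_step, grad_fn, lr, and rho, as well as the toy quadratic loss, are illustrative assumptions and do not come from any particular library.

```python
import math

def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One sharpness-aware update on a list of weights (illustrative sketch)."""
    g = grad_fn(w)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        return list(w)  # stationary point: no direction to perturb along
    # Step 1: perturb the weights by rho along the normalized gradient.
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    # Step 2: take the gradient at the perturbed point and use it
    # to update the ORIGINAL weights.
    g_adv = grad_fn(w_adv)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]

# Hypothetical toy loss: L(w) = 0.5 * (4 * w0^2 + w1^2)
def toy_loss(w):
    return 0.5 * (4.0 * w[0] ** 2 + w[1] ** 2)

def toy_grad(w):
    return [4.0 * w[0], w[1]]

w = [1.0, 1.0]
for _ in range(50):
    w = sam_step(w, toy_grad)
final_loss = toy_loss(w)
```

Each step costs two gradient evaluations instead of one, which is the main overhead of this approach compared with plain gradient descent. Because the perturbation has fixed length rho, the iterates in this toy setting hover close to the minimum rather than converging to it exactly.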