Adaptive Learning Rate Methods
Beta Parameters (Adam)
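Hyperparameters β1 and β2 of the Adam optimizer. They set the exponential decay rates of the moving averages of the gradient (first moment) and of the squared gradient (second raw moment, i.e. the uncentered variance), respectively. Values closer to 1 average over a longer history of gradients; the defaults suggested in the original Adam paper are β1 = 0.9 and β2 = 0.999.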
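To make the role of the two decay rates concrete, here is a minimal sketch of a single Adam update for one scalar parameter. The function name `adam_step`, its argument names, and the toy objective f(x) = x² are illustrative assumptions rather than any particular library's API; the default values follow the commonly cited Adam settings.

```python
import math


def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter (illustrative sketch).

    m, v are the running first- and second-moment estimates;
    t is the 1-based step counter used for bias correction.
    """
    # beta1 controls how quickly the gradient average forgets old gradients.
    m = beta1 * m + (1.0 - beta1) * grad
    # beta2 does the same for the average of squared gradients.
    v = beta2 * v + (1.0 - beta2) * grad * grad
    # Both averages start at zero, so early estimates are biased toward zero;
    # dividing by (1 - beta**t) corrects for that.
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    # The step is scaled by the square root of the second-moment estimate.
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


if __name__ == "__main__":
    # Hypothetical toy problem: minimize f(x) = x**2, whose gradient is 2*x.
    x, m, v = 1.0, 0.0, 0.0
    for t in range(1, 201):
        x, m, v = adam_step(x, 2.0 * x, m, v, t, lr=0.05)
    print(x)
```

With β1 = 0.9 the first-moment average weights roughly the last 1 / (1 − β1) = 10 gradients, while β2 = 0.999 spreads the second-moment average over roughly the last 1000, which is why the step size reacts slowly to changes in gradient magnitude.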