Adaptive Learning Rate Methods
AdaDelta
Extension of AdaGrad that solves the problem of monotonically decreasing learning rates by using a sliding window of past gradients instead of the accumulated sum.
← Indietro