Momentum-based Optimization - Artificial Intelligence Glossary

RMSprop

Adaptive optimization technique that divides the learning rate by the square root of an exponential moving average of recent squared gradients, so that parameters with large-magnitude gradients take proportionally smaller steps.
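
The update described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function name `rmsprop_step`, the list-based state, and the default values for `lr`, `decay`, and `eps` are assumptions chosen for the example.

```python
import math


def rmsprop_step(params, grads, avg_sq, lr=0.01, decay=0.9, eps=1e-8):
    """One RMSprop update on plain lists; returns (new_params, new_avg_sq).

    All names and defaults here are illustrative assumptions.
    """
    new_params, new_avg_sq = [], []
    for p, g, v in zip(params, grads, avg_sq):
        # Exponential moving average of the squared gradient.
        v = decay * v + (1.0 - decay) * g * g
        # Divide the step by the root of that average (eps avoids division by zero).
        p = p - lr * g / (math.sqrt(v) + eps)
        new_params.append(p)
        new_avg_sq.append(v)
    return new_params, new_avg_sq
```

On the first step, a gradient of 100 and a gradient of 0.01 move their parameters by almost the same amount, which is the normalization the definition refers to.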

Adagrad

Adaptive optimization algorithm that scales each parameter's learning rate by the inverse square root of its accumulated squared historical gradients, so that infrequently updated parameters receive larger updates.
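
A small sketch may make the accumulation concrete. The function name `adagrad_step`, the list-based state, and the defaults for `lr` and `eps` are assumptions made for this example only.

```python
import math


def adagrad_step(params, grads, sum_sq, lr=0.1, eps=1e-8):
    """One Adagrad update on plain lists; returns (new_params, new_sum_sq).

    Names and defaults are illustrative assumptions.
    """
    new_params, new_sum_sq = [], []
    for p, g, s in zip(params, grads, sum_sq):
        # The sum of squared gradients only ever grows.
        s = s + g * g
        # Parameters with a small accumulated sum keep a large effective step.
        p = p - lr * g / (math.sqrt(s) + eps)
        new_params.append(p)
        new_sum_sq.append(s)
    return new_params, new_sum_sq
```

If one parameter receives a gradient on every step and another only occasionally, the rarely updated one takes the larger step when its gradient finally arrives, because its accumulator is still small.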

Adadelta

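Extension of Adagrad that counters the drastic decay of the learning rate by replacing the full sum of past squared gradients with an exponential moving average, which acts as a fixed-size window over recent gradients. It also scales each step by a running average of past updates, removing the need for a manually set learning rate.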
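
As a rough sketch, a single-parameter Adadelta step could look like the following. The function name `adadelta_step` and the defaults `rho=0.95` and `eps=1e-6` are assumptions for illustration; note that no learning rate appears in the signature.

```python
import math


def adadelta_step(p, g, avg_sq_grad, avg_sq_delta, rho=0.95, eps=1e-6):
    """One Adadelta update for a single parameter.

    Returns (new_p, new_avg_sq_grad, new_avg_sq_delta).
    Names and defaults are illustrative assumptions.
    """
    # Moving average of squared gradients (the "window" over recent gradients).
    avg_sq_grad = rho * avg_sq_grad + (1.0 - rho) * g * g
    # Step scaled by the ratio of RMS(past updates) to RMS(recent gradients).
    delta = -math.sqrt(avg_sq_delta + eps) / math.sqrt(avg_sq_grad + eps) * g
    # Moving average of squared updates, used by the next step.
    avg_sq_delta = rho * avg_sq_delta + (1.0 - rho) * delta * delta
    return p + delta, avg_sq_grad, avg_sq_delta
```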

Adamax

Variant of Adam that scales updates by an exponentially weighted infinity norm of past gradients instead of the L2 norm, which can give more numerically stable updates in some scenarios.
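
The difference from Adam is easiest to see in code: the second-moment average is replaced by a running maximum. The sketch below is a single-parameter illustration; the name `adamax_step`, the defaults, and the small `eps` added to the denominator as a division guard are assumptions.

```python
def adamax_step(p, g, m, u, t, lr=2e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adamax update for a single parameter; t is the 1-based step count.

    Returns (new_p, new_m, new_u). Names and defaults are illustrative
    assumptions; eps is added here only to guard against u == 0.
    """
    # First moment: same exponential moving average as in Adam.
    m = beta1 * m + (1.0 - beta1) * g
    # Infinity-norm statistic: decayed running maximum of |g|.
    u = max(beta2 * u, abs(g))
    # Only the first moment needs bias correction.
    p = p - (lr / (1.0 - beta1 ** t)) * m / (u + eps)
    return p, m, u
```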

Nadam

Combination of Nesterov accelerated gradient and Adam that incorporates Nesterov's acceleration into Adam's adaptive framework for faster and more stable convergence.

AMSGrad

Modification of Adam that keeps the running maximum of the exponential moving averages of the squared gradients, so the effective per-parameter learning rate never increases. This restores the theoretical convergence guarantee that Adam can lack and avoids its potential divergence.
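
A minimal single-parameter sketch of the running-maximum idea follows. The name `amsgrad_step` and the defaults are assumptions, and bias correction is left out to keep the example short.

```python
import math


def amsgrad_step(p, g, m, v, v_max, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One AMSGrad update for a single parameter (no bias correction).

    Returns (new_p, new_m, new_v, new_v_max).
    Names and defaults are illustrative assumptions.
    """
    m = beta1 * m + (1.0 - beta1) * g
    v = beta2 * v + (1.0 - beta2) * g * g
    # Keep the largest second-moment estimate seen so far.
    v_max = max(v_max, v)
    # Dividing by the maximum means the effective step size cannot grow back.
    p = p - lr * m / (math.sqrt(v_max) + eps)
    return p, m, v, v_max
```

After one large gradient followed by several small ones, `v` starts to decay while `v_max` stays where it was, which is the behavior that separates AMSGrad from plain Adam.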

AdamW

Variant of Adam that decouples weight decay from the adaptive update, shrinking the weights directly rather than adding an L2 penalty term to the gradients.
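
The decoupling can be sketched for a single parameter as below. The function name `adamw_step` and the default hyperparameters are assumptions for illustration; the point is only that the decay multiplies the weight and never enters the moment estimates.

```python
import math


def adamw_step(p, g, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=1e-2):
    """One AdamW update for a single parameter; t is the 1-based step count.

    Returns (new_p, new_m, new_v). Names and defaults are illustrative
    assumptions.
    """
    # Decoupled weight decay: shrink the weight directly.
    p = p * (1.0 - lr * weight_decay)
    # Standard Adam moments, computed from the raw gradient only.
    m = beta1 * m + (1.0 - beta1) * g
    v = beta2 * v + (1.0 - beta2) * g * g
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    p = p - lr * m_hat / (math.sqrt(v_hat) + eps)
    return p, m, v
```

With a zero gradient the weight still shrinks by the factor `1 - lr * weight_decay`, while both moment estimates stay at zero.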

SGDW

Extension of SGD with decoupled weight decay, which applies the decay directly to the weights, independently of the gradient update, instead of folding it into the gradient as an L2 penalty.

RAdam

Rectified Adam, which addresses the high variance of the adaptive learning rate in the initial training phases by applying a rectification term and falling back to a non-adaptive momentum update while too few gradients have been seen.
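
The rectification can be sketched as follows for a single parameter. The function names, the defaults, and the threshold of 4 on the approximated degrees of freedom are assumptions for this example; some implementations use a slightly different threshold.

```python
import math


def radam_rectification(t, beta2=0.999):
    """Return the rectification factor at step t, or None if it is not yet usable.

    The threshold of 4 is an assumption for this sketch.
    """
    rho_inf = 2.0 / (1.0 - beta2) - 1.0
    rho_t = rho_inf - 2.0 * t * beta2 ** t / (1.0 - beta2 ** t)
    if rho_t <= 4.0:
        return None
    return math.sqrt(((rho_t - 4.0) * (rho_t - 2.0) * rho_inf)
                     / ((rho_inf - 4.0) * (rho_inf - 2.0) * rho_t))


def radam_step(p, g, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One RAdam update for a single parameter; returns (new_p, new_m, new_v)."""
    m = beta1 * m + (1.0 - beta1) * g
    v = beta2 * v + (1.0 - beta2) * g * g
    m_hat = m / (1.0 - beta1 ** t)
    r = radam_rectification(t, beta2)
    if r is None:
        # Early steps: plain bias-corrected momentum, no adaptive scaling.
        p = p - lr * m_hat
    else:
        v_hat = v / (1.0 - beta2 ** t)
        p = p - lr * r * m_hat / (math.sqrt(v_hat) + eps)
    return p, m, v
```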

YellowFin

Automatic tuner for momentum SGD that adjusts the learning rate and momentum coefficient during training, based on a theoretical analysis of momentum dynamics on local quadratic approximations of the objective.

LARS

Layer-wise Adaptive Rate Scaling: adapts the learning rate of each layer using the ratio of the L2 norm of its weights to the L2 norm of its gradients, designed for large-batch training.
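
A simplified sketch of the per-layer scaling is shown below. It treats one layer as a flat list of weights and omits momentum; the names `l2_norm` and `lars_layer_step`, the `trust_coef` default, and the `eps` guard are assumptions made for the example.

```python
import math


def l2_norm(values):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(x * x for x in values))


def lars_layer_step(weights, grads, global_lr=0.1, trust_coef=0.001,
                    weight_decay=0.0, eps=1e-9):
    """One momentum-free LARS update for a single layer.

    Returns (new_weights, local_lr). Names and defaults are illustrative
    assumptions.
    """
    w_norm = l2_norm(weights)
    g_norm = l2_norm(grads)
    # Layer-wise rate: ratio of weight norm to gradient norm.
    local_lr = trust_coef * w_norm / (g_norm + weight_decay * w_norm + eps)
    new_weights = [w - global_lr * local_lr * (g + weight_decay * w)
                   for w, g in zip(weights, grads)]
    return new_weights, local_lr
```

Without weight decay, scaling a layer's gradient by 10 lowers its local rate by 10, so the size of the resulting update depends on the weight norm rather than the gradient norm.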

LAMB

Layer-wise Adaptive Moments optimizer for Batch training: extends the layer-wise scaling of LARS with Adam-style moment estimates for efficient large-batch training of massive models.

Rprop

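Resilient Backpropagation: maintains a separate step size for each parameter and uses only the sign of the gradient, not its magnitude. The step size grows while the sign stays the same and shrinks when it flips, which makes updates robust to badly scaled gradients.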
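
The sign-based rule can be sketched for a single parameter as below. This follows one common variant in which the update is skipped right after a sign flip; the function names, the growth and shrink factors, and the step bounds are assumptions for illustration.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def rprop_step(p, g, prev_g, step, eta_plus=1.2, eta_minus=0.5,
               step_min=1e-6, step_max=50.0):
    """One Rprop update for a single parameter.

    Returns (new_p, stored_g, new_step); pass stored_g as prev_g next time.
    Names and defaults are illustrative assumptions.
    """
    agreement = g * prev_g
    if agreement > 0:
        # Same sign as last time: accelerate.
        step = min(step * eta_plus, step_max)
    elif agreement < 0:
        # Sign flipped: we overshot, so shrink and skip this update.
        step = max(step * eta_minus, step_min)
        g = 0.0
    # Only the sign of the gradient decides the direction; magnitude is ignored.
    p = p - sign(g) * step
    return p, g, step
```

A gradient of 1e6 and a gradient of 1e-6 therefore move the parameter by exactly the same amount.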

QHAdam

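Quasi-Hyperbolic Adam: generalizes Adam and momentum methods by replacing each moment estimate with a weighted average of that estimate and the current gradient (or squared gradient), with the weights exposed as extra hyperparameters for fine control of the moment contributions.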
