🏠 Hem
Benchmarkar
📊 Alla benchmarkar 🦖 Dinosaur v1 🦖 Dinosaur v2 ✅ To-Do List-applikationer 🎨 Kreativa fria sidor 🎯 FSACB - Ultimata uppvisningen 🌍 Översättningsbenchmark
Modeller
🏆 Topp 10 modeller 🆓 Gratis modeller 📋 Alla modeller ⚙️ Kilo Code
Resurser
💬 Promptbibliotek 📖 AI-ordlista 🔗 Användbara länkar

AI-ordlista

Den kompletta ordlistan över AI

162
kategorier
2 032
underkategorier
23 060
termer
📖
termer

RMSprop

Adaptive optimization technique that divides the learning rate by an exponential moving average of the squares of recent gradients to handle large-magnitude gradients.

📖
termer

Adagrad

Adaptive optimization algorithm that adapts the learning rate of each parameter by accumulating the squares of historical gradients, favoring infrequent parameters.

📖
termer

Adadelta

Extension of Adagrad that solves the problem of the learning rate's drastic decay by limiting the window of past gradients to a fixed size via an exponential moving average.

📖
termer

Adamax

Variant of Adam based on the infinity norm instead of the L2 norm, offering greater numerical stability and more robust convergence in some scenarios.

📖
termer

Nadam

Combination of Nesterov accelerated gradient and Adam that incorporates Nesterov's acceleration into Adam's adaptive framework for faster and more stable convergence.

📖
termer

AMSGrad

Modification of Adam that guarantees theoretical convergence by retaining the maximum of the exponential moving averages of the squared gradients to avoid Adam's potential divergences.

📖
termer

AdamW

Variant of Adam that decouples weight decay from the adaptive update, applying decay directly to the weights rather than to the gradients.

📖
termer

SGDW

Extension of SGD with decoupled weight decay that applies weight decay independently of the gradient update for better regularization.

📖
termer

RAdam

Rectified Adam that solves the problem of high variance in the initial training phases by introducing an adaptive rectification mechanism.

📖
termer

YellowFin

Optimizer that automatically adjusts the learning rate and momentum coefficient using a theoretical analysis of the local convergence of second-order methods.

📖
termer

LARS

Layer-wise Adaptive Rate Scaling that adapts the learning rate per layer based on the ratio between the L2 norm of weights and gradients for large-scale training.

📖
termer

LAMB

Layer-wise Adaptive Moments optimizer for Batch training that extends LARS by integrating Adam-type adaptive statistics for efficient training of massive models.

📖
termer

Rprop

Resilient Backpropagation that adapts the learning rate per parameter by ignoring the magnitude of the gradient and considering only its sign for robust updates.

📖
termer

QHAdam

Quasi-Hyperbolic Adam that generalizes Adam and Momentum by introducing quasi-hyperbolicity parameters for fine control of the moment contributions.

🔍

Inga resultat hittades