
AI Glossary

A complete dictionary of Artificial Intelligence

162 categories
2,032 subcategories
23,060 terms

Actor-Critic

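Reinforcement learning architecture that combines an actor network, which learns a policy (usually stochastic), with a critic network, which estimates the value function. The critic's estimate replaces the raw return in the actor's update, which reduces the variance of the policy gradient.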
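As a rough illustration of how the two parts interact, here is a minimal tabular sketch of one actor-critic step. It assumes a softmax policy over per-state action preferences and a one-step TD error as the learning signal. The function names, the dictionary layout, and the learning rates are hypothetical choices for this example, not part of any particular library.

```python
import math

def softmax(prefs):
    """Turn action preferences into probabilities (numerically stable)."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def actor_critic_step(prefs, V, s, a, r, s_next, done,
                      actor_lr=0.1, critic_lr=0.1, gamma=0.99):
    """One-step update; prefs[s] is a list of preferences, V[s] a value."""
    # Critic: move V(s) toward the bootstrapped one-step target.
    target = r + (0.0 if done else gamma * V[s_next])
    td_error = target - V[s]
    V[s] += critic_lr * td_error

    # Actor: for a softmax policy, d log pi(a|s) / d pref_k = 1[k == a] - pi_k.
    probs = softmax(prefs[s])
    for k in range(len(prefs[s])):
        grad_log_pi = (1.0 if k == a else 0.0) - probs[k]
        prefs[s][k] += actor_lr * td_error * grad_log_pi
    return td_error
```

A positive TD error raises the probability of the action that was taken, and a negative one lowers it.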


Value Function

Function that gives the expected cumulative (usually discounted) return from a state, V(s), or from a state-action pair, Q(s,a). In the Actor-Critic architecture it is the quantity the critic learns to estimate.
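To make "expected cumulative return" concrete, the sketch below computes the discounted return of one episode and averages it over several episodes as a simple Monte Carlo estimate of V for their shared start state. The discount factor `gamma` and both function names are assumptions made for this example.

```python
def discounted_return(rewards, gamma=0.99):
    """G_0 = r_0 + gamma * r_1 + gamma^2 * r_2 + ... for one finite episode."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

def monte_carlo_value(episodes, gamma=0.99):
    """Average the discounted return over episodes that start in the same state."""
    returns = [discounted_return(ep, gamma) for ep in episodes]
    return sum(returns) / len(returns)
```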


Asynchronous Advantage Actor-Critic

Parallel architecture (A3C) in which multiple worker agents each interact with their own copy of the environment and asynchronously apply their gradients to a shared set of parameters, which speeds up learning and decorrelates the training data.
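The asynchronous update pattern can be sketched with plain threads. This toy example stands in for the real thing under loud assumptions: each worker minimizes a simple quadratic loss (theta - target)^2 instead of running an environment, and a lock guards the shared parameter for clarity. All names here are hypothetical.

```python
import threading

def train_async(num_workers=4, steps=200, lr=0.05, target=3.0):
    """Toy asynchronous training: workers push gradients to shared parameters."""
    shared = {"theta": 0.0}      # stands in for the global network weights
    lock = threading.Lock()      # assumption: lock used here for simplicity

    def worker():
        for _ in range(steps):
            local_theta = shared["theta"]          # sync a local copy
            grad = 2.0 * (local_theta - target)    # gradient of (theta - target)^2
            with lock:
                shared["theta"] -= lr * grad       # apply to the shared parameters

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared["theta"]
```

Gradients may be computed from a slightly stale copy of the parameters, which is the trade-off this style of training accepts in exchange for parallelism.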


Deep Deterministic Policy Gradient

Actor-Critic algorithm (DDPG) for continuous action spaces. It trains a deterministic policy and a Q-value critic with deep neural networks, and samples past transitions from a replay buffer for stable off-policy learning.
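A replay buffer of the kind mentioned above can be sketched in a few lines. This is a minimal version assuming a fixed capacity and uniform random sampling; the class name and the transition layout are illustrative choices, not taken from a specific implementation.

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size store of transitions; the oldest entries are evicted first."""

    def __init__(self, capacity):
        self.storage = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.storage.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks temporal correlation.
        return random.sample(list(self.storage), batch_size)

    def __len__(self):
        return len(self.storage)
```

Because the sampled transitions were collected by older versions of the policy, learning from them is off-policy.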


Twin Delayed Deep Deterministic Policy Gradient

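Improvement over Deep Deterministic Policy Gradient (DDPG), abbreviated TD3. It trains twin critics and uses the smaller of their two estimates to reduce value overestimation, and it updates the actor and the target networks less often than the critics for better stability.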
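The two ideas in this entry can be sketched as small helpers: a bootstrapped target that takes the minimum of the twin critics' estimates, and a schedule that updates the actor only every few critic steps. The function names and the default `policy_delay=2` are assumptions made for this example.

```python
def twin_critic_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Bootstrapped target using the smaller of the two next-state Q estimates."""
    min_q = min(q1_next, q2_next)
    return reward + (0.0 if done else gamma * min_q)

def should_update_actor(step, policy_delay=2):
    """Delayed updates: touch the actor and targets only every policy_delay steps."""
    return step % policy_delay == 0
```

Taking the minimum biases the target downward, which counteracts the tendency of a single critic to overestimate.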


Soft Actor-Critic

Off-policy Actor-Critic algorithm (SAC) that maximizes an entropy-augmented objective: the expected return plus the entropy of the policy. The entropy term encourages exploration, and the off-policy updates make learning stable and sample-efficient.
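The entropy bonus can be illustrated for a discrete action distribution. This sketch assumes a temperature coefficient `alpha` that weights entropy against reward. The function names and the default value are hypothetical, and the discrete setting is chosen only to keep the arithmetic visible.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def entropy_augmented_reward(reward, probs, alpha=0.2):
    """Per-step objective term: reward plus alpha times the policy entropy."""
    return reward + alpha * entropy(probs)
```

A uniform policy earns the largest bonus and a deterministic one earns none, so the agent is rewarded for keeping its options open when the returns are otherwise similar.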


Advantage Actor-Critic

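Synchronous variant (A2C) of Asynchronous Advantage Actor-Critic (A3C). It uses advantage estimates to reduce policy gradient variance, and it waits for all workers before applying one batched update, which gives more stable training and makes better use of a GPU.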
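Advantage estimation can be sketched as "bootstrapped return minus the critic's value". The example below assumes an n-step rollout with a bootstrap value for the state after the last step. The function name and argument layout are illustrative.

```python
def n_step_advantages(rewards, values, bootstrap_value, gamma=0.99):
    """A_t = R_t - V(s_t), where R_t is the discounted return bootstrapped
    from the value of the state that follows the rollout."""
    ret = bootstrap_value
    advantages = []
    for r, v in zip(reversed(rewards), reversed(values)):
        ret = r + gamma * ret          # accumulate the return backwards in time
        advantages.append(ret - v)     # how much better than the critic expected
    advantages.reverse()
    return advantages
```

Pass `bootstrap_value=0.0` when the rollout ends in a terminal state.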


Critic Network

Neural network that estimates the value function V(s) or Q(s,a) and supplies the temporal-difference (TD) learning signal to the actor. The critic itself is trained by minimizing its TD prediction error.
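For the Q(s,a) case, the TD prediction error and the resulting update can be sketched with a lookup table standing in for the neural network. That substitution, the function name, and the use of the next action actually taken (a SARSA-style target) are assumptions made for this example.

```python
from collections import defaultdict

def q_critic_update(Q, s, a, r, s_next, a_next, done, lr=0.1, gamma=0.99):
    """Move Q(s, a) toward r + gamma * Q(s', a') and return the TD error."""
    target = r + (0.0 if done else gamma * Q[(s_next, a_next)])
    td_error = target - Q[(s, a)]      # the critic's prediction error
    Q[(s, a)] += lr * td_error         # a gradient step on the squared error
    return td_error

# Usage: unseen state-action pairs default to a value of 0.0.
Q = defaultdict(float)
```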
