AI Glossary
Complete Dictionary of Artificial Intelligence
Deep Q-Networks (DQN)
Pioneering algorithm that combines Q-learning with deep neural networks to approximate the Q-value function in large or high-dimensional state spaces, such as raw image inputs.
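As a rough illustration of the Q-learning part, the sketch below computes the bootstrapped target that a DQN-style network is usually regressed toward. The function names `td_target` and `squared_td_error` are hypothetical, and the list `next_q_values` stands in for the output of a target network that is assumed here rather than implemented.

```python
def td_target(reward, next_q_values, gamma=0.99, done=False):
    """Q-learning target: r + gamma * max_a' Q(s', a'), or just r at episode end."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def squared_td_error(q_value, target):
    """Squared difference between the current estimate and the target."""
    return (target - q_value) ** 2


# Hypothetical numbers: reward 1.0, three next-state action values.
target = td_target(reward=1.0, next_q_values=[0.5, 2.0, 1.0], gamma=0.9)
loss = squared_td_error(q_value=2.0, target=target)
print(target, loss)  # about 2.8 and about 0.64
```

In a full agent the loss would be averaged over a batch of stored transitions and minimized by gradient descent on the network weights.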
Policy Gradient Methods
Reinforcement learning approaches that optimize the policy directly by gradient ascent on the expected return with respect to the policy parameters.
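One simple member of this family is a REINFORCE-style update on a softmax policy. The sketch below assumes a policy parameterized by one logit per action and a single sampled (action, reward) pair; the names `softmax` and `reinforce_step` are illustrative, not part of any library.

```python
import math


def softmax(logits):
    """Convert logits to action probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(logits, action, reward, lr=0.1):
    """One gradient-ascent step on reward * log pi(action).

    For a softmax policy, d log pi(action) / d logit_i = 1[i == action] - pi_i.
    """
    probs = softmax(logits)
    return [
        logit + lr * reward * ((1.0 if i == action else 0.0) - p)
        for i, (logit, p) in enumerate(zip(logits, probs))
    ]


# Action 1 was sampled and earned reward 1.0, so its probability should rise.
new_logits = reinforce_step([0.0, 0.0], action=1, reward=1.0, lr=0.1)
print(softmax(new_logits))
```

Practical implementations usually subtract a baseline from the reward to reduce the variance of this estimate.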
Actor-Critic Methods
Hybrid architecture combining an actor that learns the policy and a critic that evaluates the value of states or actions.
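A minimal sketch of how the two parts can interact, assuming a one-step temporal-difference critic: the critic's TD error serves both as its own learning signal and as the weight on the actor's log-probability gradient. The function names and the gradient vector `log_prob_grad` are hypothetical placeholders.

```python
def td_error(reward, value_s, value_next, gamma=0.99, done=False):
    """Critic's one-step TD error: r + gamma * V(s') - V(s)."""
    bootstrap = 0.0 if done else gamma * value_next
    return reward + bootstrap - value_s


def critic_update(value_s, delta, lr=0.1):
    """Move the state-value estimate toward the TD target."""
    return value_s + lr * delta


def actor_gradient(log_prob_grad, delta):
    """Scale the gradient of log pi(a|s) by the critic's TD error."""
    return [delta * g for g in log_prob_grad]


delta = td_error(reward=1.0, value_s=0.5, value_next=1.0, gamma=0.9)
new_value = critic_update(0.5, delta, lr=0.1)
grad = actor_gradient([-0.5, 0.5], delta)
print(delta, new_value, grad)
```

A positive TD error means the action turned out better than the critic expected, so the actor is pushed toward it.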
Deep Deterministic Policy Gradient (DDPG)
Off-policy actor-critic algorithm for continuous action spaces, in which a deep neural network actor outputs a deterministic action and a critic network estimates its Q-value.
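DDPG is commonly trained with slowly moving target networks. The sketch below shows that soft (Polyak) update on flat parameter lists, assuming a mixing rate `tau`; `soft_update` is a hypothetical helper, and real implementations apply the same rule to every tensor of the actor and critic.

```python
def soft_update(target_params, online_params, tau=0.005):
    """Blend online weights into the target: target <- (1 - tau) * target + tau * online."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target_params, online_params)]


# With tau = 0.1 the target moves one tenth of the way toward the online weights.
updated = soft_update([0.0, 1.0], [1.0, 1.0], tau=0.1)
print(updated)
```

A small `tau` keeps the bootstrapped targets nearly stationary, which tends to stabilize critic training.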
Proximal Policy Optimization (PPO)
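Policy optimization method that keeps each update close to the previous policy, typically by clipping the probability ratio in a surrogate objective, which approximates a trust region and stabilizes learning.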
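The clipped surrogate objective used by the most common PPO variant can be sketched for a single sample as follows. Here `ratio` is assumed to be the new policy's probability of the action divided by the old policy's, `advantage` is an externally supplied advantage estimate, and `epsilon=0.2` is a commonly used default rather than a required value.

```python
def ppo_clip_objective(ratio, advantage, epsilon=0.2):
    """Per-sample clipped surrogate: min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    return min(ratio * advantage, clipped_ratio * advantage)


# Positive advantage: no extra credit once the ratio exceeds 1 + epsilon.
print(ppo_clip_objective(1.5, 2.0))   # about 2.4 rather than 3.0
# Negative advantage: no extra credit once the ratio drops below 1 - epsilon.
print(ppo_clip_objective(0.5, -1.0))  # about -0.8 rather than -0.5
```

Because the objective stops improving outside the clip range, the gradient gives no incentive to move the policy further from the old one on that sample.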
Trust Region Policy Optimization (TRPO)
Constrained optimization algorithm that bounds the KL divergence between the old and new policies, so that no single update moves the policy too far.
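The constraint itself can be illustrated on a discrete action distribution. The sketch below only checks whether a candidate policy stays inside a KL bound; it does not perform the conjugate-gradient and line-search steps of the full algorithm. The threshold `max_kl=0.01` and both function names are assumptions for illustration, and the new policy is assumed to give every action nonzero probability.

```python
import math


def kl_divergence(p_old, p_new):
    """KL(p_old || p_new) for two discrete distributions over the same actions."""
    return sum(p * math.log(p / q) for p, q in zip(p_old, p_new) if p > 0.0)


def within_trust_region(p_old, p_new, max_kl=0.01):
    """True if the candidate policy satisfies the KL constraint."""
    return kl_divergence(p_old, p_new) <= max_kl


old = [0.5, 0.5]
print(within_trust_region(old, [0.55, 0.45]))  # small step: accepted
print(within_trust_region(old, [0.6, 0.4]))    # larger step: rejected
```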
Multi-Agent Deep RL
Extension of deep RL in which multiple agents learn simultaneously, cooperating or competing within a shared environment.
Hierarchical Reinforcement Learning
Approach that structures learning across levels of a hierarchy, with high-level meta-policies controlling specialized sub-policies.
Model-Based Deep RL
Technique in which the agent learns a model of the environment's dynamics and rewards, then uses it to plan, which typically reduces the amount of real experience needed.
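As a toy illustration of planning with a learned model, the sketch below assumes a tabular model stored as a dictionary from (state, action) to (next state, reward), plus a table of state values. All names and numbers are made up; deep variants replace both tables with neural networks and usually look more than one step ahead.

```python
def plan_one_step(model, values, state, actions, gamma=0.9):
    """Pick the action whose predicted reward plus discounted next value is highest."""
    def score(action):
        next_state, reward = model[(state, action)]
        return reward + gamma * values[next_state]
    return max(actions, key=score)


# Hypothetical learned model: going right pays now, going left leads somewhere better.
model = {("s0", "left"): ("s1", 0.0), ("s0", "right"): ("s2", 1.0)}
values = {"s1": 5.0, "s2": 1.0}
best = plan_one_step(model, values, "s0", ["left", "right"])
print(best)  # left
```

The agent chooses "left" without trying either action in the real environment, which is the source of the efficiency gain.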
Distributional RL
Paradigm that learns the full probability distribution of returns rather than only its expectation, which can give a richer learning signal and supports risk-sensitive decisions.
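To see what the expectation alone hides, the sketch below assumes a categorical representation: a fixed list of return values ("atoms") with a probability on each. Two hypothetical actions share the same expected return but differ sharply in spread.

```python
def expected_return(atoms, probs):
    """Mean of the return distribution, i.e. the ordinary Q-value."""
    return sum(z * p for z, p in zip(atoms, probs))


def return_variance(atoms, probs):
    """Spread of the return distribution around its mean."""
    mean = expected_return(atoms, probs)
    return sum(p * (z - mean) ** 2 for z, p in zip(atoms, probs))


atoms = [-10.0, 0.0, 10.0]
safe = [0.0, 1.0, 0.0]    # always returns 0
risky = [0.5, 0.0, 0.5]   # returns -10 or +10 with equal probability

print(expected_return(atoms, safe), expected_return(atoms, risky))  # 0.0 0.0
print(return_variance(atoms, safe), return_variance(atoms, risky))  # 0.0 100.0
```

An agent that tracks only the expectation cannot tell these two actions apart, while one that tracks the distribution can.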
Curiosity-Driven RL
Approach in which the agent receives intrinsic rewards for reaching novel or hard-to-predict states, encouraging efficient exploration of the environment, especially when external rewards are sparse.
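One common way to define such a bonus is the prediction error of a learned forward model. The sketch below assumes the model's prediction is already available as a feature vector; `intrinsic_reward`, `total_reward`, and the weight `beta` are illustrative names and values.

```python
def intrinsic_reward(predicted_next, actual_next, scale=0.5):
    """Curiosity bonus: scaled squared error between predicted and observed next state."""
    return scale * sum((p - a) ** 2 for p, a in zip(predicted_next, actual_next))


def total_reward(extrinsic, intrinsic, beta=0.1):
    """Reward used for learning: environment reward plus weighted curiosity bonus."""
    return extrinsic + beta * intrinsic


# A surprising transition earns a bonus; a perfectly predicted one earns none.
surprise = intrinsic_reward([0.0, 0.0], [1.0, 2.0])
familiar = intrinsic_reward([1.0, 2.0], [1.0, 2.0])
print(surprise, familiar)            # 2.5 0.0
print(total_reward(0.0, surprise))   # 0.25
```

As the forward model improves on frequently visited states, their bonus shrinks and the agent is drawn toward less familiar parts of the environment.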
Meta-Learning in RL
Technique that trains agents across many tasks so that they "learn to learn" and can adapt to a new task from only a few experiences.