AI Glossary
A complete dictionary of Artificial Intelligence
Off-Policy Learning
Learning method in which the agent learns a target policy (typically the optimal one) from experience generated by a different behavior policy, so the behavior policy is free to explore and past experience can be reused.
Target Networks
Copies of the main neural networks whose weights are updated slowly, stabilizing learning by providing more consistent targets.
Ornstein-Uhlenbeck Process
Stochastic process used to generate temporally correlated noise in actions, promoting efficient exploration in continuous spaces.
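As a rough illustration of temporally correlated noise, the sketch below discretizes an Ornstein-Uhlenbeck process with a simple Euler step. The function name `ou_noise` and the parameter values (`theta`, `sigma`, `mu`, `dt`) are assumptions chosen for the example, not values taken from this glossary.

```python
import numpy as np

def ou_noise(n_steps, theta=0.15, sigma=0.2, mu=0.0, dt=1.0, seed=0):
    """Sample a 1-D Ornstein-Uhlenbeck path (Euler discretization, illustrative defaults)."""
    rng = np.random.default_rng(seed)
    x = np.empty(n_steps)
    prev = mu  # start the process at its long-run mean
    for t in range(n_steps):
        # Pull back toward mu (mean reversion) and add a scaled Gaussian increment.
        prev = prev + theta * (mu - prev) * dt + sigma * np.sqrt(dt) * rng.standard_normal()
        x[t] = prev
    return x

noise = ou_noise(1000)
# Consecutive samples are correlated, unlike independent Gaussian noise.
lag1_corr = np.corrcoef(noise[:-1], noise[1:])[0, 1]
```

Because each sample depends on the previous one, the noise drifts smoothly instead of jumping independently at every step, which is the property the definition refers to.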
Continuous Action Space
Set of actions in which each action can take any real value within a continuous interval (for example, a torque or a steering angle), requiring algorithms adapted to it, unlike a finite set of discrete actions.
Neural Network Function Approximation
Use of neural networks to approximate complex functions like policies or value functions in reinforcement learning.
Soft Update
Method of gradually updating target networks: at each update, a small fraction τ (tau) of the main network's weights is blended into the target network's weights, so the target weights slowly track the main weights.
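A minimal sketch of a soft update, assuming the weights are held as lists of NumPy arrays. The helper name `soft_update` and the value of `tau` are hypothetical choices made for illustration.

```python
import numpy as np

def soft_update(target_weights, main_weights, tau=0.005):
    """Return new target weights: (1 - tau) * target + tau * main, per array."""
    return [(1.0 - tau) * t + tau * m for t, m in zip(target_weights, main_weights)]

# Toy weights: the main network is all ones, the target starts at zeros.
main = [np.ones((2, 2)), np.ones(2)]
target = [np.zeros((2, 2)), np.zeros(2)]

# One update with tau = 0.1 moves the target 10% of the way toward main.
target = soft_update(target, main, tau=0.1)
```

With a small τ the target weights change only slightly per step. Setting τ = 1 would copy the main weights outright.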
Actor Network
Neural network that learns to map states directly to actions in a continuous action space, with the aim of selecting the optimal action for each state.
Deterministic Policy
Policy that maps each state to a single, specific action, unlike stochastic policies, which return a probability distribution over actions.
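The contrast can be sketched with a toy linear policy. The weight matrix `W`, the `tanh` squashing, and the Gaussian form of the stochastic policy are all assumptions made for this example.

```python
import numpy as np

W = np.array([[0.5, -0.2]])  # hypothetical policy weights (1 action, 2 state features)

def deterministic_policy(state):
    """Same state in, same action out."""
    return np.tanh(W @ state)

def stochastic_policy(state, rng, sigma=0.3):
    """Sample an action from a Gaussian centered on the deterministic output."""
    mean = np.tanh(W @ state)
    return rng.normal(mean, sigma)

state = np.array([1.0, 2.0])
rng = np.random.default_rng(0)

a1 = deterministic_policy(state)
a2 = deterministic_policy(state)    # identical to a1
s1 = stochastic_policy(state, rng)
s2 = stochastic_policy(state, rng)  # a different sample from the same distribution
```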
Action Noise
Noise added to the actions produced by the actor to encourage exploration of the continuous action space during training.
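One simple way to picture this is to add Gaussian noise to the actor's output and clip the result to the valid action range. The function `noisy_action`, the bounds, and `sigma` below are illustrative assumptions. Temporally correlated noise could be substituted for the Gaussian term.

```python
import numpy as np

def noisy_action(action, low, high, sigma=0.1, rng=None):
    """Add zero-mean Gaussian noise to an action and clip it to [low, high]."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.normal(0.0, sigma, size=np.shape(action))
    return np.clip(action + noise, low, high)

rng = np.random.default_rng(42)
actor_output = np.array([0.95, -0.3])  # stand-in for an actor network's output
explored = noisy_action(actor_output, low=-1.0, high=1.0, sigma=0.2, rng=rng)
```

Noise of this kind is normally applied only during training. At evaluation time the actor's output is used as is.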