Deep Deterministic Policy Gradient (DDPG)

📖

term

Learning method in which the agent improves a target policy using experience collected by a different behavior policy, which leaves the behavior policy free to explore.

📖

term

Copies of the main neural networks whose weights are updated slowly, which stabilizes learning by keeping the training targets more consistent.
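As a worked formula, the way target networks usually enter DDPG is through the bootstrapped target for the critic. The notation here is an assumption, not taken from this glossary: Q' and μ' denote the target critic and target actor with weights θ^{Q'} and θ^{μ'}, γ is the discount factor, and (s_i, a_i, r_i, s_{i+1}) is one stored transition.

```latex
y_i = r_i + \gamma \, Q'\!\left(s_{i+1},\ \mu'(s_{i+1} \mid \theta^{\mu'}) \mid \theta^{Q'}\right)
```

Because y_i depends only on the slowly changing target weights, the value the critic regresses toward moves much less from one update to the next.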

📖

term

Stochastic process used to generate temporally correlated noise in actions, promoting efficient exploration in continuous spaces.
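A minimal sketch of such a process, assuming a simple Euler discretization. The class name `OUNoise` and the default values for `theta`, `sigma`, and `dt` are illustrative choices, not values prescribed by this glossary.

```python
import math
import random


class OUNoise:
    """Discretized Ornstein-Uhlenbeck process (illustrative sketch).

    Each step applies x <- x + theta * (mu - x) * dt + sigma * sqrt(dt) * N(0, 1),
    so consecutive samples are correlated and drift back toward mu.
    """

    def __init__(self, size, mu=0.0, theta=0.15, sigma=0.2, dt=1e-2, seed=None):
        self.size = size
        self.mu = mu
        self.theta = theta
        self.sigma = sigma
        self.dt = dt
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        # Restart the process at its long-run mean.
        self.x = [self.mu] * self.size

    def sample(self):
        scale = self.sigma * math.sqrt(self.dt)
        self.x = [
            x + self.theta * (self.mu - x) * self.dt + scale * self.rng.gauss(0.0, 1.0)
            for x in self.x
        ]
        return list(self.x)
```

Calling `reset()` at the start of each episode and adding `sample()` to the actor's output at every step gives noise that changes smoothly over time instead of jumping independently between steps.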

📖

term

Environment in which actions can take any value in a continuous interval; unlike discrete action spaces, it requires algorithms designed for continuous control.

📖

term

Use of neural networks to approximate complex functions like policies or value functions in reinforcement learning.

📖

term

Method of gradually updating target networks: at each step, a small coefficient τ (tau) blends a fraction of the main network's weights into the target network's weights.
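The update rule can be sketched as θ_target ← τ·θ_main + (1 − τ)·θ_target. In the sketch below, weights are plain lists of floats, and the function name `soft_update` and the default `tau=0.005` are assumptions chosen for illustration.

```python
def soft_update(target, main, tau=0.005):
    """Return new target weights: tau * main + (1 - tau) * target, element-wise."""
    return [tau * w + (1.0 - tau) * w_t for w, w_t in zip(main, target)]


# Illustration with an exaggerated tau so the effect is visible in a few steps.
target = [0.0, 0.0]
main = [1.0, -2.0]
for _ in range(3):
    target = soft_update(target, main, tau=0.5)
# target is now [0.875, -1.75]: it approaches main without ever jumping to it.
```

With a realistically small τ, the target weights trail the main weights by many updates, which is what keeps the training targets stable.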

📖

term

Neural network that learns to map states directly to actions in a continuous action space, with the aim of producing the optimal action for each state.
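To make the state-to-action mapping concrete, here is a deliberately tiny sketch: a single linear layer followed by tanh, scaled to the action bounds. The function name `actor`, the one-layer architecture, and the `max_action` parameter are assumptions for illustration; practical actors use deeper networks and are trained with gradients from the critic.

```python
import math


def actor(state, weights, biases, max_action=1.0):
    """Map a state to one bounded continuous action per output unit.

    weights holds one row of coefficients per action dimension.
    tanh squashes each output to [-1, 1]; max_action rescales it.
    """
    actions = []
    for w_row, b in zip(weights, biases):
        pre_activation = sum(w * s for w, s in zip(w_row, state)) + b
        actions.append(max_action * math.tanh(pre_activation))
    return actions


# The same state always yields the same action: the mapping is deterministic.
a1 = actor([0.5, -0.2], weights=[[1.0, 2.0]], biases=[0.1], max_action=2.0)
a2 = actor([0.5, -0.2], weights=[[1.0, 2.0]], biases=[0.1], max_action=2.0)
```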

📖

term

Policy that associates a specific action with each state, unlike stochastic policies that return probability distributions.

📖

term

Noise added to the actions produced by the actor to encourage exploration of the continuous action space during training.
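A minimal sketch of this idea, using independent Gaussian noise rather than a temporally correlated process, and clipping the result so it stays inside the valid action range. The function name `noisy_action`, the bounds, and the default `sigma` are hypothetical choices.

```python
import random


def noisy_action(action, sigma=0.1, low=-1.0, high=1.0, rng=random):
    """Add zero-mean Gaussian noise to each action component, then clip to [low, high]."""
    return [min(high, max(low, a + rng.gauss(0.0, sigma))) for a in action]


# During training, perturb the actor's output; at evaluation time, use it unchanged.
rng = random.Random(0)
explored = noisy_action([0.95, -0.95], sigma=0.5, rng=rng)
```

Clipping matters near the bounds: without it, a perturbed action could leave the interval the environment accepts.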

AI Glossary