Actor-Critic Methods
Twin Delayed Deep Deterministic Policy Gradient
An improvement over DDPG that uses twin critics to reduce value overestimation and delays updates of the actor and target networks for better stability.
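The two mechanisms named above can be sketched in a few lines of plain Python. This is a minimal illustration on scalar values, not a full training loop: the function names (`td3_target`, `should_update_actor`, `soft_update`) and the default values for `gamma`, `policy_delay`, and `tau` are assumptions chosen for the example, not taken from the text.

```python
def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Bootstrapped target using the smaller of the two target-critic estimates.

    Taking the minimum is what counters overestimation: a single critic's
    optimistic error is no longer propagated into the target.
    `done` is 1.0 for terminal transitions and 0.0 otherwise.
    """
    return reward + gamma * (1.0 - done) * min(q1_next, q2_next)


def should_update_actor(step, policy_delay=2):
    """Delayed updates: the actor (and the target networks) are updated
    only once every `policy_delay` critic updates."""
    return step % policy_delay == 0


def soft_update(target_params, online_params, tau=0.005):
    """Move target parameters a small step toward the online parameters.

    Parameters are plain lists of floats here (hypothetical representation).
    """
    return [(1.0 - tau) * t + tau * o
            for t, o in zip(target_params, online_params)]


# Example: the two critics disagree (5.0 vs 3.0); the target uses 3.0.
y = td3_target(reward=1.0, done=0.0, q1_next=5.0, q2_next=3.0, gamma=0.5)

# Critics are updated every step; actor and targets only on selected steps.
actor_steps = [s for s in range(1, 7) if should_update_actor(s)]
```

In a real agent the critic values would come from neural networks and the parameters would be tensors; the scalar form is only meant to show where the minimum and the delay enter the update.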