Actor-Critic Methods
Actor-Critic
Reinforcement learning architecture combining an actor network that learns a stochastic policy and a critic network that estimates the value function to reduce the policy gradient variance.
← Back