Policy Gradient Methods
Actor-Critic Methods
A hybrid approach that combines an actor, which learns the policy, with a critic, which estimates the value function. The critic's estimate serves as a baseline for the actor's updates, which reduces the variance of the policy gradient estimates.
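As a rough illustration of how the two parts interact, here is a minimal sketch of a tabular one-step actor-critic. Everything specific in it is an assumption made for the example and not part of the definition above: the four-state corridor task, the reward of -1 per step, the softmax policy over per-state action preferences, the step sizes, and the names `train`, `step`, `theta`, and `values`. The critic learns state values by temporal-difference updates, and the actor scales its log-probability gradient by the critic's TD error.

```python
import math
import random

N_STATES = 4        # hypothetical corridor: states 0..3, state 3 is the terminal goal
LEFT, RIGHT = 0, 1


def softmax(prefs):
    """Turn action preferences into probabilities (shifted by the max for stability)."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def step(state, action):
    """Toy environment: move left or right; every step costs -1; state 0 is a wall."""
    next_state = max(0, state - 1) if action == LEFT else state + 1
    done = next_state == N_STATES - 1
    return next_state, -1.0, done


def train(episodes=2000, alpha_actor=0.1, alpha_critic=0.1, gamma=1.0, seed=0):
    rng = random.Random(seed)
    theta = [[0.0, 0.0] for _ in range(N_STATES)]  # actor: action preferences per state
    values = [0.0] * N_STATES                      # critic: state-value estimates

    for _episode in range(episodes):
        state = 0
        for _t in range(200):  # cap the episode length as a safety net
            probs = softmax(theta[state])
            action = RIGHT if rng.random() < probs[RIGHT] else LEFT
            next_state, reward, done = step(state, action)

            # Critic: one-step TD error; the terminal state has value 0.
            target = reward + (0.0 if done else gamma * values[next_state])
            td_error = target - values[state]
            values[state] += alpha_critic * td_error

            # Actor: gradient of log softmax is (indicator - probability),
            # scaled by the TD error instead of a raw Monte Carlo return.
            for a in (LEFT, RIGHT):
                grad_log_pi = (1.0 if a == action else 0.0) - probs[a]
                theta[state][a] += alpha_actor * td_error * grad_log_pi

            if done:
                break
            state = next_state

    return theta, values


theta, values = train()
policy = [softmax(theta[s]) for s in range(N_STATES - 1)]
for s, probs in enumerate(policy):
    print(f"state {s}: P(right) = {probs[RIGHT]:.3f}, V = {values[s]:.2f}")
```

The TD error plays the role of a low-variance advantage estimate: it is positive when an action turned out better than the critic expected and negative otherwise. This sketch uses `gamma=1.0`, so the per-step discount factor that some formulations apply to the actor update does not appear.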