Actor-Critic Methods
Critic Network
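Neural network that estimates the value function V(s) or Q(s,a). Its temporal-difference (TD) prediction error serves two purposes: it is the loss signal the critic minimizes to improve its own estimates, and it is the learning signal that scales the actor's policy-gradient update.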
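To make the TD signal concrete, here is a minimal sketch of the one-step TD error and a critic update. It assumes a state-value critic V(s) and, to stay dependency-free, swaps the neural network for a linear approximator. The names `td_error` and `LinearCritic`, the discount `gamma`, and the learning rate `lr` are illustrative assumptions, not part of the definition above.

```python
def td_error(reward, value_s, value_next, gamma=0.99, done=False):
    """One-step TD error: delta = r + gamma * V(s') - V(s)."""
    # No bootstrapping from the next state once the episode has ended.
    bootstrap = 0.0 if done else gamma * value_next
    return reward + bootstrap - value_s


class LinearCritic:
    """Hypothetical linear stand-in for a critic network: V(s) = w . s."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.lr = lr

    def value(self, state):
        return sum(w_i * s_i for w_i, s_i in zip(self.w, state))

    def update(self, state, reward, next_state, gamma=0.99, done=False):
        delta = td_error(
            reward, self.value(state), self.value(next_state), gamma, done
        )
        # Semi-gradient TD(0): nudge V(s) toward the bootstrapped target.
        self.w = [w_i + self.lr * delta * s_i for w_i, s_i in zip(self.w, state)]
        # An actor would scale its log-probability gradient by this value.
        return delta


critic = LinearCritic(n_features=2)
delta = critic.update(state=[1.0, 0.0], reward=1.0, next_state=[0.0, 1.0])
```

A positive `delta` means the outcome was better than the critic predicted, so the actor would make the chosen action more likely; a negative `delta` has the opposite effect.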