Policy Gradient Methods
Baseline Function
A function of the state, independent of the action taken, that is subtracted from the return in the policy gradient estimate to reduce its variance without introducing bias. The state-value function is the typical choice.
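The effect can be checked on a small example. The sketch below is illustrative only and assumes a single-state problem with a softmax policy over three actions; the reward values and the helper names (`softmax`, `grad_log_pi`, `estimator_stats`) are hypothetical. It enumerates the actions to compute the exact mean and total variance of the score-function estimate, once with no baseline and once with the state value as the baseline.

```python
import math


def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def grad_log_pi(probs, action):
    # For a softmax policy: d/d theta_k log pi(a) = 1[k == a] - pi_k.
    return [(1.0 if k == action else 0.0) - p for k, p in enumerate(probs)]


def estimator_stats(logits, rewards, baseline):
    """Exact mean and total variance of g(a) = grad log pi(a) * (r(a) - baseline)."""
    probs = softmax(logits)
    mean = [0.0] * len(logits)
    second_moment = 0.0
    for a, p in enumerate(probs):
        g = [s * (rewards[a] - baseline) for s in grad_log_pi(probs, a)]
        mean = [m + p * gk for m, gk in zip(mean, g)]
        second_moment += p * sum(gk * gk for gk in g)
    # Total variance: sum over components of E[g_k^2] - E[g_k]^2.
    variance = second_moment - sum(m * m for m in mean)
    return mean, variance


# Hypothetical setup: uniform initial policy, rewards with a large common offset.
logits = [0.0, 0.0, 0.0]
rewards = [10.0, 11.0, 12.0]

# With a single state, V(s) is the expected reward under the policy.
value = sum(p * r for p, r in zip(softmax(logits), rewards))

mean_plain, var_plain = estimator_stats(logits, rewards, baseline=0.0)
mean_base, var_base = estimator_stats(logits, rewards, baseline=value)

print("no baseline:  mean =", mean_plain, "variance =", var_plain)
print("V(s) baseline: mean =", mean_base, "variance =", var_base)
```

In this toy setup the two estimates have the same mean, so the baseline adds no bias, while the variance with the baseline is far smaller. How much variance is removed in practice depends on the rewards and the policy.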