Policy Gradient Methods
Advantage Function
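Measure of how much better an action is than the policy's average action in a given state, calculated as the difference between the Q function and the V function: A(s, a) = Q(s, a) − V(s). Using the advantage in place of the raw return when estimating the policy gradient reduces gradient variance.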
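The definition can be illustrated with a minimal sketch for a single state with a small discrete action set. The function name `advantage`, the Q values, and the policy probabilities below are all hypothetical and chosen only for illustration. The sketch assumes V(s) is the policy-weighted average of Q(s, a) over actions.

```python
import numpy as np

def advantage(q_values, policy_probs):
    """Return A(s, a) = Q(s, a) - V(s), with V(s) = sum_a pi(a|s) * Q(s, a).

    Both arguments hold one entry per action along the last axis.
    """
    q_values = np.asarray(q_values, dtype=float)
    policy_probs = np.asarray(policy_probs, dtype=float)
    # V(s): expected Q value under the policy's action distribution.
    state_values = np.sum(policy_probs * q_values, axis=-1, keepdims=True)
    return q_values - state_values

# Hypothetical numbers for one state with three actions.
q = [1.0, 2.0, 4.0]
pi = [0.5, 0.25, 0.25]

adv = advantage(q, pi)
print(adv)  # [-1.  0.  2.]
```

Here V(s) = 2.0, so the first action is worse than average, the second is exactly average, and the third is better. The policy-weighted mean of the advantages is zero, which is why subtracting V(s) changes the spread of the gradient estimate without shifting its expectation.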