Policy Gradient Methods
Natural Policy Gradient
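A variant of policy gradient that rescales the gradient by the inverse Fisher information matrix (the Fisher metric), so that updates are invariant, to first order, to how the policy is parameterized. In practice this tends to give more stable and efficient convergence than the plain gradient.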
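As a rough illustration of the idea, the sketch below compares the plain and natural gradients for a hypothetical two-action policy with a single parameter `theta`, where action 1 is chosen with probability `sigmoid(theta)`. The rewards `r1` and `r0`, the function names, and the `damping` term are assumptions made for this example, not part of any particular library. In this one-parameter case the Fisher matrix reduces to a scalar, so the natural gradient is simply the plain gradient divided by it.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def vanilla_gradient(theta, r1, r0):
    """Exact gradient of J(theta) = p * r1 + (1 - p) * r0, with p = sigmoid(theta)."""
    p = sigmoid(theta)
    return p * (1.0 - p) * (r1 - r0)


def fisher_information(theta):
    """Exact Fisher information E[(d/dtheta log pi(a))^2] of the policy.

    The score is (1 - p) for action 1 and -p for action 0.
    """
    p = sigmoid(theta)
    return p * (1.0 - p) ** 2 + (1.0 - p) * p ** 2


def natural_gradient(theta, r1, r0, damping=1e-8):
    """Plain gradient rescaled by the inverse (scalar) Fisher information.

    The damping term is an assumption of this sketch; it guards against
    division by a vanishing Fisher value when the policy saturates.
    """
    return vanilla_gradient(theta, r1, r0) / (fisher_information(theta) + damping)


if __name__ == "__main__":
    # A nearly deterministic policy: the plain gradient is tiny because the
    # sigmoid is saturated, while the natural gradient stays close to r1 - r0.
    theta, r1, r0 = -4.0, 1.0, 0.0
    print("plain gradient:  ", vanilla_gradient(theta, r1, r0))
    print("natural gradient:", natural_gradient(theta, r1, r0))
```

With more than one parameter the Fisher information becomes a matrix, and the division is replaced by solving a linear system, often approximately and from sampled trajectories rather than in closed form as here.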