Trust Region Policy Optimization (TRPO)
Fisher Information Matrix
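A matrix that measures how much information an observable random variable carries about an unknown parameter of the distribution that generates it. In TRPO it defines the local geometry of policy parameter space: it is the second-order (Hessian) term of the KL divergence between the old and updated policies, so the KL trust-region constraint is approximated by a quadratic form in this matrix.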
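As a rough illustration of the definition above, the following sketch computes the Fisher information matrix of a small categorical (softmax) policy with respect to its logits and checks that the quadratic form 0.5 * delta^T F delta approximates the KL divergence for a small parameter step. The softmax parameterization, the function names (`softmax`, `fisher_exact`, `kl`), and the specific values of `theta` and `delta` are assumptions chosen for the example; practical TRPO implementations typically estimate Fisher-vector products from sampled trajectories rather than forming the matrix explicitly.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    z = logits - np.max(logits)
    e = np.exp(z)
    return e / e.sum()


def fisher_exact(logits):
    """Exact Fisher matrix of a categorical softmax distribution w.r.t. its logits.

    The score of action a is grad log p(a) = onehot(a) - p, and the Fisher
    matrix is the expected outer product of the score under p.
    """
    p = softmax(logits)
    scores = np.eye(len(p)) - p              # row a holds the score of action a
    return (scores * p[:, None]).T @ scores  # sum_a p(a) * score_a score_a^T


def kl(p, q):
    """KL divergence KL(p || q) between two discrete distributions."""
    return float(np.sum(p * (np.log(p) - np.log(q))))


# Hypothetical parameters and a small step, chosen only for illustration.
theta = np.array([0.5, -1.0, 2.0])
delta = 1e-3 * np.array([1.0, -2.0, 0.5])

F = fisher_exact(theta)
kl_true = kl(softmax(theta), softmax(theta + delta))
kl_quad = 0.5 * delta @ F @ delta  # second-order approximation of the KL

print("exact KL:       ", kl_true)
print("quadratic form: ", kl_quad)
```

For a softmax over logits the matrix reduces to diag(p) - p p^T, which is singular along the all-ones direction because adding a constant to every logit leaves the distribution unchanged.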