Learning through Model Differentiation
Policy Gradient Through Model
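Method that calculates policy gradients by backpropagating reward gradients through a differentiable model of the environment, so the effect of each policy parameter on future rewards is obtained analytically rather than estimated from sampled returns.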
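As a minimal sketch of the idea, consider a toy problem that is assumed here purely for illustration: a scalar state, a linear policy a = theta * s, a known linear model s' = A*s + B*a, and a quadratic reward r = -(s^2 + c*a^2). The function names and the constants A, B, c, the horizon, and the learning rate are all hypothetical choices, and the model is taken as given rather than learned. The gradient of the return is computed by rolling the policy forward through the model and then applying the chain rule backward through every step.

```python
def rollout_return(theta, s0=1.0, horizon=10, A=0.9, B=0.5, c=0.1):
    """Sum of rewards when the policy a = theta * s is rolled out in the model."""
    s, total = s0, 0.0
    for _ in range(horizon):
        a = theta * s
        total += -(s * s + c * a * a)   # assumed quadratic reward
        s = A * s + B * a               # assumed differentiable (linear) model
    return total


def model_based_policy_gradient(theta, s0=1.0, horizon=10, A=0.9, B=0.5, c=0.1):
    """d(return)/d(theta), obtained by backpropagating through the model."""
    # Forward pass: record the states visited under the current policy.
    states = []
    s = s0
    for _ in range(horizon):
        states.append(s)
        s = A * s + B * theta * s

    # Backward pass: lam holds d(future return)/d(next state).
    lam, grad = 0.0, 0.0
    for s in reversed(states):
        a = theta * s
        dr_ds = -2.0 * s
        dr_da = -2.0 * c * a
        # theta changes the action, which changes this reward and the next state.
        grad += (dr_da + lam * B) * s
        # Sensitivity of the return to the current state, via reward and model.
        lam = dr_ds + dr_da * theta + lam * (A + B * theta)
    return grad


def train(theta=0.0, lr=0.05, steps=200):
    """Plain gradient ascent on the return using the model-based gradient."""
    for _ in range(steps):
        theta += lr * model_based_policy_gradient(theta)
    return theta
```

Calling train() and comparing rollout_return at the result with rollout_return(0.0) shows the return improving. In practice the model would usually be a learned function and the backward pass would be handled by an automatic differentiation library; the hand-written chain rule is used here only to keep the sketch dependency-free.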