Learning through Model Differentiation
Policy Gradient Through Model
A method that computes policy gradients by backpropagating reward gradients through a differentiable model of the environment.
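As a rough illustration of the idea, the sketch below uses a toy one-dimensional problem in plain Python. Everything in it is an assumption made for the example and does not come from the source: the linear model `s_next = A * s + B * a`, the one-parameter policy `a = theta * s`, the reward `-s_next ** 2`, and the names `rollout`, `return_and_gradient` and `train`. The gradient is derived by hand with a backward pass over the trajectory; in practice an automatic differentiation library would normally do this step.

```python
# Toy setup (all values are illustrative assumptions, not from the source):
#   model:  s_next = A * s + B * a      (differentiable, assumed known)
#   policy: a = theta * s               (deterministic, one parameter)
#   reward: r = -s_next ** 2            (drive the state toward zero)

A, B = 0.9, 0.5


def rollout(theta, s0, horizon):
    """Simulate the model under the policy and return the visited states."""
    states = [s0]
    for _ in range(horizon):
        s = states[-1]
        a = theta * s
        states.append(A * s + B * a)
    return states


def return_and_gradient(theta, s0=1.0, horizon=5):
    """Return (J, dJ/dtheta), with the gradient obtained by a backward pass."""
    states = rollout(theta, s0, horizon)
    total_return = sum(-(s ** 2) for s in states[1:])

    grad_theta = 0.0
    carry = 0.0  # gradient of later rewards with respect to the next state
    for t in reversed(range(horizon)):
        s_t, s_next = states[t], states[t + 1]
        adjoint = -2.0 * s_next + carry      # dJ/ds_{t+1}: own reward plus later ones
        grad_theta += adjoint * B * s_t      # direct effect of theta through a_t
        carry = adjoint * (A + B * theta)    # push back through model and policy
    return total_return, grad_theta


def train(theta=0.0, lr=0.05, steps=50):
    """Plain gradient ascent on the return using the model-based gradient."""
    for _ in range(steps):
        _, grad = return_and_gradient(theta)
        theta += lr * grad
    return theta
```

Calling `train()` moves `theta` toward values that shrink the state faster, and the analytic gradient can be checked against a finite-difference estimate of `return_and_gradient(theta)[0]`. The sketch assumes the model is exact; with a learned model, errors in the model also flow into the gradient.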