DAgger (Dataset Aggregation)
Target policy
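The policy that the agent seeks to imitate, typically an expert policy represented by demonstrations and assumed to be optimal or near-optimal. DAgger aims to make the learned policy converge toward this target policy by having the expert label the states that the learner itself visits.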
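To make the role of the target policy concrete, here is a minimal sketch of a DAgger-style loop in plain Python. Everything in it is an illustrative assumption rather than part of the definition above: the 1-D line environment, the tabular majority-vote learner, and the names `target_policy`, `fit`, `rollout`, and `dagger` are all hypothetical. The learner starts from a single expert demonstration, then repeatedly acts on its own while the target policy labels every state it visits.

```python
from collections import Counter, defaultdict


def target_policy(state):
    """Hypothetical expert: step toward the origin on a 1-D line."""
    if state > 0:
        return -1
    if state < 0:
        return 1
    return 0


def fit(dataset):
    """Tabular learner: pick the most common expert label for each seen state."""
    votes = defaultdict(Counter)
    for state, action in dataset:
        votes[state][action] += 1
    table = {s: c.most_common(1)[0][0] for s, c in votes.items()}
    # Unseen states fall back to action 0 (stay put), an arbitrary default.
    return lambda state: table.get(state, 0)


def rollout(policy, start, horizon=10):
    """Return the states visited when following `policy` from `start`."""
    states, state = [], start
    for _ in range(horizon):
        states.append(state)
        # Positions are clamped to the interval [-5, 5].
        state = max(-5, min(5, state + policy(state)))
    return states


def dagger(n_iters=2):
    # Seed the dataset with one expert demonstration starting at state 3.
    dataset = [(s, target_policy(s)) for s in rollout(target_policy, 3)]
    learner = fit(dataset)
    for _ in range(n_iters):
        # Roll out the current learner from every start state ...
        for start in range(-5, 6):
            for s in rollout(learner, start):
                # ... and let the target policy label each visited state.
                dataset.append((s, target_policy(s)))
        # Retrain on the aggregated dataset (old and new labels together).
        learner = fit(dataset)
    return learner


learner = dagger()
print([learner(s) for s in range(-5, 6)])
```

In this toy setting the initial demonstration covers only states 0 through 3, so the first learner stays put everywhere else. Because the target policy labels the states the learner actually reaches, those gaps are filled during aggregation, and the retrained learner matches the target policy on every state.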