Behavior Cloning
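A supervised learning technique that trains a policy to imitate expert actions directly from demonstration data, without using reward signals. Although simple, the approach can suffer from compounding errors during deployment: a small mistake can move the agent into states the demonstrations never covered, where further mistakes become more likely.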
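As a minimal sketch of the idea, the example below fits a linear policy to (state, action) pairs by least squares. The expert (`expert_action`, a proportional controller), the one-dimensional state, and all function names are assumptions made for illustration; they do not come from any particular library.

```python
import random


def expert_action(state):
    # Hypothetical expert, assumed for illustration: a proportional controller.
    return -2.0 * state


def collect_demonstrations(n_samples, seed=0):
    # Record (state, action) pairs from the expert; no rewards are stored.
    rng = random.Random(seed)
    states = [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]
    return [(s, expert_action(s)) for s in states]


def fit_behavior_cloning(demos):
    # Least-squares fit of a linear policy a = w * s + b to the demonstrations.
    n = len(demos)
    mean_s = sum(s for s, _ in demos) / n
    mean_a = sum(a for _, a in demos) / n
    cov = sum((s - mean_s) * (a - mean_a) for s, a in demos)
    var = sum((s - mean_s) ** 2 for s, _ in demos)
    w = cov / var
    b = mean_a - w * mean_s
    return w, b


demos = collect_demonstrations(200)
w, b = fit_behavior_cloning(demos)


def policy(state):
    # The cloned policy: predicts the action the expert would likely take.
    return w * state + b
```

The fit only sees states in the range [-1, 1] used by the demonstrations. Nothing in the training data checks the policy outside that range, which is where compounding errors tend to show up with less well-behaved experts and models.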