Batch Constrained Q-learning (BCQ)
Perturbation Model
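Component of BCQ that makes small, bounded adjustments to actions close to those seen in the batch, which allows limited local exploration of the action space. The adjustment is learned rather than random: the model is trained to shift candidate actions toward higher estimated Q-values, and the size of the shift is restricted to a small range so that the resulting actions stay feasible and near the behavior data.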
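The bounding step can be illustrated with a minimal sketch. It assumes the common formulation in which a raw network output is squashed with tanh, scaled by a small factor, added to the candidate action, and clipped to the valid action range. The function name `perturb_action`, the parameters `phi` and `max_action`, and the sample values are hypothetical and chosen only for illustration; in a real agent, `raw_adjustment` would come from a trained network that takes the state and action as input.

```python
import math


def perturb_action(action, raw_adjustment, phi=0.05, max_action=1.0):
    """Apply a bounded adjustment to each action dimension.

    action:         candidate action, one float per dimension
    raw_adjustment: unbounded output of a (hypothetical) learned network
    phi:            assumed fraction of the action range the shift may cover
    max_action:     assumed symmetric action bound, valid range [-max_action, max_action]
    """
    perturbed = []
    for a, raw in zip(action, raw_adjustment):
        # tanh squashes the raw output into (-1, 1), so the shift
        # never exceeds phi * max_action in magnitude.
        delta = phi * max_action * math.tanh(raw)
        # Clip so the adjusted action stays inside the valid range.
        perturbed.append(max(-max_action, min(max_action, a + delta)))
    return perturbed


# Illustrative values only: a candidate action near the batch data
# and an arbitrary raw network output.
behavior_action = [0.2, -0.98, 0.5]
raw_network_output = [3.0, -10.0, 0.0]
print(perturb_action(behavior_action, raw_network_output))
```

With a small `phi`, the output stays within a narrow band around the input action, which is what keeps the policy close to the behavior data. A larger `phi` gives the model more freedom to move away from it.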