Batch Constrained Q-learning (BCQ)
Uncertainty Estimation
Quantification of the uncertainty in value estimates for actions that were not observed in the batch. Accurate uncertainty estimates make it possible to penalize out-of-distribution actions, which improves robustness.
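
As a rough illustration of penalizing uncertain actions, the sketch below uses disagreement among an ensemble of Q-value estimates as the uncertainty signal and subtracts it from the mean estimate. This is an assumption made for illustration, not a description of how BCQ itself estimates uncertainty; the function name `penalized_values`, the weight `beta`, and the sample numbers are all hypothetical.

```python
from statistics import mean, pstdev


def penalized_values(q_ensemble, beta=1.0):
    """Return one pessimistic score per action: mean - beta * std.

    q_ensemble[k][a] is the value that ensemble member k assigns to action a
    in a single fixed state. The standard deviation across members is used
    as a stand-in for uncertainty (an illustrative choice, not the only one).
    """
    n_actions = len(q_ensemble[0])
    scores = []
    for a in range(n_actions):
        estimates = [member[a] for member in q_ensemble]
        scores.append(mean(estimates) - beta * pstdev(estimates))
    return scores


# Hypothetical estimates from three Q-functions for two actions.
# Action 0 is well covered by the batch, so the members agree.
# Action 1 is out-of-distribution, so the members disagree.
q_ensemble = [
    [1.0, 3.0],
    [1.0, 0.0],
    [1.0, 3.0],
]

scores = penalized_values(q_ensemble, beta=1.0)
best_action = max(range(len(scores)), key=scores.__getitem__)
```

With these numbers, action 1 has the higher mean estimate, but the penalty for disagreement lowers its score below that of action 0, so the penalized choice is the action the batch supports. A larger `beta` makes the selection more conservative.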