Implicit Q-Learning (IQL)
Batch-Constrained Optimization
Strategy in IQL that keeps the learned policy's actions close to those observed in the offline dataset, so that value estimates are never queried on out-of-distribution actions, where extrapolation is unreliable.
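
As a rough illustration of how such a constraint can be realized, the sketch below shows two ingredients commonly associated with IQL: an expectile loss that fits the state value V(s) using only actions present in the dataset, and advantage-based weights for a behavior-cloning loss that keeps the policy on those same actions. The function names and the values of `tau`, `beta`, and `max_weight` are assumptions chosen for illustration, not settings taken from this entry.

```python
import numpy as np


def expectile_loss(q_values, v_values, tau=0.7):
    """Asymmetric squared loss pulling V(s) toward an upper expectile of Q(s, a).

    q_values and v_values are evaluated on (state, action) pairs taken from
    the dataset only, so no out-of-distribution action is ever queried.
    tau=0.7 is an illustrative assumption.
    """
    diff = q_values - v_values
    # Positive errors (Q above V) are weighted by tau, negative ones by 1 - tau.
    weight = np.where(diff > 0, tau, 1.0 - tau)
    return float(np.mean(weight * diff ** 2))


def awr_weights(q_values, v_values, beta=3.0, max_weight=100.0):
    """Exponentiated-advantage weights for dataset actions, clipped for stability.

    beta and max_weight are illustrative assumptions.
    """
    advantage = q_values - v_values
    return np.minimum(np.exp(beta * advantage), max_weight)


def weighted_bc_loss(log_probs, weights):
    """Weighted negative log-likelihood of the dataset actions under the policy."""
    return float(-np.mean(weights * log_probs))


if __name__ == "__main__":
    q = np.array([1.0, 0.0, 2.0])
    v = np.array([0.0, 1.0, 2.0])
    log_probs = np.array([-0.5, -1.0, -0.2])  # hypothetical policy log-probabilities
    print("expectile loss:", expectile_loss(q, v))
    print("policy weights:", awr_weights(q, v))
    print("weighted BC loss:", weighted_bc_loss(log_probs, awr_weights(q, v)))
```

Because the policy loss is only ever evaluated on dataset actions, the learned policy can reweight observed behavior but has no incentive to move toward actions the dataset never contains.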