Implicit Q-Learning (IQL)
Implicit Advantage Function
An extension of IQL that estimates the relative advantage of each action without an explicit maximization over actions. Because it never queries values for actions absent from the dataset, it enables more robust action selection in offline settings.
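To make the idea concrete, here is a minimal sketch, assuming the advantage is formed as A(s, a) = Q(s, a) - V(s), with V(s) fit by expectile regression over the Q-values of dataset actions, as in standard IQL. The function names, the single-state setup, and the values of `tau`, `beta`, and `max_weight` are illustrative assumptions, not part of the original entry.

```python
import math


def expectile_loss(diff, tau=0.7):
    """Asymmetric squared loss on diff = Q(s, a) - V(s).

    Positive errors are weighted by tau and negative errors by 1 - tau,
    so tau > 0.5 pulls V(s) toward the better in-dataset actions.
    """
    weight = tau if diff >= 0 else 1.0 - tau
    return weight * diff * diff


def fit_value(q_samples, tau=0.7, lr=0.1, steps=500):
    """Fit a scalar V(s) for one state by gradient descent on the expectile loss.

    q_samples holds Q(s, a) for actions that appear in the dataset only,
    so no maximization over unseen actions is ever performed.
    """
    v = sum(q_samples) / len(q_samples)  # start from the mean
    for _ in range(steps):
        grad = 0.0
        for q in q_samples:
            diff = q - v
            weight = tau if diff >= 0 else 1.0 - tau
            grad += -2.0 * weight * diff  # derivative of the loss w.r.t. v
        v -= lr * grad / len(q_samples)
    return v


def implicit_advantages(q_samples, v):
    """Advantage of each dataset action relative to the fitted value."""
    return [q - v for q in q_samples]


def policy_weights(advantages, beta=3.0, max_weight=100.0):
    """Clipped exponential weights, as used in advantage-weighted policy extraction."""
    return [min(math.exp(beta * a), max_weight) for a in advantages]


if __name__ == "__main__":
    q_values = [1.0, 2.0, 3.0]  # hypothetical Q-values at a single state
    v = fit_value(q_values, tau=0.9)
    adv = implicit_advantages(q_values, v)
    print("V(s):", v)
    print("advantages:", adv)
    print("policy weights:", policy_weights(adv))
```

With `tau = 0.5` the fitted value reduces to the mean of the sampled Q-values; raising `tau` moves it toward the largest in-dataset Q-value, which is how the maximum is approximated implicitly rather than computed.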