Implicit Q-Learning (IQL)
Implicit Q-Target Estimation
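The IQL mechanism for computing target values without an explicit maximization over actions: a state-value function is fit to an upper expectile of the Q-values over actions drawn from the behavior (dataset) distribution, and that value replaces the max in the bootstrap target.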
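As a rough illustration of the idea, the sketch below fits a scalar value to an upper expectile of a few Q-values for one state and then forms a target of the form r + gamma * V(s'). It is a toy, pure-Python version under stated assumptions, not a reference implementation: the function names (`expectile_loss`, `fit_expectile`, `q_target`), the defaults `tau=0.7` and `gamma=0.99`, and the plain gradient-descent fit are all chosen here for illustration. A practical implementation would use neural networks and minibatches from an offline dataset.

```python
def expectile_loss(diff, tau=0.7):
    """Asymmetric squared loss |tau - 1(diff < 0)| * diff**2, with diff = Q - V."""
    weight = tau if diff >= 0 else 1.0 - tau
    return weight * diff * diff


def fit_expectile(q_values, tau=0.7, lr=0.1, steps=2000):
    """Find the tau-expectile of q_values by gradient descent on the mean loss.

    q_values stands in for Q(s, a) evaluated only at actions that appear
    in the dataset for a single state s (an illustrative simplification).
    """
    v = sum(q_values) / len(q_values)  # start from the mean (the 0.5-expectile)
    for _ in range(steps):
        grad = 0.0
        for q in q_values:
            diff = q - v
            weight = tau if diff >= 0 else 1.0 - tau
            grad += -2.0 * weight * diff  # derivative of the loss w.r.t. v
        v -= lr * grad / len(q_values)
    return v


def q_target(reward, next_v, gamma=0.99, done=False):
    """Bootstrap target r + gamma * V(s'); no max over actions is taken."""
    return reward + (0.0 if done else gamma * next_v)


if __name__ == "__main__":
    qs = [1.0, 2.0, 3.0]
    v = fit_expectile(qs, tau=0.9)
    print("upper-expectile value:", v)
    print("target:", q_target(reward=1.0, next_v=v))
```

With `tau=0.5` the fitted value is the ordinary mean of the Q-values. Raising `tau` toward 1 pushes it toward the largest Q-value among the dataset actions, which is how the estimate approximates a maximum without evaluating actions outside the data.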