Implicit Q-Learning (IQL)
Behavior Regularization
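Mechanism in IQL that keeps the learned policy close to the behavior distribution (the actions present in the offline dataset) by penalizing large deviations from it, which maintains training stability and avoids risky, out-of-distribution actions.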
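As a rough illustration, one way this regularization can appear is in an advantage-weighted policy loss: the policy is only ever trained on actions taken from the dataset, so it cannot drift far from the behavior distribution. The sketch below assumes that form of policy extraction; the function name `awr_policy_loss`, the temperature `beta`, the clipping value `max_weight`, and the sample batch are all hypothetical choices made for illustration, not values from this entry.

```python
import numpy as np

def awr_policy_loss(q_values, v_values, log_probs, beta=3.0, max_weight=100.0):
    """Advantage-weighted negative log-likelihood over dataset actions only.

    q_values:  Q(s, a) for the actions actually stored in the dataset
    v_values:  V(s) for the same states
    log_probs: log pi(a | s) of those dataset actions under the current policy
    beta:      assumed temperature; beta = 0 reduces to plain behavior cloning
    """
    advantages = q_values - v_values
    # Clip the exponentiated advantages so a single sample cannot dominate.
    weights = np.minimum(np.exp(beta * advantages), max_weight)
    # Only dataset actions are scored, which keeps the policy near the data.
    return -np.mean(weights * log_probs)

# Hypothetical batch of three dataset transitions.
q = np.array([1.0, 0.5, 2.0])
v = np.array([0.8, 0.9, 1.0])
logp = np.array([-1.2, -0.7, -2.0])
loss = awr_policy_loss(q, v, logp)
```

Smaller values of `beta` pull the policy closer to pure imitation of the dataset, while larger values favor dataset actions with higher estimated advantage.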