Dyna-Q Learning
Reward model
Learned function that predicts the expected reward for each state-action pair in a reinforcement learning environment.
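
As a rough illustration of the definition above, a reward model for a small, discrete environment could be kept as a table of running averages, one per state-action pair. This is a minimal sketch under stated assumptions, not a canonical implementation: the class name `TabularRewardModel`, its `update` and `predict` methods, the `default` value for unseen pairs, and the sample states and actions are all hypothetical. In larger or continuous environments the table would usually be replaced by a function approximator such as a neural network.

```python
from collections import defaultdict


class TabularRewardModel:
    """Estimates the expected reward for each (state, action) pair
    as the running mean of the rewards observed for that pair."""

    def __init__(self, default=0.0):
        self.default = default          # assumed prediction for unseen pairs
        self.counts = defaultdict(int)  # number of observations per pair
        self.means = {}                 # running mean reward per pair

    def update(self, state, action, reward):
        """Fold one observed reward into the estimate for (state, action)."""
        key = (state, action)
        self.counts[key] += 1
        old = self.means.get(key, 0.0)
        # Incremental mean: new = old + (reward - old) / n
        self.means[key] = old + (reward - old) / self.counts[key]

    def predict(self, state, action):
        """Return the current expected-reward estimate for (state, action)."""
        return self.means.get((state, action), self.default)


# Hypothetical experience: four rewards observed for the same pair.
model = TabularRewardModel()
for r in (1.0, 0.0, 1.0, 1.0):
    model.update("s0", "right", r)

estimate = model.predict("s0", "right")  # mean of the four observed rewards
unseen = model.predict("s0", "left")     # never observed, falls back to default
```

A planning method can then query `predict` for simulated transitions instead of acting in the real environment.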