Dyna-Q Learning
Reward model
Learned function that predicts the expected reward for each state-action pair in a reinforcement learning environment.
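As a minimal sketch of the definition above, assuming a small tabular environment with hashable states and actions, a reward model can be a running average of the rewards observed for each state-action pair. The class name `TabularRewardModel` and its `update` and `predict` methods are hypothetical names chosen for illustration, and returning 0.0 for unseen pairs is an assumption, not something the definition prescribes.

```python
from collections import defaultdict


class TabularRewardModel:
    """Running-average estimate of the expected reward per (state, action).

    Illustrative only: names and the 0.0 default for unseen pairs are assumptions.
    """

    def __init__(self):
        self.mean = defaultdict(float)  # current estimate of expected reward
        self.count = defaultdict(int)   # number of rewards observed per pair

    def update(self, state, action, reward):
        """Fold one observed reward into the estimate for (state, action)."""
        key = (state, action)
        self.count[key] += 1
        # Incremental mean: new_mean = old_mean + (x - old_mean) / n
        self.mean[key] += (reward - self.mean[key]) / self.count[key]

    def predict(self, state, action):
        """Return the estimated expected reward; 0.0 if the pair was never seen."""
        return self.mean.get((state, action), 0.0)


model = TabularRewardModel()
model.update("s0", "right", 1.0)
model.update("s0", "right", 0.0)
estimate = model.predict("s0", "right")  # average of the two observed rewards
```

In Dyna-Q, a model like this is updated from real transitions and then queried during planning steps to supply rewards for simulated experience. A function approximator could replace the table when the state space is too large to enumerate.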