Dyna-Q Learning
Dyna-Q+
Extension of Dyna-Q that adds an exploration bonus based on the time elapsed since each state-action pair was last visited, encouraging the agent to retry long-untested actions so it can detect and adapt to changes in the environment.
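As a rough illustration of the time-based mechanism described above, here is a minimal tabular sketch. It assumes the common textbook formulation in which simulated (planning) updates use a reward of r + kappa * sqrt(tau), where tau is the number of real time steps since the pair was last tried. The class name `DynaQPlus`, its parameters, and the deterministic model are hypothetical choices made for this example. For brevity the sketch only plans over pairs that have already been observed, which is a simplification.

```python
import math
import random
from collections import defaultdict


class DynaQPlus:
    """Tabular Dyna-Q+ sketch, assuming a deterministic environment model."""

    def __init__(self, actions, alpha=0.1, gamma=0.95, kappa=1e-3,
                 planning_steps=10, seed=0):
        self.actions = list(actions)
        self.alpha = alpha                    # step size
        self.gamma = gamma                    # discount factor
        self.kappa = kappa                    # exploration bonus scale (assumed name)
        self.planning_steps = planning_steps
        self.q = defaultdict(float)           # (state, action) -> value estimate
        self.model = {}                       # (state, action) -> (reward, next_state)
        self.last_tried = {}                  # (state, action) -> step of last real visit
        self.t = 0                            # number of real environment steps so far
        self.rng = random.Random(seed)

    def bonus(self, state, action):
        """Return kappa * sqrt(tau), with tau = steps since the last real visit."""
        tau = self.t - self.last_tried.get((state, action), 0)
        return self.kappa * math.sqrt(tau)

    def _update(self, s, a, r, s2):
        # Standard one-step Q-learning update.
        best_next = max(self.q[(s2, b)] for b in self.actions)
        self.q[(s, a)] += self.alpha * (r + self.gamma * best_next - self.q[(s, a)])

    def observe(self, s, a, r, s2):
        """Learn from one real transition, then run the planning updates."""
        self.t += 1
        self.last_tried[(s, a)] = self.t
        self.model[(s, a)] = (r, s2)
        self._update(s, a, r, s2)             # direct RL update, no bonus
        for _ in range(self.planning_steps):
            ps, pa = self.rng.choice(list(self.model))
            pr, ps2 = self.model[(ps, pa)]
            # Simulated experience: the reward is augmented by the time-based bonus.
            self._update(ps, pa, pr + self.bonus(ps, pa), ps2)
```

A caller would feed each real transition to `observe(state, action, reward, next_state)` and choose actions from `q`, for example epsilon-greedily. Because the bonus grows with the time since a pair was last tried, the value of a neglected action slowly rises during planning until the agent tries it again, which is how a change in the environment gets noticed.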