Dyna-Q Learning
Algorithm convergence
Property guaranteeing that Dyna-Q's value estimates converge to the optimal values, provided the learned model is exact and every state-action pair is visited infinitely often.
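As an illustration, here is a minimal sketch of tabular Dyna-Q on a hypothetical four-state deterministic chain, where the learned model is exact by construction and epsilon-greedy exploration keeps every state-action pair being visited. The environment, the function names (`step`, `dyna_q`), and all hyperparameters are assumptions chosen for this example and do not come from the entry above.

```python
import random

N_STATES, GOAL, GAMMA = 4, 3, 0.9   # hypothetical chain: states 0..3, state 3 is terminal
ACTIONS = (0, 1)                    # 0 = left, 1 = right

def step(s, a):
    """Deterministic transition: reward 1 on reaching GOAL, otherwise 0."""
    s2 = max(s - 1, 0) if a == 0 else s + 1
    return (1.0 if s2 == GOAL else 0.0), s2

def dyna_q(episodes=200, planning_steps=10, alpha=0.5, epsilon=0.3, seed=0):
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    model = {}  # (s, a) -> (r, s2); exact here because the environment is deterministic

    def update(s, a, r, s2):
        # One-step Q-learning backup; the terminal state contributes no future value.
        target = r if s2 == GOAL else r + GAMMA * max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += alpha * (target - Q[(s, a)])

    for _ in range(episodes):
        s = 0
        while s != GOAL:
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)  # explore
            else:
                # Greedy action, with ties broken at random.
                a = max(ACTIONS, key=lambda b: (Q[(s, b)], rng.random()))
            r, s2 = step(s, a)
            update(s, a, r, s2)          # direct update from real experience
            model[(s, a)] = (r, s2)      # model learning
            for _ in range(planning_steps):
                # Planning: replay a previously seen pair through the model.
                ps, pa = rng.choice(list(model))
                pr, ps2 = model[(ps, pa)]
                update(ps, pa, pr, ps2)
            s = s2
    return Q

Q = dyna_q()
for s in range(GOAL):
    print(s, [round(Q[(s, a)], 3) for a in ACTIONS])
```

For this chain the optimal values of moving right are 0.81, 0.9, and 1.0 from states 0, 1, and 2, so the printed estimates should be close to those numbers. With a finite number of episodes this only approximates the limit that the convergence property describes.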