Multi-Objective Q-Learning
Reward Vector
A vector-valued reward signal in which each component is the reward for one specific objective. It replaces the single scalar reward used in standard Q-learning.
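As a rough illustration, the sketch below stores one Q-value per objective and updates all of them from a single reward vector. It is a minimal sketch, not a reference implementation: the two objectives (time and energy), the helper names `reward_vector` and `q_update`, the preference `weights`, and the use of linear scalarization to pick the greedy next action are all assumptions made for this example.

```python
import numpy as np

# Hypothetical problem sizes, chosen only for illustration.
N_STATES, N_ACTIONS, N_OBJECTIVES = 3, 2, 2


def reward_vector(time_cost, energy_cost):
    """Pack the per-objective rewards into one vector (one entry per objective)."""
    return np.array([-time_cost, -energy_cost], dtype=float)


def q_update(Q, s, a, r_vec, s_next, weights, alpha=0.5, gamma=0.9):
    """One vector-valued Q-learning step.

    Q has shape (states, actions, objectives). The greedy next action is
    chosen by linear scalarization (an assumption of this sketch); the
    update itself is applied to every objective component at once.
    """
    # Scalarize the next state's Q-vectors to rank the actions.
    a_next = int(np.argmax(Q[s_next] @ weights))
    # Component-wise temporal-difference update using the reward vector.
    Q[s, a] += alpha * (r_vec + gamma * Q[s_next, a_next] - Q[s, a])
    return Q


Q = np.zeros((N_STATES, N_ACTIONS, N_OBJECTIVES))
weights = np.array([0.7, 0.3])  # assumed preference over the two objectives
r = reward_vector(time_cost=1.0, energy_cost=4.0)
Q = q_update(Q, s=0, a=1, r_vec=r, s_next=1, weights=weights)
```

With a scalar reward, `Q` would have shape `(states, actions)`. The extra trailing axis is what the reward vector adds, and the weights are only needed at the point where a single action has to be chosen.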