Reinforcement Learning for Optimization
Cumulative Reward
Sum of expected future rewards that the agent seeks to maximize, often computed with a discount factor (commonly written γ, between 0 and 1) so that distant rewards carry less weight.
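As a rough illustration of the discounting idea, the sketch below computes a discounted sum over a finite list of observed rewards, weighting the reward k steps ahead by gamma**k. The function name `discounted_return`, the default `gamma=0.99`, and the use of a finite reward list are assumptions made for this example, not part of the definition above.

```python
def discounted_return(rewards, gamma=0.99):
    """Sum of gamma**k * rewards[k] over a finite reward sequence.

    `rewards` and `gamma` are illustrative names chosen for this sketch.
    """
    total = 0.0
    # Walk backwards so each step applies G_t = r_t + gamma * G_{t+1}.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


if __name__ == "__main__":
    # Three rewards of 1 with gamma = 0.5: 1 + 0.5 + 0.25
    print(discounted_return([1, 1, 1], gamma=0.5))
```

With rewards [1, 1, 1] and gamma = 0.5 the result is 1 + 0.5 + 0.25 = 1.75, while gamma = 1 reduces to the plain undiscounted sum.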