Reinforcement Learning for Optimization
SARSA Algorithm
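SARSA is an on-policy reinforcement learning algorithm that updates Q-values from the State-Action-Reward-State-Action sequence (s, a, r, s', a'). Its update target uses the action a' that the current policy actually takes in the next state, whereas off-policy Q-learning uses the maximum Q-value over all actions available in the next state.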
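To make the contrast with Q-learning concrete, here is a minimal tabular sketch of the two update rules. The function names, the dictionary-based Q-table, and the default values for the learning rate `alpha` and discount `gamma` are illustrative assumptions, not taken from any particular library.

```python
from collections import defaultdict


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99, done=False):
    """On-policy update: bootstrap from the action actually chosen in s_next."""
    # Terminal transitions have no future value to bootstrap from.
    target = r if done else r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (target - Q[(s, a)])
    return Q[(s, a)]


def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.99, done=False):
    """Off-policy update, shown for contrast: bootstrap from the best next action."""
    target = r if done else r + gamma * max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])
    return Q[(s, a)]


def make_q_table():
    """Hypothetical helper: a Q-table that defaults unseen (state, action) pairs to 0."""
    return defaultdict(float)
```

The only difference between the two functions is the bootstrap term: `Q[(s_next, a_next)]` for SARSA versus the maximum over `actions` for Q-learning. When the behaviour policy explores (for example, epsilon-greedy), SARSA's estimates therefore reflect the cost of that exploration, while Q-learning's do not.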