Reinforcement Learning for Optimization
Policy Optimization
A class of reinforcement learning methods that optimize the policy's parameters directly, typically with policy gradient techniques, rather than deriving the policy from a learned value function.
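To make the idea of direct policy optimization concrete, the following is a minimal sketch of a REINFORCE-style policy gradient update on a toy multi-armed bandit. Everything in it is an assumption chosen for illustration and does not come from the entry itself: the function names `softmax` and `reinforce_bandit`, the softmax parameterization, the running-average baseline, the Gaussian reward noise, and the hyperparameters.

```python
import math
import random


def softmax(prefs):
    """Turn a list of action preferences into a probability distribution."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_bandit(arm_means, steps=5000, lr=0.1, noise=0.5, seed=0):
    """Illustrative REINFORCE on a bandit (hypothetical helper, not a library API).

    The policy is a softmax over one preference per arm. No action-value
    function is learned; the preferences are moved along the sampled
    gradient of expected reward.
    """
    rng = random.Random(seed)
    theta = [0.0] * len(arm_means)  # policy parameters (preferences)
    baseline = 0.0                  # running mean reward, reduces variance

    for t in range(1, steps + 1):
        probs = softmax(theta)
        action = rng.choices(range(len(theta)), weights=probs)[0]
        reward = rng.gauss(arm_means[action], noise)

        # Advantage relative to the baseline built from earlier rewards.
        advantage = reward - baseline
        baseline += (reward - baseline) / t

        # For a softmax policy: d log pi(action) / d theta[a] = 1[a == action] - pi(a)
        for a in range(len(theta)):
            grad_log = (1.0 if a == action else 0.0) - probs[a]
            theta[a] += lr * advantage * grad_log

    return softmax(theta)


# Assumed toy problem: three arms with mean rewards 0, 1 and 2.
final_probs = reinforce_bandit([0.0, 1.0, 2.0])
```

The baseline here is only a variance-reduction term and does not select actions, so the sketch stays on the policy-optimization side of the distinction drawn above. Actor-critic methods go further and learn a value function as the baseline while still updating the policy directly.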