Reinforcement Learning for Optimization
Actor-Critic Algorithm
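An architecture that combines an actor, which selects actions according to a policy, with a critic, which evaluates those actions by estimating a value function. The critic's estimate gives the actor a lower-variance learning signal than raw episode returns, which tends to make learning more stable and sample-efficient than policy-gradient methods that use no critic.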
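To make the division of labour concrete, here is a minimal sketch of a tabular one-step actor-critic, under assumptions that are not part of the definition above: a hypothetical five-state chain environment with a reward of 1 at the right end, a softmax actor, and illustrative hyperparameters. The function names (`softmax`, `train_actor_critic`) are invented for this example.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into probabilities (numerically stable)."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train_actor_critic(n_states=5, episodes=500, gamma=0.95,
                       alpha_actor=0.1, alpha_critic=0.2, seed=0):
    """One-step actor-critic on a toy chain (assumed environment).

    States are 0..n_states-1; action 0 moves left, action 1 moves right.
    Reaching the last state ends the episode with reward 1.
    """
    rng = random.Random(seed)
    # Actor: one preference per (state, action), read through a softmax.
    prefs = [[0.0, 0.0] for _ in range(n_states)]
    # Critic: one state-value estimate per state.
    values = [0.0] * n_states
    goal = n_states - 1

    for _ in range(episodes):
        state = 0
        for _ in range(100):  # step cap so every episode terminates
            probs = softmax(prefs[state])
            action = 0 if rng.random() < probs[0] else 1
            next_state = max(0, state - 1) if action == 0 else state + 1
            done = next_state == goal
            reward = 1.0 if done else 0.0

            # Critic: the TD error measures how much better or worse the
            # outcome was than the critic's current estimate for this state.
            target = reward + (0.0 if done else gamma * values[next_state])
            td_error = target - values[state]
            values[state] += alpha_critic * td_error

            # Actor: policy-gradient step for a softmax policy, scaled by
            # the critic's TD error instead of the full episode return.
            for a in range(2):
                grad_log_pi = (1.0 if a == action else 0.0) - probs[a]
                prefs[state][a] += alpha_actor * td_error * grad_log_pi

            if done:
                break
            state = next_state

    return prefs, values
```

Calling `prefs, values = train_actor_critic()` returns the actor's learned preferences and the critic's value estimates; `softmax(prefs[s])` gives the action probabilities in state `s`. The critic's TD error takes the place of the full return in the actor's update, which is the mechanism behind the stability claim in the definition.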