Neural Networks for Combinatorial Optimization
Policy Gradient Descent for Optimization
Reinforcement learning algorithm directly optimizing policy parameters to maximize expected reward in combinatorial optimization problems.
← Zurück