Contextual Bandits
Contextual Bandit
A reinforcement learning approach that picks an action for each observed context in order to maximize cumulative reward, learning only from the reward of the action it actually chose.
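As a rough illustration of the definition, here is a minimal sketch of one common strategy, epsilon-greedy, on a tabular problem. It is only one of several ways to approach contextual bandits. The class name, the `simulate` helper, the contexts ("mobile", "desktop"), the actions ("banner_a", "banner_b"), and the reward probabilities are all hypothetical and chosen for the example.

```python
import random
from collections import defaultdict


class EpsilonGreedyContextualBandit:
    """Tabular epsilon-greedy learner: one running mean reward per (context, action)."""

    def __init__(self, actions, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)    # (context, action) -> times chosen
        self.values = defaultdict(float)  # (context, action) -> mean observed reward

    def select(self, context):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.values[(context, a)])

    def update(self, context, action, reward):
        # Incremental mean: only the chosen action's estimate is updated,
        # because the rewards of the other actions are never observed.
        key = (context, action)
        self.counts[key] += 1
        self.values[key] += (reward - self.values[key]) / self.counts[key]


def simulate(rounds=5000, seed=0):
    # Hypothetical environment (an assumption): each context has a different best action.
    reward_prob = {
        ("mobile", "banner_a"): 0.8, ("mobile", "banner_b"): 0.2,
        ("desktop", "banner_a"): 0.3, ("desktop", "banner_b"): 0.7,
    }
    env_rng = random.Random(seed + 1)
    bandit = EpsilonGreedyContextualBandit(["banner_a", "banner_b"], seed=seed)
    for _ in range(rounds):
        context = env_rng.choice(["mobile", "desktop"])
        action = bandit.select(context)
        reward = 1.0 if env_rng.random() < reward_prob[(context, action)] else 0.0
        bandit.update(context, action, reward)
    return bandit
```

Calling `simulate()` returns the trained learner, and `bandit.values` can be inspected to see the estimated reward for each context and action pair. Real applications usually replace the lookup table with a model over context features, so that the learner can generalize to contexts it has not seen.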