Contextual Bandits
Off-policy Evaluation
Evaluation technique that estimates the performance of a new policy using data collected by an existing policy, without requiring direct deployment.
← 뒤로Evaluation technique that estimates the performance of a new policy using data collected by an existing policy, without requiring direct deployment.
← 뒤로