AI Glossary
The Complete Dictionary of Artificial Intelligence
Classic Multi-armed Bandits
Fundamental sequential decision problem in which an agent repeatedly chooses among several options (arms) with unknown reward distributions to maximize cumulative reward.
Epsilon-Greedy Algorithms
Strategy that exploits the best known action with probability 1-ε and explores randomly with probability ε.
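The rule above can be sketched in a few lines. This is a minimal illustration, assuming Bernoulli (0/1) rewards and a simulated environment; the function name `epsilon_greedy`, the `true_means` list, and the default values are hypothetical choices made for the example.

```python
import random

def epsilon_greedy(true_means, epsilon=0.1, rounds=5000, seed=0):
    """Simulate epsilon-greedy on Bernoulli arms.

    true_means is a hypothetical list of success probabilities, used only
    to simulate rewards; the agent never reads it directly.
    Returns (estimated means, pull counts).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    for _ in range(rounds):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the best current estimate
            # (ties go to the lowest index).
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the sample mean for the pulled arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts

if __name__ == "__main__":
    estimates, counts = epsilon_greedy([0.2, 0.5, 0.8])
    print("estimates:", [round(e, 3) for e in estimates])
    print("pull counts:", counts)
```

Setting `epsilon=0.0` turns this into a purely greedy agent, which can lock onto the first arm it tries and never discover a better one.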
UCB Algorithms
Methods that compute an upper confidence bound on each action's expected reward and play the action with the highest bound, so that uncertain actions are explored and well-estimated good actions are exploited.
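As one concrete instance, the sketch below uses the UCB1 index (sample mean plus sqrt(2 ln t / n)). The choice of UCB1, the Bernoulli reward simulation, and the names `ucb1` and `true_means` are assumptions made for illustration; other UCB variants use different confidence terms.

```python
import math
import random

def ucb1(true_means, rounds=5000, seed=0):
    """Simulate UCB1 on Bernoulli arms; returns (estimated means, pull counts).

    true_means is a hypothetical list of success probabilities used only
    to simulate rewards.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    for t in range(1, rounds + 1):
        if t <= n_arms:
            # Initialization: pull every arm once so each count is nonzero.
            arm = t - 1
        else:
            # Index = sample mean + confidence radius; the radius shrinks
            # as an arm is pulled more often and grows slowly with t.
            arm = max(
                range(n_arms),
                key=lambda a: estimates[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts

if __name__ == "__main__":
    estimates, counts = ucb1([0.2, 0.5, 0.8])
    print("estimates:", [round(e, 3) for e in estimates])
    print("pull counts:", counts)
```

Unlike epsilon-greedy, this rule has no exploration parameter to tune: exploration comes entirely from the confidence radius.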
Thompson Sampling
Bayesian approach that samples reward-model parameters from their posterior distribution and plays the action that is best under the sampled parameters.
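A minimal sketch of this idea, assuming Bernoulli rewards with an independent Beta(1, 1) prior on each arm (the conjugate choice, so the posterior stays a Beta distribution). The names `thompson_sampling` and `true_means` are hypothetical and the environment is simulated.

```python
import random

def thompson_sampling(true_means, rounds=5000, seed=0):
    """Simulate Beta-Bernoulli Thompson sampling.

    true_means is a hypothetical list of success probabilities used only
    to simulate rewards. Returns (alpha, beta, pull counts), where
    Beta(alpha[a], beta[a]) is the final posterior for arm a.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    alpha = [1] * n_arms  # 1 + number of observed successes
    beta = [1] * n_arms   # 1 + number of observed failures
    counts = [0] * n_arms
    for _ in range(rounds):
        # Draw one plausible mean per arm from its current posterior.
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(n_arms)]
        # Play the arm that looks best under the sampled means.
        arm = max(range(n_arms), key=lambda a: samples[a])
        reward = 1 if rng.random() < true_means[arm] else 0
        # Conjugate posterior update for the pulled arm.
        alpha[arm] += reward
        beta[arm] += 1 - reward
        counts[arm] += 1
    return alpha, beta, counts

if __name__ == "__main__":
    alpha, beta, counts = thompson_sampling([0.2, 0.5, 0.8])
    means = [a / (a + b) for a, b in zip(alpha, beta)]
    print("posterior means:", [round(m, 3) for m in means])
    print("pull counts:", counts)
```

Arms with wide posteriors occasionally produce high samples and get tried, while arms whose posteriors have concentrated on low values are rarely selected.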
Contextual Bandits
Extension where decisions depend on contextual features observed at each round.
Linear Bandits
Models where the expected reward is a linear function of contextual features.
Non-Stationary Bandits
Framework where reward distributions change over time, requiring continuous adaptation.
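One simple way to adapt is to weight recent rewards more heavily than old ones. The sketch below contrasts a plain sample average with an exponential recency-weighted average on a reward stream whose mean drops from 1 to 0 halfway through. The constant step size, the function names, and the synthetic stream are assumptions for illustration; sliding windows and change detection are other common options.

```python
def sample_average(rewards):
    """Plain running mean: every past reward counts equally."""
    estimate, n = 0.0, 0
    for r in rewards:
        n += 1
        estimate += (r - estimate) / n
    return estimate

def recency_weighted(rewards, step_size=0.1):
    """Constant step-size update: older rewards decay geometrically."""
    estimate = 0.0
    for r in rewards:
        estimate += step_size * (r - estimate)
    return estimate

if __name__ == "__main__":
    # Synthetic stream for one arm: reward 1 for 200 rounds, then 0.
    rewards = [1.0] * 200 + [0.0] * 200
    print("sample average:  ", sample_average(rewards))    # stays near 0.5
    print("recency weighted:", recency_weighted(rewards))  # ends near 0
```

After the change, the sample average still reports roughly 0.5, while the recency-weighted estimate has moved close to the new mean of 0. A larger step size tracks changes faster at the cost of noisier estimates.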
Combinatorial Bandits
Problems where the agent selects a set of actions in each round, subject to structural constraints on which sets are allowed.
Adversarial Bandits
Scenario where an adversary chooses rewards to minimize the agent's gain.
Cascading Bandits
Model where the user examines a ranked list of items in order and stops at the first item they click.
Bandits with Limited Feedback
Situations where only partial information about the rewards is observed after each action.
Bandits for Online Advertising
Application of bandit algorithms to optimizing advertising campaigns in real time.
Bandits for A/B Testing
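Adaptive alternative to traditional A/B testing that shifts traffic toward better-performing variants while the experiment is still running, used for web experience optimization.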
Bandits for Recommendations
Systems that learn user preferences to personalize recommendations.
Hierarchical Bandits
Multi-level structures where decisions are organized hierarchically for complex problems.